You can use functions within NumPy to find the shape, number of dimensions, and number of elements in any array.
\n",
"
numpy.where() will let you find the locations of elements within an array that meet a specified criterion.
\n",
"
Some functions within NumPy allow you to manipulate arrays, such as reshaping, transposing, \"unraveling\", concatenation, and repeating individual array elements.
\n",
"
When you find yourself writing multiple nested loops, you can often perform the same operations more simply by using array syntax.
\n",
"
Mathematical operations act on arrays elementwise.
\n",
"
When testing conditions within an array, the Python keywords and and or cannot be used; instead, use NumPy's built-in functions, such as np.logical_and() and np.logical_or().
\n",
"\n",
"A file object is just a variable that represents the file within Python. The process of creating a file object is the same general idea as creating any variable: you create it by assignment.\n",
"\n",
"For a text file, you can create a file object with the built-in open() function. The first argument to open() gives the filename, and the second sets the mode for the file:\n",
"
\n",
"
'r': sets the file to read-only.
\n",
"
'w': sets the file to writing mode (any existing contents are overwritten).
\n",
"
'a': sets the file to append mode (you can only add new things to the end).
\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"_io.TextIOWrapper"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Try opening the 'test.txt' file that you added to your server.\n",
"data = open(\"../test.txt\", \"r\")\n",
"type(data)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When you're done with a file, close it with the close() method."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Close that file back up.\n",
"data.close()"
]
},
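Forgetting to call close() is an easy mistake to make. As a minimal sketch of one common alternative, the with statement closes the file for you when its block ends. The temporary file path and the name first_line are placeholders invented for this example; they are not part of the original notebook.

```python
import os
import tempfile

# Hypothetical sample file so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# The with statement closes the file when the block ends,
# even if an exception is raised inside it.
with open(path, "w") as f:
    f.write("This is a test!\n")

with open(path, "r") as f:
    first_line = f.readline()

# No explicit close() call is needed here.
print(first_line)
print(f.closed)
```

The closed attribute of the file object reports whether the file has been closed, which makes it easy to check that the with block did its job.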
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
8.2 Text Input/Output
\n",
"\n",
"To read a line from a file into a variable, you can use the readline() method."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This is a test!\n",
"\n",
"Here's some information:\n",
"\n"
]
}
],
"source": [
"# First, open the file.\n",
"data = open(\"../test.txt\",\"r\")\n",
"\n",
"\n",
"# Assign the first line of text to the variable aline.\n",
"aline = data.readline()\n",
"\n",
"# Each further call to readline() returns the next line of the file.\n",
"bline = data.readline()\n",
"\n",
"# Print those first two lines of text.\n",
"print(aline)\n",
"print(bline)\n",
"\n",
"# Close the file. (This is good practice!)\n",
"data.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also write a loop to go through the whole file!"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This is a test!\n",
"\n",
"Here's some information:\n",
"\n",
"IMPORTANT THINGS TO KNOW\n",
"\n",
"Okay, that's all I got.\n"
]
}
],
"source": [
"data = open(\"../test.txt\", \"r\")\n",
"\n",
"for i in data:\n",
" print(i)\n",
" \n",
"data.close()"
]
},
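The output above is double-spaced because each line read from the file still ends in a newline, and print() adds one more. Here is a small sketch of two ways around that, assuming a throwaway sample file (the path and the list name cleaned are made up for this example).

```python
import os
import tempfile

# Hypothetical sample file standing in for test.txt.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("This is a test!\nHere's some information:\n")

cleaned = []
with open(path, "r") as f:
    for line in f:
        # Option 1: tell print() not to add its own newline.
        print(line, end="")
        # Option 2: strip the trailing newline before storing the text.
        cleaned.append(line.rstrip("\n"))

print(cleaned)
```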
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That works, but more often you'll want to read the whole file into a list, with each line as one element; this can be done using readlines() (note the plural!)."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['This is a test!\\n', \"Here's some information:\\n\", 'IMPORTANT THINGS TO KNOW\\n', \"Okay, that's all I got.\"]\n",
"<class 'list'>\n"
]
}
],
"source": [
"# Let's open the file again.\n",
"data = open(\"../test.txt\", \"r\")\n",
"\n",
"\n",
"# Save the file's contents to a list.\n",
"contents = data.readlines()\n",
"\n",
"print(contents)\n",
"print(type(contents))\n",
"\n",
"# Close that file!\n",
"data.close()"
]
},
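If the trailing newline characters get in the way, one option is to read the whole file as a single string and break it up with the string method splitlines(). This is only a sketch: the sample file and the variable names with_newlines and without_newlines are invented for illustration.

```python
import os
import tempfile

# Hypothetical sample file; note that the last line has no newline.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("This is a test!\nHere's some information:\nOkay, that's all I got.")

# readlines() keeps the newline at the end of each line.
with open(path, "r") as f:
    with_newlines = f.readlines()

# read() returns one big string; splitlines() cuts it at line breaks
# and drops the newline characters.
with open(path, "r") as f:
    without_newlines = f.read().splitlines()

print(with_newlines)
print(without_newlines)
```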
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that there's a newline (\\n) character at the end of each line (except the last one).\n",
"\n",
"To write to a file, use the write() method (this fails if the file was opened in read-only mode)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Let's open a file in writing mode.\n",
"data = open(\"../test.txt\", \"w\")\n",
"\n",
"\n",
"# Write a phrase to the file.\n",
"data.write(\"hello world\")\n",
"data.close()\n",
"\n",
"# Warning: this cell was intentionally not run. Opening a file in 'w' mode erases its existing contents.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that this overwrites everything currently inside the file! To write multiple lines (stored as a list of strings) to a file, use writelines()."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data = open(\"../test.txt\", \"w\")\n",
"\n",
"# Earlier in this notebook we saved the contents of our file to a variable 'contents'.\n",
"# writelines() writes each string in the list as-is; it does not add newlines.\n",
"data.writelines(contents)\n",
"data.close()\n",
"\n",
"# Reopen the file in append mode. File objects have no append() method;\n",
"# in 'a' mode, write() adds the new text to the end of the file.\n",
"data = open(\"../test.txt\", \"a\")\n",
"data.write(\"\\nOne more line.\")\n",
"data.close()"
]
},
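To see how the 'w' and 'a' modes differ without risking a real file, here is a rough sketch that works on a temporary file. The path and the names after_append and after_overwrite are assumptions made for this example only.

```python
import os
import tempfile

# Hypothetical scratch file, so nothing important gets overwritten.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# "w" creates the file, or empties it if it already exists.
with open(path, "w") as f:
    # writelines() does not add newlines, so they are supplied by hand.
    f.writelines(["first\n", "second\n"])

# "a" keeps the existing contents and adds to the end.
with open(path, "a") as f:
    f.write("third\n")

with open(path, "r") as f:
    after_append = f.readlines()

# Reopening in "w" mode discards everything written so far.
with open(path, "w") as f:
    f.write("hello world")

with open(path, "r") as f:
    after_overwrite = f.read()

print(after_append)
print(after_overwrite)
```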
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
8.3 Processing File Contents
\n",
"\n",
"As you might imagine, the contents of files can be pretty unwieldy. Luckily, there are a lot of methods that will make data easier to read!\n",
"\n",
"Sometimes (as with .csv files) you'll want to take a string and break it into a list using a particular separator. split() is a useful tool!"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['3.4', '2.1', '-2.6']\n",
"['3.4', '2.1', '-2.6']\n"
]
}
],
"source": [
"# Let's create a single string that has three pieces of data in it.\n",
"a = '3.4 2.1 -2.6'\n",
"\n",
"\n",
"# The obvious choice for a separator is a space.\n",
"print(a.split(\" \"))\n",
"a = '3.4,2.1,-2.6'\n",
"print(a.split(\",\"))"
]
},
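Real data is often less tidy than the strings above. As a small sketch, here is what happens when the values are separated by uneven runs of spaces; the sample string row is made up for illustration. Calling split() with no argument treats any run of whitespace as a single separator.

```python
# Irregular spacing plus a trailing newline, as you might read from a file.
row = "3.4  2.1   -2.6\n"

# Splitting on a single space leaves empty strings and the newline behind.
by_space = row.split(" ")

# With no argument, split() treats any run of whitespace as one separator
# and ignores whitespace at the start and end of the string.
by_whitespace = row.split()

print(by_space)
print(by_whitespace)
```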
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Everything we read from a text file comes in as a string, so we'll often have to convert values to integers or floats before doing any math with them."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['3.4', '2.1', '-2.6']\n"
]
}
],
"source": [
"# We'll need NumPy for this!\n",
"import numpy as np\n",
"\n",
"\n",
"# Let's look at a typical situation: we've grabbed some numbers from a csv file.\n",
"a = '3.4,2.1,-2.6'\n",
"a = a.split(\",\")\n",
"\n",
"# Note that these are still strings.\n",
"print(a)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[ 3.4 2.1 -2.6]\n"
]
}
],
"source": [
"# We can convert these to floats the way we did before!\n",
"b = np.zeros(len(a))\n",
"for i in range(len(a)):\n",
" b[i] = float(a[i])\n",
"print(b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, we can convert the list to a NumPy array and use the array's built-in astype() method. The type code 'd' is a float (double-precision), and 'l' is an integer (long integer)."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[ 3.4 2.1 -2.6]\n"
]
}
],
"source": [
"bnum = np.array(a).astype(\"d\")\n",
"print(bnum)"
]
},
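Putting the pieces of this section together, here is one possible sketch that reads a small comma-separated file, splits each line, and converts the fields to floats. It uses plain Python lists rather than NumPy so that it runs without extra packages; the file name numbers.csv and the variable names are assumptions for this example.

```python
import os
import tempfile

# Hypothetical comma-separated file with one row of numbers per line.
path = os.path.join(tempfile.mkdtemp(), "numbers.csv")
with open(path, "w") as f:
    f.write("3.4,2.1,-2.6\n1.0,0.5,4.2\n")

rows = []
with open(path, "r") as f:
    for line in f:
        # Remove the trailing newline, then split on the separator.
        fields = line.strip().split(",")
        # Convert each string field to a float.
        rows.append([float(x) for x in fields])

print(rows)
```

If NumPy is available, passing rows to np.array() would turn this list of lists into a two-dimensional array.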
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
8.4 Take-Home Points
\n",
"
\n",
"
The open() function lets you open a file in read, write, or append mode.
\n",
"
Files should always be closed using the close() method.
\n",
"
You can read a single line with readline(), and read all lines into a list with readlines().
\n",
"
The write() method allows you to write a single string, and the writelines() method allows you to write a list of strings.
\n",
"
split() lets you break strings based on defined separators.