{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<h1>8. Text I/O</h1>\n",
|
|
"\n",
|
|
"<h2>10/20/2023</h2>"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<h2>8.0 Last Time...</h2>\n",
|
|
"<ul>\n",
|
|
" <li>You can use functions within NumPy to find the shape, number of dimensions, and number of elements in any array.</li>\n",
|
|
" <li><b>numpy.where()</b> will let you find the locations of elements within an array that meet a specified criterion.</li>\n",
|
|
" <li>Some functions within NumPy allow you to manipulate arrays, such as reshaping, transposing, \"unraveling\", concatenation, and repeating individual array elements.</li>\n",
|
|
" <li>When you're in a situation with multiple nested loops, often you can instead perform all operations a lot more easily by using array syntax.</li>\n",
|
|
" <li>Mathematical operations act on arrays elementwise.</li>\n",
|
" <li>When testing within an array, <b>and</b> and <b>or</b> cannot be used; instead, use NumPy's built-in functions, such as <b>numpy.logical_and()</b> and <b>numpy.logical_or()</b>.</li>\n",
|
|
"</ul>"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<h2>8.1 File Objects</h2>\n",
|
|
"\n",
|
|
"A file object is just a variable that represents the file within Python. The process of creating a file object is the same general idea as creating any variable: you create it by assignment.\n",
|
|
"\n",
|
"For a text file, you can create a file object with the built-in <b>open()</b> function. The first argument to <b>open()</b> gives the filename, and the second sets the mode for the file:\n",
|
|
"<ul>\n",
|
|
" <li><b>'r'</b>: sets the file to read-only.</li>\n",
|
" <li><b>'w'</b>: sets the file to writing mode (any existing contents are erased as soon as the file is opened).</li>\n",
|
|
" <li><b>'a'</b>: sets the file to append mode (you can only add new things to the end).</li>\n",
|
|
"</ul>"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"_io.TextIOWrapper"
|
|
]
|
|
},
|
|
"execution_count": 2,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Try opening the 'test.txt' file that you added to your server.\n",
|
|
"data = open(\"../test.txt\", \"r\")\n",
|
|
"type(data)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
"When you're done with a file, close it with the <b>close()</b> method."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Close that file back up.\n",
|
|
"data.close()"
|
|
]
|
|
},
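{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch (not part of the original lesson), Python's <b>with</b> statement closes a file automatically when the indented block ends, so an explicit <b>close()</b> call isn't needed. The filename <b>with_demo.txt</b> is a hypothetical scratch file created just for this illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical scratch file, used only for this illustration.\n",
"with open(\"with_demo.txt\", \"w\") as demo:\n",
"    demo.write(\"This is a test!\\n\")\n",
"\n",
"# The file is closed automatically at the end of each 'with' block.\n",
"with open(\"with_demo.txt\", \"r\") as demo:\n",
"    first_line = demo.readline()\n",
"\n",
"print(first_line)\n",
"\n",
"# The 'closed' attribute confirms that the file is no longer open.\n",
"print(demo.closed)"
]
},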
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<h2>8.2 Text Input/Output</h2>\n",
|
|
"\n",
|
|
"To read a line from a file into a variable, you can use the <b>readline()</b> method."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"This is a test!\n",
|
|
"\n",
|
|
"Here's some information:\n",
|
|
"\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# First, open the file.\n",
|
|
"data = open(\"../test.txt\",\"r\")\n",
|
|
"\n",
|
|
"\n",
|
|
"# Assign the first line of text to the variable aline.\n",
|
|
"aline = data.readline()\n",
|
|
"\n",
|
"# Calling readline() again returns the next line of the file.\n",
|
|
"bline = data.readline()\n",
|
|
"\n",
|
|
"# Print those first two lines of text.\n",
|
|
"print(aline)\n",
|
|
"print(bline)\n",
|
|
"\n",
|
|
"# Close the file. (This is good practice!)\n",
|
|
"data.close()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"You can also write a loop to go through the whole file!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"This is a test!\n",
|
|
"\n",
|
|
"Here's some information:\n",
|
|
"\n",
|
|
"IMPORTANT THINGS TO KNOW\n",
|
|
"\n",
|
|
"Okay, that's all I got.\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"data = open(\"../test.txt\", \"r\")\n",
|
|
"\n",
|
|
"for i in data:\n",
|
|
" print(i)\n",
|
|
" \n",
|
|
"data.close()"
|
|
]
|
|
},
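{
"cell_type": "markdown",
"metadata": {},
"source": [
"The blank lines in the printed output appear because each line read from the file still ends with its own newline character, and <b>print()</b> adds another. Here is a small sketch of one common way around this, assuming a hypothetical scratch file <b>loop_demo.txt</b> so that the example runs on its own."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create a hypothetical scratch file so the example is self-contained.\n",
"demo = open(\"loop_demo.txt\", \"w\")\n",
"demo.write(\"This is a test!\\nHere's some information:\\n\")\n",
"demo.close()\n",
"\n",
"demo = open(\"loop_demo.txt\", \"r\")\n",
"for line in demo:\n",
"    # rstrip(\"\\n\") removes the trailing newline before printing.\n",
"    print(line.rstrip(\"\\n\"))\n",
"demo.close()"
]
},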
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
"That's fairly limiting, though. More often, you'll want to read the whole file and store each line as an element of a list, which you can do with <b>readlines()</b> (note the plural!)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"['This is a test!\\n', \"Here's some information:\\n\", 'IMPORTANT THINGS TO KNOW\\n', \"Okay, that's all I got.\"]\n",
|
|
"<class 'list'>\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# Let's open the file again.\n",
|
|
"data = open(\"../test.txt\", \"r\")\n",
|
|
"\n",
|
|
"\n",
|
|
"# Save the file's contents to a list.\n",
|
|
"contents = data.readlines()\n",
|
|
"\n",
|
|
"print(contents)\n",
|
|
"print(type(contents))\n",
|
|
"\n",
|
|
"# Close that file!\n",
|
|
"data.close()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Note that there's a newline (<b>\\n</b>) character at the end of each line (except the last one).\n",
|
|
"\n",
|
|
"To write to a file, you can use the <b>write()</b> method (obviously this doesn't work if a file is in read-only mode)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Let's open a file in writing mode.\n",
|
|
"data = open(\"../test.txt\", \"w\")\n",
|
|
"\n",
|
|
"\n",
|
|
"# Write a phrase to the file.\n",
|
|
"data.write(\"hello world\")\n",
|
|
"data.close()\n",
|
|
"\n",
|
"# Warning: I didn't run this cell, and you shouldn't either: it will overwrite the file.\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
"Note that opening a file in <b>'w'</b> mode overwrites everything currently inside the file! To write a list of strings to a file, use <b>writelines()</b>."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Earlier in this notebook we saved the contents of our file to a variable 'contents'.\n",
"data = open(\"../test.txt\", \"w\")\n",
"data.writelines(contents)\n",
"data.close()\n",
"\n",
"# Reopen the file in append mode and add one more line with write().\n",
"# (File objects have no append() method; the 'a' mode makes write() add to the end.)\n",
"data = open(\"../test.txt\", \"a\")\n",
"data.write(\"\\nOne more line.\")\n",
"data.close()"
|
|
]
|
|
},
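{
"cell_type": "markdown",
"metadata": {},
"source": [
"One detail worth noting: <b>writelines()</b> does not add newline characters between the list elements, so each string has to carry its own. The sketch below uses a hypothetical scratch file <b>write_demo.txt</b> (so that <b>test.txt</b> is left untouched) and also shows append mode adding to the end of an existing file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Each string ends with its own newline, because writelines() adds none.\n",
"lines = [\"first line\\n\", \"second line\\n\"]\n",
"\n",
"# 'w' creates the file (or erases an existing one) before writing.\n",
"demo = open(\"write_demo.txt\", \"w\")\n",
"demo.writelines(lines)\n",
"demo.close()\n",
"\n",
"# 'a' keeps the existing contents and writes at the end.\n",
"demo = open(\"write_demo.txt\", \"a\")\n",
"demo.write(\"third line\\n\")\n",
"demo.close()\n",
"\n",
"# Read everything back to check that all three lines are there.\n",
"demo = open(\"write_demo.txt\", \"r\")\n",
"print(demo.readlines())\n",
"demo.close()"
]
},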
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<h2>8.3 Processing File Contents</h2>\n",
|
|
"\n",
|
|
"As you might imagine, the contents of files can be pretty unwieldy. Luckily, there are a lot of methods that will make data easier to read!\n",
|
|
"\n",
|
"Sometimes (as with .csv files) you'll want to take a string and break it into a list using a particular separator. <b>split()</b> is a useful tool!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 19,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"['3.4', '2.1', '-2.6']\n",
|
|
"['3.4', '2.1', '-2.6']\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# Let's create a single string that has three pieces of data in it.\n",
|
|
"a = '3.4 2.1 -2.6'\n",
|
|
"\n",
|
|
"\n",
|
|
"# The obvious choice for a separator is a space.\n",
|
|
"print(a.split(\" \"))\n",
|
|
"a = '3.4,2.1,-2.6'\n",
|
|
"print(a.split(\",\"))"
|
|
]
|
|
},
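{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a side note (an addition to the lesson, with a made-up input string): calling <b>split()</b> with no arguments splits on any run of whitespace and ignores leading and trailing whitespace, including a trailing newline, whereas <b>split(\" \")</b> splits on every single space. A quick comparison:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A made-up line with a double space and a trailing newline, as might come from a file.\n",
"a = '3.4  2.1 -2.6\\n'\n",
"\n",
"# Splitting on a single space leaves an empty string and keeps the newline.\n",
"print(a.split(\" \"))\n",
"\n",
"# Splitting with no arguments handles both problems.\n",
"print(a.split())"
]
},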
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
"Everything we read from a text file is a string, so we'll sometimes have to convert the values to integers or floats."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 23,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"['3.4', '2.1', '-2.6']\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# We'll need NumPy for this!\n",
|
|
"import numpy as np\n",
|
|
"\n",
|
|
"\n",
|
|
"# Let's look at a typical situation: we've grabbed some numbers from a csv file.\n",
|
|
"a = '3.4,2.1,-2.6'\n",
|
|
"a = a.split(\",\")\n",
|
|
"\n",
|
|
"# Note that these are still strings.\n",
|
|
"print(a)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 25,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"[ 3.4 2.1 -2.6]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# We can convert these to floats the way we did before!\n",
|
|
"b = np.zeros(len(a))\n",
|
|
"for i in range(len(a)):\n",
|
|
" b[i] = float(a[i])\n",
|
|
"print(b)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
"Alternatively, we can convert the list to a NumPy array and use the array's built-in <b>astype()</b> method. <b>'d'</b> is a float (double-precision), <b>'l'</b> is an integer (long integer)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 27,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"[ 3.4 2.1 -2.6]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"bnum = np.array(a).astype(\"d\")\n",
|
|
"print(bnum)"
|
|
]
|
|
},
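{
"cell_type": "markdown",
"metadata": {},
"source": [
"Putting the pieces together, here is a sketch of reading a small comma-separated file into a NumPy array. The file <b>numbers_demo.csv</b> and its contents are made up for illustration; real data files may need extra handling (headers, missing values) that this sketch ignores."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Write a made-up two-row data file.\n",
"demo = open(\"numbers_demo.csv\", \"w\")\n",
"demo.writelines([\"3.4,2.1,-2.6\\n\", \"1.0,0.5,4.2\\n\"])\n",
"demo.close()\n",
"\n",
"demo = open(\"numbers_demo.csv\", \"r\")\n",
"rows = []\n",
"for line in demo:\n",
"    # strip() removes the trailing newline; split(\",\") separates the values.\n",
"    rows.append(line.strip().split(\",\"))\n",
"demo.close()\n",
"\n",
"# Convert the nested list of strings to a 2-D array of floats.\n",
"values = np.array(rows).astype(\"d\")\n",
"print(values.shape)\n",
"print(values)"
]
},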
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"<h2>8.4 Take-Home Points</h2>\n",
|
|
"<ul>\n",
|
" <li>The built-in <b>open()</b> function lets you open a file in read, write, or append mode.</li>\n",
" <li>Files should always be closed with the <b>close()</b> method when you're done with them.</li>\n",
" <li>You can read a single line with <b>readline()</b>, and all remaining lines (as a list) with <b>readlines()</b>.</li>\n",
" <li>The <b>write()</b> method writes a single string, and the <b>writelines()</b> method writes a list of strings; neither adds newline characters for you.</li>\n",
|
|
" <li><b>split()</b> lets you break strings based on defined separators.</li>\n",
|
|
"</ul>"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# remember to close files\n"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.11.4"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 4
|
|
}
|