{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

8. Text I/O

\n", "\n", "

10/20/2023

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

8.0 Last Time...

\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

8.1 File Objects

\n", "\n", "A file object is just a variable that represents the file within Python. Creating a file object follows the same general idea as creating any variable: you create it by assignment.\n", "\n", "For a text file, you can create a file object with the built-in open() function. The first argument to open() gives the filename, and the second sets the mode for the file ('r' to read, 'w' to write, 'a' to append):\n", "" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "_io.TextIOWrapper" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Try opening the 'test.txt' file that you added to your server.\n", "data = open(\"../test.txt\", \"r\")\n", "type(data)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When you're done with a file, close it with the close() method." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Close that file back up.\n", "data.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

8.2 Text Input/Output

\n", "\n", "To read a line from a file into a variable, you can use the readline() method." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This is a test!\n", "\n", "Here's some information:\n", "\n" ] } ], "source": [ "# First, open the file.\n", "data = open(\"../test.txt\", \"r\")\n", "\n", "\n", "# Assign the first line of text to the variable aline.\n", "aline = data.readline()\n", "\n", "# Each further call to readline() returns the next line in the file.\n", "bline = data.readline()\n", "\n", "# Print those first two lines of text.\n", "print(aline)\n", "print(bline)\n", "\n", "# Close the file. (This is good practice!)\n", "data.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also loop over the file object to go through the whole file, one line at a time! (Each line keeps its own newline character, which is why print() leaves a blank line after each one.)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This is a test!\n", "\n", "Here's some information:\n", "\n", "IMPORTANT THINGS TO KNOW\n", "\n", "Okay, that's all I got.\n" ] } ], "source": [ "data = open(\"../test.txt\", \"r\")\n", "\n", "# Looping over a file object yields one line per iteration.\n", "for line in data:\n", "    print(line)\n", "\n", "data.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Reading line by line is fairly limiting, though; more often, you'll want to read the whole file and store each line as an element of a list. This can be done using readlines() (note the plural!)." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['This is a test!\\n', \"Here's some information:\\n\", 'IMPORTANT THINGS TO KNOW\\n', \"Okay, that's all I got.\"]\n", "<class 'list'>\n" ] } ], "source": [ "# Let's open the file again.\n", "data = open(\"../test.txt\", \"r\")\n", "\n", "\n", "# Save the file's contents to a list.\n", "contents = data.readlines()\n", "\n", "print(contents)\n", "print(type(contents))\n", "\n", "# Close that file!\n", "data.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that there's a newline (\\n) character at the end of each line (except the last one).\n", "\n", "To write to a file, you can use the write() method (this fails if the file was opened in read-only mode)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's open a file in writing mode.\n", "data = open(\"../test.txt\", \"w\")\n", "\n", "\n", "# Write a phrase to the file.\n", "data.write(\"hello world\")\n", "data.close()\n", "\n", "# Warning: this cell was not run. Running it overwrites the existing contents of test.txt.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that opening a file in 'w' mode overwrites everything currently inside the file! To write a list of strings to a file, use writelines(); it does not add newline characters, so each string needs its own." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = open(\"../test.txt\", \"w\")\n", "\n", "# Earlier in this notebook we saved the contents of our file to a variable 'contents'.\n", "data.writelines(contents)\n", "data.close()\n", "\n", "# Reopen the file in append mode ('a') to add text without overwriting it.\n", "# File objects have no append() method; use write() instead.\n", "data = open(\"../test.txt\", \"a\")\n", "data.write(\"\\nOne more line, appended to the end.\")\n", "data.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

8.3 Processing File Contents

\n", "\n", "As you might imagine, the contents of files can be pretty unwieldy. Luckily, there are a lot of methods that will make data easier to read!\n", "\n", "Sometimes (as with .csv files) you'll want to take a string and break it into a list using a particular separator. split() is a useful tool!" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['3.4', '2.1', '-2.6']\n", "['3.4', '2.1', '-2.6']\n" ] } ], "source": [ "# Let's create a single string that has three pieces of data in it.\n", "a = '3.4 2.1 -2.6'\n", "\n", "\n", "# The obvious choice for a separator is a space.\n", "print(a.split(\" \"))\n", "\n", "# For comma-separated data, split on the comma instead.\n", "a = '3.4,2.1,-2.6'\n", "print(a.split(\",\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since everything we read from a file is a string, we'll sometimes have to convert to integers or floats." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['3.4', '2.1', '-2.6']\n" ] } ], "source": [ "# We'll need NumPy for this!\n", "import numpy as np\n", "\n", "\n", "# Let's look at a typical situation: we've grabbed some numbers from a csv file.\n", "a = '3.4,2.1,-2.6'\n", "a = a.split(\",\")\n", "\n", "# Note that these are still strings.\n", "print(a)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 3.4  2.1 -2.6]\n" ] } ], "source": [ "# We can convert these to floats the way we did before!\n", "b = np.zeros(len(a))\n", "for i in range(len(a)):\n", "    b[i] = float(a[i])\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, we can convert the list to a NumPy array and use its built-in astype() method. The type code 'd' means float (double-precision); 'l' means integer (long integer)." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 3.4  2.1 -2.6]\n" ] } ], "source": [ "# Convert the list of strings to a NumPy array of double-precision floats.\n", "bnum = np.array(a).astype(\"d\")\n", "print(bnum)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

8.4 Take-Home Points

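One way to make sure a file always gets closed is Python's with statement, which closes the file automatically when the indented block ends, even if an error occurs partway through. The sketch below is not taken from the cells above: the file name example.txt and the temporary directory are hypothetical placeholders, chosen so that the real test.txt is left untouched.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory, so test.txt is never touched.
path = os.path.join(tempfile.mkdtemp(), 'example.txt')

# Write two lines; print(..., file=f) adds the newline at the end of each one.
with open(path, 'w') as f:
    print('3.4,2.1,-2.6', file=f)
    print('1.0,0.5,7.2', file=f)

# Read the lines back; the file is closed as soon as the with block ends.
with open(path, 'r') as f:
    lines = f.readlines()

# Strip the trailing newline, split on commas, and convert each piece to a float.
rows = [[float(x) for x in line.strip().split(',')] for line in lines]

print(f.closed)
print(rows)
```

Because the with block handles the cleanup, there is no separate close() call to forget; f.closed reports whether the file object has been closed.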
\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# remember to close files\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 4 }