{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1>8. Text I/O</h1>\n",
    "\n",
    "<h2>10/20/2023</h2>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>8.0 Last Time...</h2>\n",
    "<ul>\n",
    "    <li>You can use functions within NumPy to find the shape, number of dimensions, and number of elements in any array.</li>\n",
    "    <li><b>numpy.where()</b> will let you find the locations of elements within an array that meet a specified criterion.</li>\n",
    "    <li>Some functions within NumPy allow you to manipulate arrays, such as reshaping, transposing, \"unraveling\", concatenation, and repeating individual array elements.</li>\n",
    "    <li>When you're in a situation with multiple nested loops, often you can instead perform all operations a lot more easily by using array syntax.</li>\n",
    "    <li>Mathematical operations act on arrays elementwise.</li>\n",
    "    <li>When testing within an array, <b>and</b> and <b>or</b> cannot be used; instead, use NumPy's built-in functions such as <b>numpy.logical_and()</b> and <b>numpy.logical_or()</b>.</li>\n",
    "</ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>8.1 File Objects</h2>\n",
    "\n",
    "A file object is just a variable that represents the file within Python. The process of creating a file object is the same general idea as creating any variable: you create it by assignment.\n",
    "\n",
    "For a text file, you can create a file object with the built-in <b>open()</b> function. The first argument to <b>open()</b> gives the filename, and the second sets the mode for the file:\n",
    "<ul>\n",
    "    <li><b>'r'</b>: sets the file to read-only.</li>\n",
    "    <li><b>'w'</b>: sets the file to writing mode (an existing file is emptied as soon as it is opened).</li>\n",
    "    <li><b>'a'</b>: sets the file to append mode (you can only add new things to the end).</li>\n",
    "</ul>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "_io.TextIOWrapper"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Try opening the 'test.txt' file that you added to your server.\n",
    "data = open(\"../test.txt\", \"r\")\n",
    "type(data)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When you're done with a file, you can use the <b>close()</b> method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Close that file back up.\n",
    "data.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>8.2 Text Input/Output</h2>\n",
    "\n",
    "To read a line from a file into a variable, you can use the <b>readline()</b> method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "This is a test!\n",
      "\n",
      "Here's some information:\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# First, open the file.\n",
    "data = open(\"../test.txt\",\"r\")\n",
    "\n",
    "\n",
    "# Assign the first line of text to the variable aline.\n",
    "aline = data.readline()\n",
    "\n",
    "# Each further call to readline() returns the next line of the file.\n",
    "bline = data.readline()\n",
    "\n",
    "# Print those first two lines of text.\n",
    "print(aline)\n",
    "print(bline)\n",
    "\n",
    "# Close the file. (This is good practice!)\n",
    "data.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
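    "You can also loop over the file object to go through the whole file, one line at a time. Each line keeps its trailing newline and <b>print()</b> adds another, which is why the output is double-spaced."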
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "This is a test!\n",
      "\n",
      "Here's some information:\n",
      "\n",
      "IMPORTANT THINGS TO KNOW\n",
      "\n",
      "Okay, that's all I got.\n"
     ]
    }
   ],
   "source": [
    "data = open(\"../test.txt\", \"r\")\n",
    "\n",
    "for line in data:\n",
    "    print(line)\n",
    "    \n",
    "data.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looping is fine for printing, but more often you'll want to read the whole file into a list, with each line as one element. This can be done using <b>readlines()</b> (note the plural!)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['This is a test!\\n', \"Here's some information:\\n\", 'IMPORTANT THINGS TO KNOW\\n', \"Okay, that's all I got.\"]\n",
      "<class 'list'>\n"
     ]
    }
   ],
   "source": [
    "# Let's open the file again.\n",
    "data = open(\"../test.txt\", \"r\")\n",
    "\n",
    "\n",
    "# Save the file's contents to a list.\n",
    "contents = data.readlines()\n",
    "\n",
    "print(contents)\n",
    "print(type(contents))\n",
    "\n",
    "# Close that file!\n",
    "data.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that there's a newline (<b>\\n</b>) character at the end of each line (except the last one).\n",
    "\n",
    "To write to a file, you can use the <b>write()</b> method (this only works if the file was opened in write or append mode, not read-only mode)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Let's open a file in writing mode.\n",
    "data = open(\"../test.txt\", \"w\")\n",
    "\n",
    "\n",
    "# Write a phrase to the file.\n",
    "data.write(\"hello world\")\n",
    "data.close()\n",
    "\n",
    "# Caution: this cell was not run. Opening an existing file in 'w' mode erases its contents.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that this overwrites everything currently inside the file! To write a list of strings to a file, use <b>writelines()</b>. It does not add newlines for you, so each string should already end in <b>\\n</b>."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Earlier in this notebook we saved the contents of our file to the list 'contents'.\n",
    "data = open(\"../test.txt\", \"w\")\n",
    "data.writelines(contents)\n",
    "data.close()\n",
    "\n",
    "# Reopen in append mode. File objects have no append() method; use write() instead.\n",
    "data = open(\"../test.txt\", \"a\")\n",
    "data.write(\"\\nOne more line.\")\n",
    "data.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>8.3 Processing File Contents</h2>\n",
    "\n",
    "As you might imagine, the contents of files can be pretty unwieldy. Luckily, there are a lot of string methods that make the data easier to work with!\n",
    "\n",
    "Sometimes (as with .csv files) you'll want to take a string and break it into a list using a particular separator. <b>split()</b> is a useful tool for this!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['3.4', '2.1', '-2.6']\n",
      "['3.4', '2.1', '-2.6']\n"
     ]
    }
   ],
   "source": [
    "# Let's create a single string that has three pieces of data in it.\n",
    "a = '3.4 2.1 -2.6'\n",
    "\n",
    "\n",
    "# The obvious choice for a separator is a space.\n",
    "print(a.split(\" \"))\n",
    "\n",
    "# For comma-separated data, split on the comma instead.\n",
    "a = '3.4,2.1,-2.6'\n",
    "print(a.split(\",\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Everything we read from a text file is a string, so numeric data has to be converted to integers or floats before we can do math with it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['3.4', '2.1', '-2.6']\n"
     ]
    }
   ],
   "source": [
    "# We'll need NumPy for this!\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "# Let's look at a typical situation: we've grabbed some numbers from a csv file.\n",
    "a = '3.4,2.1,-2.6'\n",
    "a = a.split(\",\")\n",
    "\n",
    "# Note that these are still strings.\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 3.4  2.1 -2.6]\n"
     ]
    }
   ],
   "source": [
    "# We can convert these to floats the way we did before!\n",
    "b = np.zeros(len(a))\n",
    "for i in range(len(a)):\n",
    "    b[i] = float(a[i])\n",
    "print(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively, we can convert the list to an array and use the array's <b>astype()</b> method. The type code <b>'d'</b> is a float (double-precision), and <b>'l'</b> is an integer (long integer)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 3.4  2.1 -2.6]\n"
     ]
    }
   ],
   "source": [
    "bnum = np.array(a).astype(\"d\")\n",
    "print(bnum)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2>8.4 Take-Home Points</h2>\n",
    "<ul>\n",
    "    <li>The <b>open()</b> function lets you open a file in read, write, or append mode.</li>\n",
    "    <li>Files should always be closed using the <b>close()</b> method.</li>\n",
    "    <li>You can read a single line with <b>readline()</b>, and all lines into a list with <b>readlines()</b>.</li>\n",
    "    <li>The <b>write()</b> method allows you to write a single string, and the <b>writelines()</b> method allows you to write a list of strings.</li>\n",
    "    <li><b>split()</b> lets you break strings into lists based on defined separators.</li>\n",
    "</ul>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}