{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Assignment 5 (Oct 19)\n",
    "\n",
    "Today we will learn to import data into Python, manipulate 2D tabular data (think spreadsheets) with `pandas`, and save our data to text files like CSV (comma-separated values).\n",
    "\n",
    "Outline:\n",
    "\n",
    "1. Intro to `pandas`\n",
    "2. Data I/O (reading and writing)\n",
    "3. Selecting data\n",
    "4. Manipulating data\n",
    "5. Plotting\n",
    "\n",
    "## Readings (optional)\n",
    "\n",
    "If you find this week's material new or challenging, you may want to read through some or all the following resources while working on your assignment:\n",
    "\n",
    "- [SPIRL Ch. 4.5 Pandas](https://cjtu.github.io/spirl/sp_pandas.html)\n",
    "- [Python for Data Science pandas tutorials](https://wavedatalab.github.io/datawithpython/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below we have our regular imports, but this time with `pandas` which most people alias to `pd`. The Pandas package is included with Anaconda, so you shouldn't need to install anything."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Intro to Pandas\n",
    "\n",
    "[Pandas](https://pandas.pydata.org/) is a package at the heart of most data analysis tasks with Python. It is quick at opening, selecting, manipulating and plotting 2D data with columns and rows (like tables and spreadsheets) which pandas calls **DataFrames**.\n",
    "\n",
    "Below we make a simple `pandas.DataFrame` to see what Pandas can do. One way to make a DataFrame is with two zipped lists, which we saw when making dictionaries in assignment 2."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('apples', 142), ('oranges', 119), ('pears', 248), ('bananas', 84), ('kiwis', 41), ('cherries', 187)]\n"
     ]
    }
   ],
   "source": [
    "fruit = ['apples', 'oranges', 'pears', 'bananas', 'kiwis', 'cherries']\n",
    "count = [142, 119, 248, 84, 41, 187]\n",
    "inventory = list(zip(fruit, count))\n",
    "print(inventory)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To turn this into a Pandas **DataFrame**, we call `pd.DataFrame` and specify the `data` as our zipped lists and also the `columns` names in order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "df is type:  <class 'pandas.core.frame.DataFrame'>\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Fruit</th>\n",
       "      <th>Count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>apples</td>\n",
       "      <td>142</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>oranges</td>\n",
       "      <td>119</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>pears</td>\n",
       "      <td>248</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>bananas</td>\n",
       "      <td>84</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>kiwis</td>\n",
       "      <td>41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>cherries</td>\n",
       "      <td>187</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      Fruit  Count\n",
       "0    apples    142\n",
       "1   oranges    119\n",
       "2     pears    248\n",
       "3   bananas     84\n",
       "4     kiwis     41\n",
       "5  cherries    187"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.DataFrame(data=inventory, columns=['Fruit', 'Count'])\n",
    "print('df is type: ', type(df))\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we see we have a table version of our inventory data with named columns, \"Fruit\", and \"Count\". Let's see what we can learn about these data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Fruit</th>\n",
       "      <th>Count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>apples</td>\n",
       "      <td>142</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>oranges</td>\n",
       "      <td>119</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>pears</td>\n",
       "      <td>248</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>bananas</td>\n",
       "      <td>84</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>kiwis</td>\n",
       "      <td>41</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Fruit  Count\n",
       "0   apples    142\n",
       "1  oranges    119\n",
       "2    pears    248\n",
       "3  bananas     84\n",
       "4    kiwis     41"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()  # show first 5 rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Fruit</th>\n",
       "      <th>Count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>apples</td>\n",
       "      <td>142</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>oranges</td>\n",
       "      <td>119</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>pears</td>\n",
       "      <td>248</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>bananas</td>\n",
       "      <td>84</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>kiwis</td>\n",
       "      <td>41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>cherries</td>\n",
       "      <td>187</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      Fruit  Count\n",
       "0    apples    142\n",
       "1   oranges    119\n",
       "2     pears    248\n",
       "3   bananas     84\n",
       "4     kiwis     41\n",
       "5  cherries    187"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head(6)  # show first n rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "6"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(df)  # number of rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Index(['Fruit', 'Count'], dtype='object')\n",
      "['Fruit', 'Count']\n"
     ]
    }
   ],
   "source": [
    "print(df.columns)  # column names\n",
    "print(list(df.columns))  # as a normal list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Fruit    object\n",
       "Count     int64\n",
       "dtype: object"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.dtypes  # see data types for each column"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each column in a DataFrame is assigned a type. We can change the type of a column with `.astype`. This is sometimes needed to ensure that you are working with numerical data, dates, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Fruit     object\n",
       "Count    float64\n",
       "dtype: object"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = df.astype({\"Fruit\": str, \"Count\": float})\n",
    "df.dtypes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quirk of pandas is that it doesn't recognize the `str` type. All we need to do is spell out `\"string\"` in this case."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Fruit     string\n",
       "Count    float64\n",
       "dtype: object"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = df.astype({\"Fruit\": \"string\", \"Count\": float})\n",
    "df.dtypes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data I/O: Importing a CSV\n",
    "\n",
    "Data I/O (in/out) operations are an essential and often tedious part of analyzing scientific data. Pandas offers some help and can read simple structured data files with little trouble. These are mainly through the `pd.read_` functions listed [here](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html), some of which include:\n",
    "\n",
    "- CSV: `pd.read_csv()`\n",
    "- Fixed-width: `pd.read_fwf()`\n",
    "- Excel: `pd.read_excel()`\n",
    "- and more...\n",
    "\n",
    "Here we read in a **CSV** from a public URL. You can also read in CSVs on your local computer by passing in the path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "      <th>Unnamed: 9</th>\n",
       "      <th>Unnamed: 10</th>\n",
       "      <th>Unnamed: 11</th>\n",
       "      <th>Unnamed: 12</th>\n",
       "      <th>Unnamed: 13</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>307</td>\n",
       "      <td>130</td>\n",
       "      <td>3504</td>\n",
       "      <td>12.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>chevrolet chevelle malibu</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>15.0</td>\n",
       "      <td>8</td>\n",
       "      <td>350</td>\n",
       "      <td>165</td>\n",
       "      <td>3693</td>\n",
       "      <td>11.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>buick skylark 320</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>318</td>\n",
       "      <td>150</td>\n",
       "      <td>3436</td>\n",
       "      <td>11.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>plymouth satellite</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>16.0</td>\n",
       "      <td>8</td>\n",
       "      <td>304</td>\n",
       "      <td>150</td>\n",
       "      <td>3433</td>\n",
       "      <td>12.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>amc rebel sst</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>17.0</td>\n",
       "      <td>8</td>\n",
       "      <td>302</td>\n",
       "      <td>140</td>\n",
       "      <td>3449</td>\n",
       "      <td>10.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>ford torino</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "0  18.0          8           307        130    3504          12.0     70   \n",
       "1  15.0          8           350        165    3693          11.5     70   \n",
       "2  18.0          8           318        150    3436          11.0     70   \n",
       "3  16.0          8           304        150    3433          12.0     70   \n",
       "4  17.0          8           302        140    3449          10.5     70   \n",
       "\n",
       "   origin                        car  Unnamed: 9  Unnamed: 10  Unnamed: 11  \\\n",
       "0       1  chevrolet chevelle malibu         NaN          NaN          NaN   \n",
       "1       1          buick skylark 320         NaN          NaN          NaN   \n",
       "2       1         plymouth satellite         NaN          NaN          NaN   \n",
       "3       1              amc rebel sst         NaN          NaN          NaN   \n",
       "4       1                ford torino         NaN          NaN          NaN   \n",
       "\n",
       "   Unnamed: 12  Unnamed: 13  \n",
       "0          NaN          NaN  \n",
       "1          NaN          NaN  \n",
       "2          NaN          NaN  \n",
       "3          NaN          NaN  \n",
       "4          NaN          NaN  "
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "csv_url = \"https://raw.githubusercontent.com/guipsamora/pandas_exercises/master/05_Merge/Auto_MPG/cars1.csv\"\n",
    "df = pd.read_csv(csv_url)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It seems this CSV was imported with a bunch of empty columns. We can cut those off after importing, or adjust our `read_csv` to only include certain columns. This can be done by giving a list of indices to the `usecols` argument, but a range works just as well. Here we want the first 9 columns, meaning `range(0, 9)` in Python."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>307</td>\n",
       "      <td>130</td>\n",
       "      <td>3504</td>\n",
       "      <td>12.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>chevrolet chevelle malibu</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>15.0</td>\n",
       "      <td>8</td>\n",
       "      <td>350</td>\n",
       "      <td>165</td>\n",
       "      <td>3693</td>\n",
       "      <td>11.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>buick skylark 320</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>318</td>\n",
       "      <td>150</td>\n",
       "      <td>3436</td>\n",
       "      <td>11.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>plymouth satellite</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>16.0</td>\n",
       "      <td>8</td>\n",
       "      <td>304</td>\n",
       "      <td>150</td>\n",
       "      <td>3433</td>\n",
       "      <td>12.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>amc rebel sst</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>17.0</td>\n",
       "      <td>8</td>\n",
       "      <td>302</td>\n",
       "      <td>140</td>\n",
       "      <td>3449</td>\n",
       "      <td>10.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>ford torino</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "0  18.0          8           307        130    3504          12.0     70   \n",
       "1  15.0          8           350        165    3693          11.5     70   \n",
       "2  18.0          8           318        150    3436          11.0     70   \n",
       "3  16.0          8           304        150    3433          12.0     70   \n",
       "4  17.0          8           302        140    3449          10.5     70   \n",
       "\n",
       "   origin                        car  \n",
       "0       1  chevrolet chevelle malibu  \n",
       "1       1          buick skylark 320  \n",
       "2       1         plymouth satellite  \n",
       "3       1              amc rebel sst  \n",
       "4       1                ford torino  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "csv_url = \"https://raw.githubusercontent.com/guipsamora/pandas_exercises/master/05_Merge/Auto_MPG/cars1.csv\"\n",
    "df = pd.read_csv(csv_url, usecols=range(0, 9))\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The pandas `read_csv` can also read in other character delimited file formats, e.g. space or tab-separated (TSV) files. All we need to do is change the delimiter. Ex.\n",
    "\n",
    "`df = pd.read_csv(f, delimiter='\\t')`  # Tab-separated values\n",
    "\n",
    "`df = pd.read_csv(f, delimiter='\\s+')`  # Whitespace separated"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## [Assignment 1] \n",
    "\n",
    "In this assignment, you will practice reading in several datasets of different formats.\n",
    "\n",
    "### 1a) CSV (comma-separated)\n",
    "\n",
    "We can read in CSV files like above with `pd.read_csv`. Try it on [a5data.csv](https://raw.githubusercontent.com/cjtu/spirl/master/spirl/data/a5data.csv) in the cell provided below (either go to the link and copy the url or right-click then copy link url to use in `pd.read_csv`).\n",
    "\n",
    "\n",
    "### 1b) TSV (tab-separated)\n",
    "\n",
    "To read in TSV files, we can also use `pd.read_csv` with the `delimiter='\\t'` argument. Try it on [a5data.tsv](https://raw.githubusercontent.com/cjtu/spirl/master/spirl/data/a5data.tsv) in the cell provided below.\n",
    "\n",
    "\n",
    "### 1c) Excel (.xls, .xlsx)\n",
    "\n",
    "To read in Excel files, we use `pd.read_excel`. Try it on [a5data.xlsx](https://raw.githubusercontent.com/cjtu/spirl/master/spirl/data/a5data.xlsx) in the cell provided below.\n",
    "\n",
    "\n",
    "### 1d) ASCII text file\n",
    "\n",
    "To read in text files with entries separated by some amount of whitespace (tabs, spaces, etc), we can use `pd.read_csv` with the `delimiter='\\s+'` argument. Try it on [a5data.txt](https://raw.githubusercontent.com/cjtu/spirl/master/spirl/data/a5data.txt) in the cell provided below.\n",
    "\n",
    "\n",
    "### 1e) ASCII text file with header rows\n",
    "\n",
    "To have pandas skip rows while importing, we can use the `skiprows` argument of `pd.read_csv`. Try it on [a5data_headers.txt](https://raw.githubusercontent.com/cjtu/spirl/master/spirl/data/a5data_headers.txt) in the space provided below.\n",
    "\n",
    "\n",
    "### 1f) BONUS: FITS table file\n",
    "\n",
    "(Optional): For this question, you will need to read in tabular data in a `.fits` file with `astropy.table.Table()` (note: this only works for FITS tables, not images). Once you've read in the table with astropy, convert it to a `pd.DataFrame` with the `.to_pandas()` method. You will need the following line to import astropy Table: \n",
    "\n",
    "`from astropy.table import Table`.\n",
    "\n",
    "Try it on [a5data.fits](https://raw.githubusercontent.com/cjtu/spirl/master/spirl/data/a5data.fits) in the cell provided below. See other file formats supported by astropy [here](https://docs.astropy.org/en/stable/io/unified.html#built-in-table-readers-writers).\n",
    "\n",
    "This data is from a paper by [Shih-Yun Tang et al. (2019)](https://ui.adsabs.harvard.edu/abs/2019ApJ...877...12T/abstract)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# [1a code below]\n",
    "\n",
    "print(df['Gmag'].max())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# [1b code below]\n",
    "\n",
    "print(df['Gmag'].max())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# [1c code below]\n",
    "\n",
    "print(df['Gmag'].max())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# [1d code below]\n",
    "\n",
    "print(df['Gmag'].max())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# [1e code below]\n",
    "\n",
    "print(df['Gmag'].max())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# [1f code below]\n",
    "from astropy.table import Table\n",
    "\n",
    "print(df['Gmag'].max())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Selecting data with Pandas\n",
    "\n",
    "Pandas offers several ways to access and filter data in DataFrames. The most basic are `[]`, `.loc[]` and `.iloc[]`.\n",
    "\n",
    "### Selecting with square brackets `[]` and dot `.`\n",
    "\n",
    "Selecting with square brackets `[]` gives us the column that we specify."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      chevrolet chevelle malibu\n",
       "1              buick skylark 320\n",
       "2             plymouth satellite\n",
       "3                  amc rebel sst\n",
       "4                    ford torino\n",
       "                 ...            \n",
       "193                ford maverick\n",
       "194                   amc hornet\n",
       "195           chevrolet chevette\n",
       "196              chevrolet woody\n",
       "197                    vw rabbit\n",
       "Name: car, Length: 198, dtype: object"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['car']  # Select a column by its name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively, we can access DataFrame columns with `.columnname`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      chevrolet chevelle malibu\n",
       "1              buick skylark 320\n",
       "2             plymouth satellite\n",
       "3                  amc rebel sst\n",
       "4                    ford torino\n",
       "                 ...            \n",
       "193                ford maverick\n",
       "194                   amc hornet\n",
       "195           chevrolet chevette\n",
       "196              chevrolet woody\n",
       "197                    vw rabbit\n",
       "Name: car, Length: 198, dtype: object"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.car"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There is no major difference between the two methods, it's a matter of personal preference. The dot (`.`) version is less typing, but is harder to use when you have your column name stored as a variable, for example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      18.0\n",
       "1      15.0\n",
       "2      18.0\n",
       "3      16.0\n",
       "4      17.0\n",
       "       ... \n",
       "193    24.0\n",
       "194    22.5\n",
       "195    29.0\n",
       "196    24.5\n",
       "197    29.0\n",
       "Name: mpg, Length: 198, dtype: float64"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "col = 'mpg'\n",
    "df[col]  # This works!\n",
    "# df.col  # This doesn't! AttributeError since \"col\" is not a column name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Selecting with `.loc[]`\n",
    "\n",
    "We can also select labelled columns or rows in our DataFrames with `.loc[]`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.loc['car']\n",
    "# KeyError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "While DataFrame `.loc[]` allows us to select by the row name and column name, we have to do it in the proper order `[rows, columns]`. To select the `car` columns, we can use the `:` as a placeholder meaning \"all rows\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      chevrolet chevelle malibu\n",
       "1              buick skylark 320\n",
       "2             plymouth satellite\n",
       "3                  amc rebel sst\n",
       "4                    ford torino\n",
       "                 ...            \n",
       "193                ford maverick\n",
       "194                   amc hornet\n",
       "195           chevrolet chevette\n",
       "196              chevrolet woody\n",
       "197                    vw rabbit\n",
       "Name: car, Length: 198, dtype: object"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[:, 'car']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So now we know `df['car']` is equivalent to `df.loc[:, 'car']`. We can also select named **rows** of data by their `Index`. Right now our rows are just numbered, but a DataFrame Index could have any unique label (usually `int` or `str`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "mpg                                  18.0\n",
       "cylinders                               8\n",
       "displacement                          307\n",
       "horsepower                            130\n",
       "weight                               3504\n",
       "acceleration                         12.0\n",
       "model                                  70\n",
       "origin                                  1\n",
       "car             chevrolet chevelle malibu\n",
       "Name: 0, dtype: object"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[0]  # First row of data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also select a single cell of data by specifying the row and column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"plymouth 'cuda 340\""
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[11, 'car']  # The car in row 11"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2130"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[29, 'weight']  # The weight in row 29"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also select multiple `Index` or `columns` by passing a list to `.loc[]`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "car           buick skylark 320\n",
       "horsepower                  165\n",
       "cylinders                     8\n",
       "Name: 1, dtype: object"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[1, ['car', 'horsepower', 'cylinders']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "12      chevrolet monte carlo\n",
       "13    buick estate wagon (sw)\n",
       "14      toyota corona mark ii\n",
       "15            plymouth duster\n",
       "16                 amc hornet\n",
       "Name: car, dtype: object"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[range(12, 17), 'car']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we select along one row or column, we get a 1D slice of our DataFrame. We can also select multiple indices and columns to get a smaller DataFrame. The result keeps the same order as the rows and columns we specify."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>car</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>horsepower</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>amc rebel sst</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>plymouth satellite</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>plymouth fury iii</td>\n",
       "      <td>8</td>\n",
       "      <td>215</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>buick skylark 320</td>\n",
       "      <td>8</td>\n",
       "      <td>165</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  car  cylinders horsepower\n",
       "3       amc rebel sst          8        150\n",
       "2  plymouth satellite          8        150\n",
       "7   plymouth fury iii          8        215\n",
       "1   buick skylark 320          8        165"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[[3, 2, 7, 1], ['car', 'cylinders', 'horsepower']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A common task is to make a subset of our DataFrame with only the columns we need. We can do this easily with `.loc[]`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>car</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>mpg</th>\n",
       "      <th>acceleration</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>chevrolet chevelle malibu</td>\n",
       "      <td>8</td>\n",
       "      <td>130</td>\n",
       "      <td>3504</td>\n",
       "      <td>18.0</td>\n",
       "      <td>12.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>buick skylark 320</td>\n",
       "      <td>8</td>\n",
       "      <td>165</td>\n",
       "      <td>3693</td>\n",
       "      <td>15.0</td>\n",
       "      <td>11.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>plymouth satellite</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "      <td>3436</td>\n",
       "      <td>18.0</td>\n",
       "      <td>11.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>amc rebel sst</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "      <td>3433</td>\n",
       "      <td>16.0</td>\n",
       "      <td>12.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ford torino</td>\n",
       "      <td>8</td>\n",
       "      <td>140</td>\n",
       "      <td>3449</td>\n",
       "      <td>17.0</td>\n",
       "      <td>10.5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                         car  cylinders horsepower  weight   mpg  acceleration\n",
       "0  chevrolet chevelle malibu          8        130    3504  18.0          12.0\n",
       "1          buick skylark 320          8        165    3693  15.0          11.5\n",
       "2         plymouth satellite          8        150    3436  18.0          11.0\n",
       "3              amc rebel sst          8        150    3433  16.0          12.0\n",
       "4                ford torino          8        140    3449  17.0          10.5"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sub_df = df.loc[:, ['car', 'cylinders', 'horsepower', 'weight', 'mpg', 'acceleration']]\n",
    "sub_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One limitation of `.loc[]` is that we can't select a range of `str`, in this case our columns. We can choose a range of columns with `.iloc[]` instead. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sub_df = df.loc[:, range(4)]\n",
    "# KeyError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Selecting with `.iloc[]`\n",
    "\n",
    "With `.iloc[]`, we can select data by the numbered `Index` or `columns`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>307</td>\n",
       "      <td>130</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>15.0</td>\n",
       "      <td>8</td>\n",
       "      <td>350</td>\n",
       "      <td>165</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>318</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>16.0</td>\n",
       "      <td>8</td>\n",
       "      <td>304</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>17.0</td>\n",
       "      <td>8</td>\n",
       "      <td>302</td>\n",
       "      <td>140</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    mpg  cylinders  displacement horsepower\n",
       "0  18.0          8           307        130\n",
       "1  15.0          8           350        165\n",
       "2  18.0          8           318        150\n",
       "3  16.0          8           304        150\n",
       "4  17.0          8           302        140"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sub_df = df.iloc[:, range(4)]\n",
    "sub_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So in pandas, both the order of the rows and columns and their labels are important! We can select by labels using `.loc` and by integer index order using `.iloc`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>car</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>horsepower</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>amc rebel sst</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>plymouth satellite</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>plymouth fury iii</td>\n",
       "      <td>8</td>\n",
       "      <td>215</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>buick skylark 320</td>\n",
       "      <td>8</td>\n",
       "      <td>165</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  car  cylinders horsepower\n",
       "3       amc rebel sst          8        150\n",
       "2  plymouth satellite          8        150\n",
       "7   plymouth fury iii          8        215\n",
       "1   buick skylark 320          8        165"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = df.loc[[3, 2, 7, 1], ['car', 'cylinders', 'horsepower']]\n",
    "df2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      18.0\n",
       "1      15.0\n",
       "2      18.0\n",
       "3      16.0\n",
       "4      17.0\n",
       "       ... \n",
       "193    24.0\n",
       "194    22.5\n",
       "195    29.0\n",
       "196    24.5\n",
       "197    29.0\n",
       "Name: mpg, Length: 198, dtype: float64"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sub_df.iloc[:, 0]  # mpg was 1st column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3         amc rebel sst\n",
       "2    plymouth satellite\n",
       "7     plymouth fury iii\n",
       "1     buick skylark 320\n",
       "Name: car, dtype: object"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.iloc[:, 0]  # car is 1st column"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setting a new index in pandas\n",
    "\n",
    "We can choose a column of unique values to be our new `pandas` index. Here, let's use the car names."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>car</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>chevrolet chevelle malibu</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>307</td>\n",
       "      <td>130</td>\n",
       "      <td>3504</td>\n",
       "      <td>12.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>buick skylark 320</th>\n",
       "      <td>15.0</td>\n",
       "      <td>8</td>\n",
       "      <td>350</td>\n",
       "      <td>165</td>\n",
       "      <td>3693</td>\n",
       "      <td>11.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>plymouth satellite</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>318</td>\n",
       "      <td>150</td>\n",
       "      <td>3436</td>\n",
       "      <td>11.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>amc rebel sst</th>\n",
       "      <td>16.0</td>\n",
       "      <td>8</td>\n",
       "      <td>304</td>\n",
       "      <td>150</td>\n",
       "      <td>3433</td>\n",
       "      <td>12.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ford torino</th>\n",
       "      <td>17.0</td>\n",
       "      <td>8</td>\n",
       "      <td>302</td>\n",
       "      <td>140</td>\n",
       "      <td>3449</td>\n",
       "      <td>10.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                            mpg  cylinders  displacement horsepower  weight  \\\n",
       "car                                                                           \n",
       "chevrolet chevelle malibu  18.0          8           307        130    3504   \n",
       "buick skylark 320          15.0          8           350        165    3693   \n",
       "plymouth satellite         18.0          8           318        150    3436   \n",
       "amc rebel sst              16.0          8           304        150    3433   \n",
       "ford torino                17.0          8           302        140    3449   \n",
       "\n",
       "                           acceleration  model  origin  \n",
       "car                                                     \n",
       "chevrolet chevelle malibu          12.0     70       1  \n",
       "buick skylark 320                  11.5     70       1  \n",
       "plymouth satellite                 11.0     70       1  \n",
       "amc rebel sst                      12.0     70       1  \n",
       "ford torino                        10.5     70       1  "
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = df.set_index('car')\n",
    "df2.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can select certain rows by the car name using `.loc`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "mpg             16.0\n",
       "cylinders          8\n",
       "displacement     304\n",
       "horsepower       150\n",
       "weight          3433\n",
       "acceleration    12.0\n",
       "model             70\n",
       "origin             1\n",
       "Name: amc rebel sst, dtype: object"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.loc['amc rebel sst']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Be careful when working with a numerical `Index()`. You can run into confusing situations where the index associated with your rows is not in the same order as your rows. Recall the earlier example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>car</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>horsepower</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>amc rebel sst</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>plymouth satellite</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>plymouth fury iii</td>\n",
       "      <td>8</td>\n",
       "      <td>215</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>buick skylark 320</td>\n",
       "      <td>8</td>\n",
       "      <td>165</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  car  cylinders horsepower\n",
       "3       amc rebel sst          8        150\n",
       "2  plymouth satellite          8        150\n",
       "7   plymouth fury iii          8        215\n",
       "1   buick skylark 320          8        165"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3 = df.loc[[3, 2, 7, 1], ['car', 'cylinders', 'horsepower']]\n",
    "df3.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, our `Index` has the numbers `[3, 2, 7, 1]`, in that order as we can see with the `.index`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Int64Index([3, 2, 7, 1], dtype='int64')"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This means we can select any values in our index with `.loc`, or any *row* number from 0 to 3 with `.iloc`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "car           plymouth fury iii\n",
      "cylinders                     8\n",
      "horsepower                  215\n",
      "Name: 7, dtype: object\n",
      "car           plymouth fury iii\n",
      "cylinders                     8\n",
      "horsepower                  215\n",
      "Name: 7, dtype: object\n"
     ]
    }
   ],
   "source": [
    "print(df3.loc[7])  # Index 7\n",
    "print(df3.iloc[2])  # Row 3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But here we need to be careful because we can pick the 1st row with `.iloc[0]`, but there is no row with index `0` (remember our only indices are `[3, 2, 7, 1]`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "car           amc rebel sst\n",
      "cylinders                 8\n",
      "horsepower              150\n",
      "Name: 3, dtype: object\n"
     ]
    }
   ],
   "source": [
    "print(df3.iloc[0])\n",
    "# print(df3.loc[0])  # KeyError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the numerical `Index` assigned by pandas doesn't matter and we don't want this situation where there are gaps or different orders of our rows, we can call `.reset_index(drop=True)`. This will renumber our DataFrame rows starting from 0."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>car</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>horsepower</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>amc rebel sst</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>plymouth satellite</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>plymouth fury iii</td>\n",
       "      <td>8</td>\n",
       "      <td>215</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>buick skylark 320</td>\n",
       "      <td>8</td>\n",
       "      <td>165</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  car  cylinders horsepower\n",
       "3       amc rebel sst          8        150\n",
       "2  plymouth satellite          8        150\n",
       "7   plymouth fury iii          8        215\n",
       "1   buick skylark 320          8        165"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df3.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>car</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>horsepower</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>amc rebel sst</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>plymouth satellite</td>\n",
       "      <td>8</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>plymouth fury iii</td>\n",
       "      <td>8</td>\n",
       "      <td>215</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>buick skylark 320</td>\n",
       "      <td>8</td>\n",
       "      <td>165</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  car  cylinders horsepower\n",
       "0       amc rebel sst          8        150\n",
       "1  plymouth satellite          8        150\n",
       "2   plymouth fury iii          8        215\n",
       "3   buick skylark 320          8        165"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df4 = df3.reset_index(drop=True)\n",
    "df4.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Summary: Selecting from DataFrames\n",
    "\n",
    "Keep in mind:\n",
    "\n",
    "- `['colname']` or `.colname` selects columns\n",
    "- `.loc[]` selects by a **label** (i.e., a value in `df.index` or `df.columns`)\n",
    "- `.iloc[]` selects by a numbered index in row / column order, starting at 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Adding a new column\n",
    "\n",
    "We can add columns to a DataFrame with `DataFrame['col'] = values`. We need to make sure the new values are the same number of rows as the DataFrame. An easy way to do this is to derive columns from existing DataFrame columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "      <th>lp100km</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>307</td>\n",
       "      <td>130</td>\n",
       "      <td>3504</td>\n",
       "      <td>12.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>chevrolet chevelle malibu</td>\n",
       "      <td>13.07</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>15.0</td>\n",
       "      <td>8</td>\n",
       "      <td>350</td>\n",
       "      <td>165</td>\n",
       "      <td>3693</td>\n",
       "      <td>11.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>buick skylark 320</td>\n",
       "      <td>15.68</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>18.0</td>\n",
       "      <td>8</td>\n",
       "      <td>318</td>\n",
       "      <td>150</td>\n",
       "      <td>3436</td>\n",
       "      <td>11.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>plymouth satellite</td>\n",
       "      <td>13.07</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>16.0</td>\n",
       "      <td>8</td>\n",
       "      <td>304</td>\n",
       "      <td>150</td>\n",
       "      <td>3433</td>\n",
       "      <td>12.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>amc rebel sst</td>\n",
       "      <td>14.70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>17.0</td>\n",
       "      <td>8</td>\n",
       "      <td>302</td>\n",
       "      <td>140</td>\n",
       "      <td>3449</td>\n",
       "      <td>10.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>ford torino</td>\n",
       "      <td>13.84</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "0  18.0          8           307        130    3504          12.0     70   \n",
       "1  15.0          8           350        165    3693          11.5     70   \n",
       "2  18.0          8           318        150    3436          11.0     70   \n",
       "3  16.0          8           304        150    3433          12.0     70   \n",
       "4  17.0          8           302        140    3449          10.5     70   \n",
       "\n",
       "   origin                        car  lp100km  \n",
       "0       1  chevrolet chevelle malibu    13.07  \n",
       "1       1          buick skylark 320    15.68  \n",
       "2       1         plymouth satellite    13.07  \n",
       "3       1              amc rebel sst    14.70  \n",
       "4       1                ford torino    13.84  "
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert miles per gallons to liters per 100 km\n",
    "df['lp100km'] = round(235.214583 / df['mpg'], 2)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Saving to a file\n",
    "\n",
    "It's nice to manipulate and plot our data in Python, but we often want to save the results of our efforts for later.\n",
    "\n",
    "The simplest way to do so is with pandas `.to_csv`. Pandas saves the `Index` by default, but since we just have default indices, you can just omit the index from the csv with `index=False`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
    "fname = '/full/path/to/cars.csv'  # <- change to real path\n",
    "# df.to_csv(fname, index=False)  # <- uncomment to save to file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Other useful DataFrame methods"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "      <th>lp100km</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>193</th>\n",
       "      <td>24.0</td>\n",
       "      <td>6</td>\n",
       "      <td>200</td>\n",
       "      <td>81</td>\n",
       "      <td>3012</td>\n",
       "      <td>17.6</td>\n",
       "      <td>76</td>\n",
       "      <td>1</td>\n",
       "      <td>ford maverick</td>\n",
       "      <td>9.80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>194</th>\n",
       "      <td>22.5</td>\n",
       "      <td>6</td>\n",
       "      <td>232</td>\n",
       "      <td>90</td>\n",
       "      <td>3085</td>\n",
       "      <td>17.6</td>\n",
       "      <td>76</td>\n",
       "      <td>1</td>\n",
       "      <td>amc hornet</td>\n",
       "      <td>10.45</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>195</th>\n",
       "      <td>29.0</td>\n",
       "      <td>4</td>\n",
       "      <td>85</td>\n",
       "      <td>52</td>\n",
       "      <td>2035</td>\n",
       "      <td>22.2</td>\n",
       "      <td>76</td>\n",
       "      <td>1</td>\n",
       "      <td>chevrolet chevette</td>\n",
       "      <td>8.11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>196</th>\n",
       "      <td>24.5</td>\n",
       "      <td>4</td>\n",
       "      <td>98</td>\n",
       "      <td>60</td>\n",
       "      <td>2164</td>\n",
       "      <td>22.1</td>\n",
       "      <td>76</td>\n",
       "      <td>1</td>\n",
       "      <td>chevrolet woody</td>\n",
       "      <td>9.60</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>197</th>\n",
       "      <td>29.0</td>\n",
       "      <td>4</td>\n",
       "      <td>90</td>\n",
       "      <td>70</td>\n",
       "      <td>1937</td>\n",
       "      <td>14.2</td>\n",
       "      <td>76</td>\n",
       "      <td>2</td>\n",
       "      <td>vw rabbit</td>\n",
       "      <td>8.11</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "193  24.0          6           200         81    3012          17.6     76   \n",
       "194  22.5          6           232         90    3085          17.6     76   \n",
       "195  29.0          4            85         52    2035          22.2     76   \n",
       "196  24.5          4            98         60    2164          22.1     76   \n",
       "197  29.0          4            90         70    1937          14.2     76   \n",
       "\n",
       "     origin                 car  lp100km  \n",
       "193       1       ford maverick     9.80  \n",
       "194       1          amc hornet    10.45  \n",
       "195       1  chevrolet chevette     8.11  \n",
       "196       1     chevrolet woody     9.60  \n",
       "197       2           vw rabbit     8.11  "
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.tail()  # last 5 rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>lp100km</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>198.000000</td>\n",
       "      <td>198.000000</td>\n",
       "      <td>198.000000</td>\n",
       "      <td>198.000000</td>\n",
       "      <td>198.000000</td>\n",
       "      <td>198.000000</td>\n",
       "      <td>198.000000</td>\n",
       "      <td>198.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>19.719697</td>\n",
       "      <td>5.898990</td>\n",
       "      <td>223.469697</td>\n",
       "      <td>3177.888889</td>\n",
       "      <td>15.005556</td>\n",
       "      <td>72.818182</td>\n",
       "      <td>1.439394</td>\n",
       "      <td>13.029293</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>5.814254</td>\n",
       "      <td>1.785417</td>\n",
       "      <td>115.181017</td>\n",
       "      <td>934.783733</td>\n",
       "      <td>2.872382</td>\n",
       "      <td>1.865332</td>\n",
       "      <td>0.708085</td>\n",
       "      <td>3.924258</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>9.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>68.000000</td>\n",
       "      <td>1613.000000</td>\n",
       "      <td>8.000000</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>6.720000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>15.000000</td>\n",
       "      <td>4.000000</td>\n",
       "      <td>113.250000</td>\n",
       "      <td>2302.500000</td>\n",
       "      <td>13.000000</td>\n",
       "      <td>71.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>9.650000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>19.000000</td>\n",
       "      <td>6.000000</td>\n",
       "      <td>228.000000</td>\n",
       "      <td>3030.000000</td>\n",
       "      <td>15.000000</td>\n",
       "      <td>73.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>12.380000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>24.375000</td>\n",
       "      <td>8.000000</td>\n",
       "      <td>318.000000</td>\n",
       "      <td>4080.750000</td>\n",
       "      <td>16.800000</td>\n",
       "      <td>74.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>15.680000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>35.000000</td>\n",
       "      <td>8.000000</td>\n",
       "      <td>455.000000</td>\n",
       "      <td>5140.000000</td>\n",
       "      <td>23.500000</td>\n",
       "      <td>76.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>26.130000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              mpg   cylinders  displacement       weight  acceleration  \\\n",
       "count  198.000000  198.000000    198.000000   198.000000    198.000000   \n",
       "mean    19.719697    5.898990    223.469697  3177.888889     15.005556   \n",
       "std      5.814254    1.785417    115.181017   934.783733      2.872382   \n",
       "min      9.000000    3.000000     68.000000  1613.000000      8.000000   \n",
       "25%     15.000000    4.000000    113.250000  2302.500000     13.000000   \n",
       "50%     19.000000    6.000000    228.000000  3030.000000     15.000000   \n",
       "75%     24.375000    8.000000    318.000000  4080.750000     16.800000   \n",
       "max     35.000000    8.000000    455.000000  5140.000000     23.500000   \n",
       "\n",
       "            model      origin     lp100km  \n",
       "count  198.000000  198.000000  198.000000  \n",
       "mean    72.818182    1.439394   13.029293  \n",
       "std      1.865332    0.708085    3.924258  \n",
       "min     70.000000    1.000000    6.720000  \n",
       "25%     71.000000    1.000000    9.650000  \n",
       "50%     73.000000    1.000000   12.380000  \n",
       "75%     74.000000    2.000000   15.680000  \n",
       "max     76.000000    3.000000   26.130000  "
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.describe()  # summary statistics on each numerical column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Fruit</th>\n",
       "      <th>Count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>apples</td>\n",
       "      <td>142.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>apples</td>\n",
       "      <td>142.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>pears</td>\n",
       "      <td>248.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>bananas</td>\n",
       "      <td>84.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>kiwis</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Fruit  Count\n",
       "0   apples  142.0\n",
       "1   apples  142.0\n",
       "2    pears  248.0\n",
       "3  bananas   84.0\n",
       "4    kiwis    NaN"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fruit = ['apples', 'apples', 'pears', 'bananas', 'kiwis', 'cherries']\n",
    "count = [142, 142, 248, 84, None, 187]\n",
    "fruit_df = pd.DataFrame(zip(fruit, count), columns=['Fruit', 'Count'])\n",
    "fruit_df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Fruit</th>\n",
       "      <th>Count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>apples</td>\n",
       "      <td>142.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>apples</td>\n",
       "      <td>142.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>pears</td>\n",
       "      <td>248.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>bananas</td>\n",
       "      <td>84.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>kiwis</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Fruit  Count\n",
       "0   apples  142.0\n",
       "1   apples  142.0\n",
       "2    pears  248.0\n",
       "3  bananas   84.0\n",
       "4    kiwis    0.0"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fruit_df = fruit_df.fillna(0)  # fill missing values with 0\n",
    "fruit_df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Fruit</th>\n",
       "      <th>Count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>apples</td>\n",
       "      <td>142.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>pears</td>\n",
       "      <td>248.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>bananas</td>\n",
       "      <td>84.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>kiwis</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>cherries</td>\n",
       "      <td>187.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      Fruit  Count\n",
       "0    apples  142.0\n",
       "2     pears  248.0\n",
       "3   bananas   84.0\n",
       "4     kiwis    0.0\n",
       "5  cherries  187.0"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fruit_df = fruit_df.drop_duplicates()\n",
    "fruit_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Filtering DataFrames\n",
    "\n",
    "We often want to select data within a certain range or on a specific condition. There are a couple ways to do this. The simplest (but slowest and most error-prone) way is to use a loop.\n",
    "\n",
    "Say we want to select only cars with 4 cylinders. Here we use `df.iterrows()` which will return the `Index` of the row and the row itself on each iteration in the loop."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['toyota corona mark ii', 'datsun pl510', 'volkswagen 1131 deluxe sedan', 'peugeot 504', 'audi 100 ls']\n"
     ]
    }
   ],
   "source": [
    "cars_4cyl = []\n",
    "for i, row in df.iterrows():\n",
    "    if row['cylinders'] == 4:\n",
    "        cars_4cyl.append(row['car'])\n",
    "print(cars_4cyl[:5])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we have a list of 4 cylinder cars, but we've lost all other information stored about them in the DataFrame. We would need to make a separate list for each column we want to store. \n",
    "\n",
    "Instead of using a loop, we can do the **vectorized** solution which keeps our data in a DataFrame and selects info based on our condition (`row['cylinders'] == 4`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Max cylinders in DataFrame 4\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "      <th>lp100km</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>24.0</td>\n",
       "      <td>4</td>\n",
       "      <td>113</td>\n",
       "      <td>95</td>\n",
       "      <td>2372</td>\n",
       "      <td>15.0</td>\n",
       "      <td>70</td>\n",
       "      <td>3</td>\n",
       "      <td>toyota corona mark ii</td>\n",
       "      <td>9.80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>27.0</td>\n",
       "      <td>4</td>\n",
       "      <td>97</td>\n",
       "      <td>88</td>\n",
       "      <td>2130</td>\n",
       "      <td>14.5</td>\n",
       "      <td>70</td>\n",
       "      <td>3</td>\n",
       "      <td>datsun pl510</td>\n",
       "      <td>8.71</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>26.0</td>\n",
       "      <td>4</td>\n",
       "      <td>97</td>\n",
       "      <td>46</td>\n",
       "      <td>1835</td>\n",
       "      <td>20.5</td>\n",
       "      <td>70</td>\n",
       "      <td>2</td>\n",
       "      <td>volkswagen 1131 deluxe sedan</td>\n",
       "      <td>9.05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>25.0</td>\n",
       "      <td>4</td>\n",
       "      <td>110</td>\n",
       "      <td>87</td>\n",
       "      <td>2672</td>\n",
       "      <td>17.5</td>\n",
       "      <td>70</td>\n",
       "      <td>2</td>\n",
       "      <td>peugeot 504</td>\n",
       "      <td>9.41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>24.0</td>\n",
       "      <td>4</td>\n",
       "      <td>107</td>\n",
       "      <td>90</td>\n",
       "      <td>2430</td>\n",
       "      <td>14.5</td>\n",
       "      <td>70</td>\n",
       "      <td>2</td>\n",
       "      <td>audi 100 ls</td>\n",
       "      <td>9.80</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "14  24.0          4           113         95    2372          15.0     70   \n",
       "18  27.0          4            97         88    2130          14.5     70   \n",
       "19  26.0          4            97         46    1835          20.5     70   \n",
       "20  25.0          4           110         87    2672          17.5     70   \n",
       "21  24.0          4           107         90    2430          14.5     70   \n",
       "\n",
       "    origin                           car  lp100km  \n",
       "14       3         toyota corona mark ii     9.80  \n",
       "18       3                  datsun pl510     8.71  \n",
       "19       2  volkswagen 1131 deluxe sedan     9.05  \n",
       "20       2                   peugeot 504     9.41  \n",
       "21       2                   audi 100 ls     9.80  "
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_4cyl = df[df.cylinders == 4]\n",
    "#  df_4cyl = df[df['cylinders'] == 4]  # Equivalent to the above\n",
    "print('Max cylinders in DataFrame', df_4cyl.cylinders.max())\n",
    "df_4cyl.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To break down what we are doing here, we can look at what `df.cylinders == 4` gives us."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      False\n",
       "1      False\n",
       "2      False\n",
       "3      False\n",
       "4      False\n",
       "       ...  \n",
       "193    False\n",
       "194    False\n",
       "195     True\n",
       "196     True\n",
       "197     True\n",
       "Name: cylinders, Length: 198, dtype: bool"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.cylinders == 4"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is a **boolean index**, meaning it will be `True` or `False` based on the condition we supplied (whether there are 4 cylinders or not). \n",
    "\n",
    "When we pass a boolean index to a DataFrame in square brackets (`[]`), it's similar to supplying specific labels or indices as we learned above. The only difference is it will select only the rows in the DataFrame that have `True`, e.g. rows where the car has 4 cylinders.\n",
    "\n",
    "We can supply conditions with any of the comparison operators (`==`, `<`, `<=`, `>`, `>=`). Say we want to find cars that get the fewest miles per gallon (mpg < 12):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "      <th>lp100km</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>10.0</td>\n",
       "      <td>8</td>\n",
       "      <td>360</td>\n",
       "      <td>215</td>\n",
       "      <td>4615</td>\n",
       "      <td>14.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>ford f250</td>\n",
       "      <td>23.52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>10.0</td>\n",
       "      <td>8</td>\n",
       "      <td>307</td>\n",
       "      <td>200</td>\n",
       "      <td>4376</td>\n",
       "      <td>15.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>chevy c20</td>\n",
       "      <td>23.52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>11.0</td>\n",
       "      <td>8</td>\n",
       "      <td>318</td>\n",
       "      <td>210</td>\n",
       "      <td>4382</td>\n",
       "      <td>13.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>dodge d200</td>\n",
       "      <td>21.38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>9.0</td>\n",
       "      <td>8</td>\n",
       "      <td>304</td>\n",
       "      <td>193</td>\n",
       "      <td>4732</td>\n",
       "      <td>18.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>hi 1200d</td>\n",
       "      <td>26.13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>67</th>\n",
       "      <td>11.0</td>\n",
       "      <td>8</td>\n",
       "      <td>429</td>\n",
       "      <td>208</td>\n",
       "      <td>4633</td>\n",
       "      <td>11.0</td>\n",
       "      <td>72</td>\n",
       "      <td>1</td>\n",
       "      <td>mercury marquis</td>\n",
       "      <td>21.38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>103</th>\n",
       "      <td>11.0</td>\n",
       "      <td>8</td>\n",
       "      <td>400</td>\n",
       "      <td>150</td>\n",
       "      <td>4997</td>\n",
       "      <td>14.0</td>\n",
       "      <td>73</td>\n",
       "      <td>1</td>\n",
       "      <td>chevrolet impala</td>\n",
       "      <td>21.38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>124</th>\n",
       "      <td>11.0</td>\n",
       "      <td>8</td>\n",
       "      <td>350</td>\n",
       "      <td>180</td>\n",
       "      <td>3664</td>\n",
       "      <td>11.0</td>\n",
       "      <td>73</td>\n",
       "      <td>1</td>\n",
       "      <td>oldsmobile omega</td>\n",
       "      <td>21.38</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "25   10.0          8           360        215    4615          14.0     70   \n",
       "26   10.0          8           307        200    4376          15.0     70   \n",
       "27   11.0          8           318        210    4382          13.5     70   \n",
       "28    9.0          8           304        193    4732          18.5     70   \n",
       "67   11.0          8           429        208    4633          11.0     72   \n",
       "103  11.0          8           400        150    4997          14.0     73   \n",
       "124  11.0          8           350        180    3664          11.0     73   \n",
       "\n",
       "     origin               car  lp100km  \n",
       "25        1         ford f250    23.52  \n",
       "26        1         chevy c20    23.52  \n",
       "27        1        dodge d200    21.38  \n",
       "28        1          hi 1200d    26.13  \n",
       "67        1   mercury marquis    21.38  \n",
       "103       1  chevrolet impala    21.38  \n",
       "124       1  oldsmobile omega    21.38  "
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df.mpg < 12]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also string conditions together by separating them with parentheses (`()`) and the **union** `&` or **intersection** `|` operators.\n",
    "\n",
    "- Union (`&`): select where condition1 AND condition2 are `True`\n",
    "- Intersection (`|`): select where condition1 OR condition2 are `True`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "      <th>lp100km</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>26.0</td>\n",
       "      <td>4</td>\n",
       "      <td>97</td>\n",
       "      <td>46</td>\n",
       "      <td>1835</td>\n",
       "      <td>20.5</td>\n",
       "      <td>70</td>\n",
       "      <td>2</td>\n",
       "      <td>volkswagen 1131 deluxe sedan</td>\n",
       "      <td>9.05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>56</th>\n",
       "      <td>26.0</td>\n",
       "      <td>4</td>\n",
       "      <td>91</td>\n",
       "      <td>70</td>\n",
       "      <td>1955</td>\n",
       "      <td>20.5</td>\n",
       "      <td>71</td>\n",
       "      <td>1</td>\n",
       "      <td>plymouth cricket</td>\n",
       "      <td>9.05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>102</th>\n",
       "      <td>26.0</td>\n",
       "      <td>4</td>\n",
       "      <td>97</td>\n",
       "      <td>46</td>\n",
       "      <td>1950</td>\n",
       "      <td>21.0</td>\n",
       "      <td>73</td>\n",
       "      <td>2</td>\n",
       "      <td>volkswagen super beetle</td>\n",
       "      <td>9.05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>131</th>\n",
       "      <td>32.0</td>\n",
       "      <td>4</td>\n",
       "      <td>71</td>\n",
       "      <td>65</td>\n",
       "      <td>1836</td>\n",
       "      <td>21.0</td>\n",
       "      <td>74</td>\n",
       "      <td>3</td>\n",
       "      <td>toyota corolla 1200</td>\n",
       "      <td>7.35</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "19   26.0          4            97         46    1835          20.5     70   \n",
       "56   26.0          4            91         70    1955          20.5     71   \n",
       "102  26.0          4            97         46    1950          21.0     73   \n",
       "131  32.0          4            71         65    1836          21.0     74   \n",
       "\n",
       "     origin                           car  lp100km  \n",
       "19        2  volkswagen 1131 deluxe sedan     9.05  \n",
       "56        1              plymouth cricket     9.05  \n",
       "102       2       volkswagen super beetle     9.05  \n",
       "131       3           toyota corolla 1200     7.35  "
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "light_high_accel = df[(df.weight < 2000) & (df.acceleration > 20)]\n",
    "light_high_accel.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "      <th>lp100km</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>190</th>\n",
       "      <td>14.5</td>\n",
       "      <td>8</td>\n",
       "      <td>351</td>\n",
       "      <td>152</td>\n",
       "      <td>4215</td>\n",
       "      <td>12.8</td>\n",
       "      <td>76</td>\n",
       "      <td>1</td>\n",
       "      <td>ford gran torino</td>\n",
       "      <td>16.22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>191</th>\n",
       "      <td>22.0</td>\n",
       "      <td>6</td>\n",
       "      <td>225</td>\n",
       "      <td>100</td>\n",
       "      <td>3233</td>\n",
       "      <td>15.4</td>\n",
       "      <td>76</td>\n",
       "      <td>1</td>\n",
       "      <td>plymouth valiant</td>\n",
       "      <td>10.69</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>192</th>\n",
       "      <td>22.0</td>\n",
       "      <td>6</td>\n",
       "      <td>250</td>\n",
       "      <td>105</td>\n",
       "      <td>3353</td>\n",
       "      <td>14.5</td>\n",
       "      <td>76</td>\n",
       "      <td>1</td>\n",
       "      <td>chevrolet nova</td>\n",
       "      <td>10.69</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>193</th>\n",
       "      <td>24.0</td>\n",
       "      <td>6</td>\n",
       "      <td>200</td>\n",
       "      <td>81</td>\n",
       "      <td>3012</td>\n",
       "      <td>17.6</td>\n",
       "      <td>76</td>\n",
       "      <td>1</td>\n",
       "      <td>ford maverick</td>\n",
       "      <td>9.80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>194</th>\n",
       "      <td>22.5</td>\n",
       "      <td>6</td>\n",
       "      <td>232</td>\n",
       "      <td>90</td>\n",
       "      <td>3085</td>\n",
       "      <td>17.6</td>\n",
       "      <td>76</td>\n",
       "      <td>1</td>\n",
       "      <td>amc hornet</td>\n",
       "      <td>10.45</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "190  14.5          8           351        152    4215          12.8     76   \n",
       "191  22.0          6           225        100    3233          15.4     76   \n",
       "192  22.0          6           250        105    3353          14.5     76   \n",
       "193  24.0          6           200         81    3012          17.6     76   \n",
       "194  22.5          6           232         90    3085          17.6     76   \n",
       "\n",
       "     origin               car  lp100km  \n",
       "190       1  ford gran torino    16.22  \n",
       "191       1  plymouth valiant    10.69  \n",
       "192       1    chevrolet nova    10.69  \n",
       "193       1     ford maverick     9.80  \n",
       "194       1        amc hornet    10.45  "
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "heavy_or_8cyl = df[(df.weight > 3000) | (df.cylinders == 8)]\n",
    "heavy_or_8cyl.tail()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we can take the opposite of a comparison with a preceding **negation** operator (`~`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "      <th>lp100km</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>17.0</td>\n",
       "      <td>8</td>\n",
       "      <td>302</td>\n",
       "      <td>140</td>\n",
       "      <td>3449</td>\n",
       "      <td>10.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>ford torino</td>\n",
       "      <td>13.84</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>15.0</td>\n",
       "      <td>8</td>\n",
       "      <td>429</td>\n",
       "      <td>198</td>\n",
       "      <td>4341</td>\n",
       "      <td>10.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>ford galaxie 500</td>\n",
       "      <td>15.68</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>21.0</td>\n",
       "      <td>6</td>\n",
       "      <td>200</td>\n",
       "      <td>85</td>\n",
       "      <td>2587</td>\n",
       "      <td>16.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>ford maverick</td>\n",
       "      <td>11.20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>10.0</td>\n",
       "      <td>8</td>\n",
       "      <td>360</td>\n",
       "      <td>215</td>\n",
       "      <td>4615</td>\n",
       "      <td>14.0</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>ford f250</td>\n",
       "      <td>23.52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>25.0</td>\n",
       "      <td>4</td>\n",
       "      <td>98</td>\n",
       "      <td>?</td>\n",
       "      <td>2046</td>\n",
       "      <td>19.0</td>\n",
       "      <td>71</td>\n",
       "      <td>1</td>\n",
       "      <td>ford pinto</td>\n",
       "      <td>9.41</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "4   17.0          8           302        140    3449          10.5     70   \n",
       "5   15.0          8           429        198    4341          10.0     70   \n",
       "17  21.0          6           200         85    2587          16.0     70   \n",
       "25  10.0          8           360        215    4615          14.0     70   \n",
       "32  25.0          4            98          ?    2046          19.0     71   \n",
       "\n",
       "    origin               car  lp100km  \n",
       "4        1       ford torino    13.84  \n",
       "5        1  ford galaxie 500    15.68  \n",
       "17       1     ford maverick    11.20  \n",
       "25       1         ford f250    23.52  \n",
       "32       1        ford pinto     9.41  "
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Use str.contains to match a sub-string\n",
    "ford_cars = df[df.car.str.contains('ford')]\n",
    "ford_cars.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mpg</th>\n",
       "      <th>cylinders</th>\n",
       "      <th>displacement</th>\n",
       "      <th>horsepower</th>\n",
       "      <th>weight</th>\n",
       "      <th>acceleration</th>\n",
       "      <th>model</th>\n",
       "      <th>origin</th>\n",
       "      <th>car</th>\n",
       "      <th>lp100km</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>22.0</td>\n",
       "      <td>6</td>\n",
       "      <td>198</td>\n",
       "      <td>95</td>\n",
       "      <td>2833</td>\n",
       "      <td>15.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>plymouth duster</td>\n",
       "      <td>10.69</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>18.0</td>\n",
       "      <td>6</td>\n",
       "      <td>199</td>\n",
       "      <td>97</td>\n",
       "      <td>2774</td>\n",
       "      <td>15.5</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>amc hornet</td>\n",
       "      <td>13.07</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>26.0</td>\n",
       "      <td>4</td>\n",
       "      <td>97</td>\n",
       "      <td>46</td>\n",
       "      <td>1835</td>\n",
       "      <td>20.5</td>\n",
       "      <td>70</td>\n",
       "      <td>2</td>\n",
       "      <td>volkswagen 1131 deluxe sedan</td>\n",
       "      <td>9.05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>25.0</td>\n",
       "      <td>4</td>\n",
       "      <td>110</td>\n",
       "      <td>87</td>\n",
       "      <td>2672</td>\n",
       "      <td>17.5</td>\n",
       "      <td>70</td>\n",
       "      <td>2</td>\n",
       "      <td>peugeot 504</td>\n",
       "      <td>9.41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>25.0</td>\n",
       "      <td>4</td>\n",
       "      <td>104</td>\n",
       "      <td>95</td>\n",
       "      <td>2375</td>\n",
       "      <td>17.5</td>\n",
       "      <td>70</td>\n",
       "      <td>2</td>\n",
       "      <td>saab 99e</td>\n",
       "      <td>9.41</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     mpg  cylinders  displacement horsepower  weight  acceleration  model  \\\n",
       "15  22.0          6           198         95    2833          15.5     70   \n",
       "16  18.0          6           199         97    2774          15.5     70   \n",
       "19  26.0          4            97         46    1835          20.5     70   \n",
       "20  25.0          4           110         87    2672          17.5     70   \n",
       "22  25.0          4           104         95    2375          17.5     70   \n",
       "\n",
       "    origin                           car  lp100km  \n",
       "15       1               plymouth duster    10.69  \n",
       "16       1                    amc hornet    13.07  \n",
       "19       2  volkswagen 1131 deluxe sedan     9.05  \n",
       "20       2                   peugeot 504     9.41  \n",
       "22       2                      saab 99e     9.41  "
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "high_accel_not_ford = df[(df.acceleration > 15) & ~df.car.str.contains('ford')]\n",
    "high_accel_not_ford.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plotting with DataFrame.plot\n",
    "\n",
    "Pandas gives you a convenient interface into plotting with the `.plot` method. It even labels your axes for you!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAEGCAYAAACUzrmNAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAtcUlEQVR4nO3df5TcVZnn8ffTP+i0+QEhQCZJJ8Q1sHMIP9qlzQSjrOLsDOs4BBWyjIvBHWbicvAYBxTEP9TBZQdQ4ozODjsRHYiK2ICarAM6QMZV3JDYcTrBwLhmBJJOMiH0JiGNSae7+tk/6tukqlPfqu+trm/Xj/68zgldfftb1be6OPXUvc+9zzV3R0REpJimandARERqn4KFiIiUpGAhIiIlKViIiEhJChYiIlJSS7U7kJYzzjjDFy5cWO1uiIjUla1bt77i7meObW/YYLFw4UJ6enqq3Q0RkbpiZi8Vatc0lIiIlKRgISIiJSlYiIhISQoWIiJSkoKFiIiUpGDRQPoHBtm2+xD9A4PV7oqINJiGXTo72azv3cOtj26ntamJoZER7n7/hVzROa/a3RKRBqGRRQPoHxjk1ke3c2xohCODwxwbGuGWR7drhCEiFaNg0QD6Dh6ltSn/pWxtaqLv4NEq9UhEGo2CRQPomNnO0MhIXtvQyAgdM9ur1CMRaTQKFg1g1rQ27n7/hUxpbWJ6WwtTWpu4+/0XMmtaW7W7JiINQgnuBnFF5zyWLTqDvoNH6ZjZrkAhIhWlYNFAZk1rU5AQkVSkOg1lZi+a2bNm1mtmPVHb6Wb2hJn9Kvo6M+f628xsp5n90sx+P6f94uhxdprZl8zM0ux3WrQPQkTq1UTkLN7p7p3u3hV9/0ngKXc/B3gq+h4zOw+4BlgMXA78jZk1R/e5F1gFnBP9u3wC+l1R63v3sOyujVx732aW3bWRDb17qt0lEZHEqpHgXg48EN1+ALgyp/0hdx909xeAncASM5sDzHD3Te7uwLqc+9QF7YMQkXqXdrBw4B/MbKuZrYraZrv7PoDo61lR+zxgd859+6K2edHtse0nMbNVZtZjZj0HDhyo4NMYn3L3QfS80M+af/glPS/0p9k9EZGS0k5wL3P3vWZ2FvCEmf1zkWsL5SG8SPvJje5rgbUAXV1dBa+phnL2QVx73zM8vTMbJL60cSdvXzSLr//J0lT7KSISJ9WRhbvvjb6+DHwXWALsj6aWiL6+HF3eB8zPuXsHsDdq7yjQXjdC90H0vND/eqAY9ZOd/RphiEjVpDayMLOpQJO7H4lu/x5wO7ABuA64M/q6PrrLBuBBM1sDzCWbyN7i7hkzO2JmS4HNwErgy2n1Oy0h+yB+/KtXYtu73jgrrS6KiMRKcxpqNvDdaJVrC/Cgu//AzH4GdJvZ9cAu4GoAd99hZt3Ac8AwcKO7Z6LHugG4H2gHHo/+1Z2k+yAu6jg1qF1EJG2pBQt3/zVwUYH2fuBdMfe5A7ijQHsPcH6l+1irzpg+hWaDTE7Wpdmy7SIi1aDaUDWoY2Y7rS1jVk+1NKkwoIhUjYJFDVJhQBGpNaoNVaNUGFBEaomCRQ1TYUARqRWahhIRkZIULBqIqtqKSFo0DTVG/8BgXeYJ1vfu4ZZHttFsTWR8hM9fdRFXdBYsoSUiEkzBIsf63j3c+uh2WpuaGBoZ4e73X1gXb7j9A4Pc3N3L8AhAdh/jTd29LFt0Rl0FPBGpXZqGitRzGfEde1+NAsUJwyPZdhGRSlCwiJRbRrw2xBXYrZnCuyJS5xQsIuWUEa8Vc08t3Me4dkg/Ga5ku0hjUc4iMrpr+pYxOYt6mPN/7XiG1mZjKKeYVGuz8drxTMHrs8nw7TQ3GZkR5/NXVTY3U6+5HxGJp2CRI+1d02mttJp6SnNeoAAYyjhTT2k+6dr+gUE+/vC2vOtvfnhbxZLhubmfY2RHarc8ul3JdpE6p2AxRuiu6aQBIM1P268dz9DWbAzmBIC2mJHFjr2HCwaWHXsPc+m5Z510fajR3M9ooIATuR8FC5H6pWAxDkkDQNqftjtmtmNN+TXNrcli8i2FTqkt1h7el3rN/YhIPCW4yxSy1DbtlVYhVWoXz53BmOrntDRl2ye6LyJSPzSyKFPIdMtEfNpOmm+ZNa2NNSs6+cSY3d6VfDNXxVyRxqNgUaaQADBRK62S5lsm4s1cFXNFGouCRZlCA0CtfdrWm7mIhEg9WJhZM9AD7HH395jZZ4E/BQ5El3zK3R+Lrr0NuJ5sgaOPuvsPo/aLgfuBduAxYLW7V317cmgASGulVbnXp6mW+iIi4zcRI4vVwPNAbgb1i+7+hdyLzOw84BpgMTAXeNLMznX3DHAvsAp4hmywuBx4fAL6XlJan9BDl9rW0ka4WuqLiFRGqquhzKwD+APgvgSXLwcecvdBd38B2AksMbM5wAx33xSNJtYBV6bV51oQWtSwnCKIaZXjqOeCjCISL+2ls38J3AKMqYnKR8xsu5l9zcxmRm3zgN051/RFbfOi22PbT2Jmq8ysx8x6Dhw4UOiSuhC61Lbv4FF8JH9Wzkc89vr1vXt4651P8Udrn+Gtdz7Fht49JfuUNLjUd0FGEYmTWrAws/cAL7v71jE/uhd4E9AJ7APuGb1LgYfxIu0nN7qvdfcud+8688wzy+p3LeiY2c6x4fzd18eGM7FLbaee0py3extgsEi5j5u7exkcdn4zlGFw2Lmpu7doEFjfu4dld23k2vs2s+yujUWDizbliTSmNEcWy4ArzOxF4CHgMjP7hrvvd/eMu48AXwGWRNf3AfNz7t8B7I3aOwq0N7Sx+fti+fzRQoK54goJhp59ETqtpE15Io0ptQS3u98G3AZgZu8APu7u15rZHHffF132XuAX0e0NwINmtoZsgvscYIu7Z8zsiJktBTYDK4Evp9XvWtB38CjtrS0cGRx+va29tSW2vlJIIcFXjw4V/J1x7eXUeqq1ZcIiMn7V2Gdxt5l1kp1KehH4MIC77zCzbuA5YBi4MVoJBXADJ5bOPk6NrISCsCWiSa8NncoJKSQ4o7214GPEtZc7raR9HCKNZUKChbv/CPhRdPuDRa67A7ijQHsPcH5K3StbyBLRkGtDN/yFFBJcPHcGRn7Sx4ivDVXP53yISOVoB3eZQirJllN1NmQqZ9a0NlZc3MG6Z3a93raiqyP2Pk3RoUe53xdzRec8zpszg97dh+icfxqLZk8ver2INB5VnS1TyBLRcpeTzprWxkXzT0s0vdW9tS+vrbunr2ASesfew3mBAiAzkj3PIs763j2856+f5s//13O856+fTrTUVkQai4JFmULm8tNeThoWjMLOs9AmOxEBBYuyhSwRTXs5aUgwGs1Z5CqWs9AmOxEB5SzGJSSvkOZy0lnT2ljR1cG6TclyFi3NlrfUtqU5PmdR7qhIhQRFGotGFuOUNK8Qem2I/oFBvrVld17bt7bsjj21b0pL/v6LKS3NsSOFckZFITu+RaQ+aGTRAHbsPVxwU96OvYe59Nyz8to7ZrZzdGg4r+3o0HDRkULIqCjt88ZFpDoULBpCWNLax+y08Jzr4qaPkm6yK2fHdxxNZYnUDgWLBrB47gxamsir+dTSVDhpXWzp7MHfDI37HIpKrfzSmRgitUU5iwYwa1oba1Z00tZivKG1mbYWY82KzoKfxuNqQO05eLQiS2RHNwjmKpZsL0TLdUVqj0YWDSJpXmFG+ykF2weHRyoyfRS3QXD1u85N/DiVnMoSkcrQyKKBJFltNTpllaulCd626IyKTB9VYl+GzsQQqT0KFpPMrGltfGDJgry2D/zOAhbNnl50iWzSk/JCD26K66POxBCpLZqGmmT6BwZ5cMuuvLYHN+9i9bvOjZ3KCk02hxzcFEdnYojUFo0sGsjO/Ud4pGc3O/cfib2m1El5Y6eyQpPNowc35Ro9uClUWpsYRSScRhYN4tPfezavRPnKSxZw+/ILClwZ9ym/cHtosln5BpHGpJFFA9i5/0heoABYt2lXwRHG3FMLv2nHtYfu+Fa+QaQxaWTRAHp3H4ptH3tQ0d7Dxwpeu/fwsdhDjczyd3xnv4+nfINI41GwaACd808LaA+fhprS0sxQ5sToYrTwYKnT+xQkRBpH6tNQZtZsZv9kZt+Pvj/dzJ4ws19FX2fmXHubme00s1+a2e/ntF9sZs9GP/uSlfpoO8ksmj2dlZfkL4ddecmCgiOFcqahlIMQkYnIWawGns/5/pPAU+5+DvBU9D1mdh5wDbAYuBz4GzMbraV9L7AKOCf6d/kE9Luu3L78Ah758FI+etkiHvnw0pjkNrx2PMOU1vyXfUprE68dzxS8XjkIEYGUp6HMrAP4A+AO4KaoeTnwjuj2A8CPgFuj9ofcfRB4wcx2AkvM7EVghrtvih5zHXAl8Hiafa83uXsh1v7k17F7IeJGBOWWKFdlWJHJIe2RxV8CtwC58xiz3X0fQPR19MCFeUDuCT59Udu86PbY9pOY2Soz6zGzngMHDlTkCdSDkL0Q5Y4UCu150CFHIpNHaiMLM3sP8LK7bzWzdyS5S4E2L9J+cqP7WmAtQFdXV/i24ToVuheiEquVdMiRyOSS5jTUMuAKM3s3MAWYYWbfAPab2Rx332dmc4CXo+v7gPk59+8A9kbtHQXaJVJOEnq8q5UqXRlW01kitS21aSh3v83dO9x9IdnE9UZ3vxbYAFwXXXYdsD66vQG4xszazOyNZBPZW6KpqiNmtjRaBbUy5z7CxCShxxYSLBWgkhYeBE1nidSDauyzuBPoNrPrgV3A1QDuvsPMuoHngGHgRncfXaJzA3A/0E42sa3k9hhpboSLKyR49/sv5JYx7bOmtQUVHtR0lkh9sHIqgtaDrq4u7+npqXY36l7/wCDL7trIsaETo4gprU389NbLmDWt7aTpo1LXj7Vt9yGuvW8zRwZPbPqb3tbCN/7kd7goZrOhiKTHzLa6e9fYdtWGkqJKHWY0dpVU6OFH2vQnUh8ULBpISJ4gqdDDjELf/LXpT6Q+qDZUgwg9oChEyGFGo2/+hXIZcVR4UKT2KVg0gDSTxKOHGeXmFEYPM4p77Cs653HenBn07j5E5/zTYqvZ5gpZyqtltiITT8GiAVR6z0OucnIKaY5y0nxsEYmnnEUDSDNJPGtaGysu7shrW9HVERuEQo9hDZHmY4tIcQoWDSDNJHH/wCDf+tnuvLZvbdld9AzukNVQIdJ8bBEpTtNQDSKtJPGOvYcZyuQntIcyzo69h7n03LNOuj7NUY6W2YpUj0YWDaRQZdjxevXocFB7mqMcLbMVqR6NLKSoGe2F/xeJa4d0l8Jqma1IdShYTKBaWvKZtC+hx7COCq1qG/K30fneIhNPwWKC1NKSz5C+vHY8Q1uzMZiTt2hrtthjWNPuj4hUh3IWE6CWlnyG9qVjZjvWlH/+lDVZyaRy0tIjtfS3EZF4ChYToJaWfIb2pZykcsj5FOX8bdKogSUixWkaagLU0pLP0MKAEJZUDi09Evq30ZSVSHUkGlmY2eokbVJYrS35DCkMOCrpstw0Ry6ashKpnqQji+uAvxrT9qECbRKjVpZ8llMYEJKvVipnFJX0b5NmDSwRKa5osDCzPwI+ALzRzDbk/Gg60J9mxxpRLSz5TLswYDklykfvV+qaWprOE5lsSo0s/g+wDzgDuCen/QiwPa1OSXpC38zLKX+e1iiq3EAkIuNXNFi4+0vAS8AloQ9sZlOAHwNt0e95xN0/Y2afBf4UOBBd+il3fyy6z23A9UAG+Ki7/zBqvxi4H2gHHgNWe6MeHj4BQt7My536SWsUVc5ZGSIyfolyFmb2PuAu4CzAon/u7jOK3G0QuMzdB8ysFXjazB6PfvZFd//CmN9xHnANsBiYCzxpZue6ewa4F1gFPEM2WFwOPI6ULembea1N/Wg1lEh1JN1ncTdwhbuf6u4z3H16iUCBZw1E37ZG/4qNBpYDD7n7oLu/AOwElpjZHGCGu2+KRhPrgCsT9lvGqZZWcmk1lEj1JF0Ntd/dnw99cDNrBrYCi4D/4e6bzew/Ah8xs5VAD3Czux8E5pEdOYzqi9qGottj2wv9vlVkRyAsWLAgtLsSYyJWciVZbdV38Cg+MmbZ74hrNZTIBCi1Gup90c0eM/s28D2y00sAuPt3it0/mkLqNLPTgO+a2flkp5Q+R3aU8TmyifM/Jju1ddJDFGkv9PvWAmsBurq6lNOoop37jyTOKySdWpp6SnNejSqAwYwz9ZTmivZdRE5WamTxhzm3fwP8Xs73DhQNFq9f6H7IzH4EXJ6bqzCzrwDfj77tA+bn3K0D2Bu1dxRolwkSmif49PeeZd0zu17/fuUlC7h9+QUFrw1ZbfXa8QytzZZ3GFNrhYsaikhhpVZD/ZdyH9jMzgSGokDRDvwucJeZzXH3fdFl7wV+Ed3eADxoZmvIJrjPAba4e8bMjpjZUmAzsBL4crn9kjChS2d37j+SFygA1m3axcqlCwuOMEJWW009pbngqX0aWYikL+lqqC8VaD4M9Lj7+pi7zQEeiPIWTUC3u3/fzL5uZp1kRyYvAh8GcPcdZtYNPAcMAzdG01gAN3Bi6ezjaCXUhIkr0xGXJ+jdfajg9b27DxUMFiGrrSaiXLqIFJY0wT0F+G3g4ej79wM7gOvN7J3u/rGxd3D37cCbC7R/MO6XuPsdwB0F2nuA8xP2VSpo6inNHBvKfzM/NjQS+2m+c/5pQe0hG+1eL5eeEyySlEuvJbV0AJZIiKTBYhHZPRPDAGZ2L/APwH8Ank2pb1IDQj/NL5o9nZWXLGDdpvycRbEkd9LVVvW+g1t7RKSeJQ0W84CpZKeeiG7PjfIJWuTewDpmtpMZs1k+41700/ztyy9g5dKFqeyyrpWCjKHKKZsiUkuSBou7gd5oRZMBlwL/3cymAk+m1DepEWZG7mrl7PfFLZo9PXGQCP3Eneb53mlRxVypd4mChbt/1cweA5aQDRafcvfR5aufSKtzUn19B48ypaWZocyJkuZTWpor9iZXzifukDf/9b17uOWRbTRbExkf4fNXXVSVqZ9aK5siEqpouQ8z++3o678ju7ppN7AL+K2oTRpcuW9ySY8+DT0sKeTI1v6BQW7u7mVw2PnNUIbBYeem7t6qlAeppbIpIuUoNbK4iWz5jHsK/MyByyreI6kp5SSVQ6aVQoJR6Chkx95XGc5/aIZHsu2XnntmqadecfWabxGB0pvyVkVf3zkx3ZFalOYZ3CHBKHTe/9Wjxwv2Ma5dROIl3ZT3BrKjjAXuvsrMzgH+rbt/v8RdpUEkTSqXk8hNGoxCp8RmtLcGtY9KKyGupbNSz5KWKP874Djw1uj7PuC/pdIjqWvl5jhmTWvjovmnlTxQKWTef/HcU2kas3CrybLtcUJyIiFUXl3qXdKls29y9/8UncmNux+1JOsnZdJJe+Nc6Lx/c5MxkrOhsHls9MiR5l4ILZ2Vepc0WByPigE6gJm9iZxS5VIb0txPEPLYaSdyQ6bEQpb9pvmGrqWzUu+SBovPAD8A5pvZN4FlwIfS6pSES3M+vJzHTusMbkgeuELfoNN8Q6/3UiUi5l76jCAz+zrZGlBHgV8Dm939lZT7Ni5dXV3e09NT7W5MiP6BQZbdtTGv4N+U1iZ+eutl434zSvOxyxG6yW5D756T3qAreX2oWthNLlKMmW11966x7UlHFn8HvI1s4cB/Q7b0x4/d/a8q2EcpU5rTJ7U01z66yS67dyJbyPCm7t6iOYXQKbErOudx3pwZqdS1gnRHXCJpSlruY6OZ/W/gLcA7gf8KLAYULGpAOdMnaU3lpKncTXYhb9DVWt6qEYfUuqT7LJ4iW2l2E/AT4C3u/nKaHZPkQufDQ94Qa2uuPW7KtDLHrVerMqz2X0g9SDoNtR24mOwBRIeBQ2a2yd0LF/CRCZd0uqWcN8RaKVOxeO6pBc/gLrZvIkQ1ptxUulzqRaJNee7+Z+5+Kdkzs/vJ5jAOpdgvKUOSjW2hhftCHjtts6a1cc/VF9HW0sQbTmmmraWJe66+qGJ9Knc6L0nBxDjlvh4iEy3pNNRHgLeTHV28BHyN7HSU1JlaykGUI80EdJrTeXHq/fWQySPpNFQ7sAbYOnq0ailmNgX4MdAW/Z5H3P0zZnY68G1gIfAisMLdD0b3uQ24nuxSl4+6+w+j9ouB+6N+PAas9iRrfuUktZWDCJf2/H6a03mFVPr1UKJc0pJ0NdTny3jsQbLndg+YWSvwtJk9DrwPeMrd7zSzTwKfBG41s/OAa8iuspoLPGlm57p7BriXbKn0Z8gGi8uBx8vok1A7OYhQEzW/n2T1VCXzG5V6PZQolzQlLSQYzLMGom9bo38OLAceiNofAK6Mbi8HHnL3QXd/AdgJLDGzOcAMd98UjSbW5dxHylQLOYhQfQeP4iP5A0of8arM73fMbOfYcCav7dhwpuzpo/G+HipUKGlLLVgAmFmzmfUCLwNPuPtmYLa77wOIvp4VXT6P7El8o/qitnnR7bHthX7fKjPrMbOeAwcOVPS5SPVNPaWZwUx+sBjMOFNPaa5Kf8bOhCaZGQ1NiKd14qBIqKQ5i7JEU0idZnYa8F0zO7/I5YXKgXqR9kK/by2wFrLlPsJ6K7Vu7+HCb3x7Dx+t+E7rUvoOHqW9tYUjgydSeO2tLUWnoUKnidI6cVCkHKmOLEa5+yHgR2RzDfujqSWir6Ob+/qA+Tl36wD2Ru0dBdplknn1aOG1FXHt5UryaT70zTl0mij0ep3xLWlLbWRhZmcCQ+5+KCpv/rvAXcAG4Drgzujr+uguG4AHzWwN2QT3OcAWd8+Y2REzWwpsBlYCX06r39J4QlYIJf00H7qKKTQhnuaJgyLlSHMaag7wgJk1kx3BdLv7981sE9BtZtcDu4CrAdx9h5l1A88Bw8CN0TQWwA2cWDr7OFoJNSmVc0xqyFRO6GqrkD0f5ZRLLyeBrkKFkpbUgoW7bwfeXKC9H3hXzH3uAO4o0N5DttSITGKL586gpYm8YoItTdn2QkLf/EM/zaddY6ucBLpIWlJNcItU0qxpbaxZ0cknxpxnUWzqJ6690H1CPv2nXWOrnAS6SJoULKSuhLzhTj2lOe/QJoBjQyOxS21DPv2nXXRQq5uk1ihYSN1JOi//2vEMbc2Wtzejrdl47Xgm9j5J8xDlvJnXb2l4EQULqUMhBzdZk0FOsLAmq8gbeuibeT2XhhcBBQupM9kzuLfT3GRkRpzPX1W5T+flrIYKyUGUM22l1U1SKxQspG70Dwzy8Ye35R1+dPPD2yqaVA59Q0/6Zq4chNS7CdnBLVIJO/YezgsUAEMZZ8few0Xvl7RIX5pv6NphLfVOIwupI4XKhBVrD5N2Ulk5CKlnChZSN0I35ZUj7Tf00BxE6GFGOvxI0qJgIXUjdFPeeH5PLbzRplmlViSUNWoJga6uLu/p6al2NyQFk+HTc//AIMvu2pi3qXBKaxM/vfWygs859HqAnfuPpHKWudQ3M9vq7l1j2zWykLpTK5/8Ib3AVU6V2rjHKXT9p7/3LOue2fX69ysvWcDtyy+oQM+lUSlYSMNL6w09zWmf0KqzIaVNdu4/khcoANZt2sXKpQs1wpBYChbS0NJ6Qy9nR3aokKqzIaVNencfKvgYvbsPKVhILO2zkIYVetpciLTPvB6tOptrtOpsIa+XNskRV9qkc/5pBR8jrl0EFCykgZX7hp70WNVyDidKqmNmO0eH8o+LPTo0HPv4IZv+Fs2ezspLFuS1rbxkgUYVUpSmoaRhpV0ZNu3DicwM8DHfxwvZI3Lx2afz7Z/1Mfobus4+vTKdloalkYU0rNASGyHTVqHTRKH6Dh5lSkt+cnpKS3NFHn/0eQ4Oj3BseITB4cpNz0nj0shCGlpahQTTLgyY5qgodJmtCKQ4sjCz+Wb2j2b2vJntMLPVUftnzWyPmfVG/96dc5/bzGynmf3SzH4/p/1iM3s2+tmXrNR4XCRHGoUEyy0MmCQfkvv4bS1NvOGUZtpaKjcqCj1BUATSHVkMAze7+8/NbDqw1cyeiH72RXf/Qu7FZnYecA2wGJgLPGlm57p7BrgXWAU8AzwGXA48nmLfZRIKLSQYWkcqdBmvj/7X83MXhYSMiso5QVAktWDh7vuAfdHtI2b2PFBsgfty4CF3HwReMLOdwBIzexGY4e6bAMxsHXAlChaSgrQKCYbuyziRV3AgU/L6kNVZ5ZwgmLbJUMKl3k1IzsLMFgJvBjYDy4CPmNlKoIfs6OMg2UDyTM7d+qK2oej22PZCv2cV2REICxYsKHSJSElJy4mEjBTKKd8RehBT0tVZ5ZZir8ed8FI5qQcLM5sGPAp8zN1fNbN7gc+RHVd/DrgH+GMKH0rgRdpPbnRfC6yFbCHB8fdepLDQkUJowjr0+tHVWUcGT+zNGF2dNd4jYaG+d8JLZaS6dNbMWskGim+6+3cA3H2/u2fcfQT4CrAkurwPmJ9z9w5gb9TeUaBdJBVJktChG/5CE9ahCfRyNgkmTfzX8054qZzURhbRiqWvAs+7+5qc9jlRPgPgvcAvotsbgAfNbA3ZBPc5wBZ3z5jZETNbSnYaayXw5bT6LZNb0k/Q5SxtdcB9hEymCWck9rpRoZ/+09okWM6UWFI6m7x+pDmyWAZ8ELhszDLZu6NlsNuBdwJ/BuDuO4Bu4DngB8CN0UoogBuA+4CdwL+g5LakIOQT9Kxpbazo6shrW9HVUXRp683dvRzPwGBmhOMZuKm7N9ES2iSf/svZJJh0GW9o6ZEQOpu8fqS5GuppCucbHitynzuAOwq09wDnV653IicL+QTdPzBId09fXlt3Tx+r33VuwTe6HXtfzTsOFrLHw+7Y+yqXnnvmuPseOg0VmoMILT0SQmeT1weV+xCJhEyJhM61v3r0eFB7OZJOQ4XmINIsPTIq6QhKqkfBQhpe6K7pJFMioXPtM9pbg9pDhUxDhQY65RUEVBtKGtz63j3c8sg2mq2JjI/w+asuKjrdknRKJHSvwuK5p9LabAzlbIRrbTYWzz11fE8wEvKGHvrmP/pcP/HIdpqbjMyIK68wCWlkIQ1rNKk8OOz8ZijD4LBXNKl8Rec8fnrrZXzjT36Hn956WdEgNGtaG/dcfVHe0tl7rr6oYm+4IaOicpLKJ0qPnPiukpKO/qR6NLKQhlVuUjlkp3LS3d6QfiL3is55nDdnBr27D9E5/7SihxmF9CW09Eio7OjvxKjl81dpB3ctUrCQBhb3CTj+k3HapSdCgguEBa7QviftS5olzfsHBvn4w9vypudufnibdnDXIE1DScMazRPkKpYnSHOncjnW9+5h2V0bufa+zSy7ayMbevfEXltO35NO/aRZ0nzH3sN5gQJgKOPs2Ht43I8tlaVgIQ0rNE9QTumJtObay1neGtL39b17uOQvnmTF/9zEJX/xZNFA9M//eiSoPUzcfo3i+ziU45h4moaShhYyNx+6SijNKavQqZ+QvvcPDPKxh3qjybjsfVY/1Bs79fNS/2sF+xLXnvt7Sv3dF8+dQUsTebmllqZsexxVqa0OjSyk4SVd3RSySijtKavQqZ+Qvm/6l/6TsjYetRdy9qypQe2QfApt1rQ21qzopK3FeENrM20txpoVnRU5EVAqSyMLkRxJRyJpFteD8k6zi1sNNfYT/isDxwreP679kjfNCmoPLTsesoor7b+7xFOwEBkjySqhtHc1l3OaXaHpGYeT2t626Ezg+ZPun20vrNCGwjihb+gh00ppFjWU4jQNJXWnFpKbaVdLDX38QtMzn3hkG7c8cvKUzcypp7DykvyTJFdesiD2E31obajQ/EnotNLYIoaVLGoo8TSykLpSS8nNidhkl/TxC32ab7amkxYVjX7Cv/js03loyy6M7NkaXWefHvvY5ZQHecvZM/nJzhM5kLecPTP2+NjhTP5jD2dGih43O6WlmaHMidHFaODSNFS6NLKQulGLyc20q6UmffxC0zODwxkyI2P2MIxkk+S3Pro972yNYn/H0FHOzv1H8gIFwE929rNz/8lLbYeGMwV32Q8NF87NqKhh9ShYSN3QEZzFjZ2OaWoyPv2H5530Jv/a8Uzw3zGkDlbv7kOJ21/s/03Ba+Pay53+q4Wpy3qnaSipG/pUGS9ueub8uafy01svy5vK6h8YLOvvmLQ8yMJZb0jc3jn/tILXxrVD+PRfaOVhKUwjC6kbOoIzXrFAOnYqa/TvmLuzvZJ/x9aW5oJlVlpbTt4jsmj29KBke6hyKw/LyTSykLqiIzgLCz1f40TJ8fzjUiuhY2Y7zU35S22biyz7vX35BaxcujDRPgsIW+SQ9nG2k0lqwcLM5gPrgN8iW1Ngrbv/lZmdDnwbWAi8CKxw94PRfW4DridbB/mj7v7DqP1i4H6gnewZ3qs97sxIaXihlVsni6SBNO2S46GBC7IjjCSjidANf+VUHg6p9FvO9fUqzZHFMHCzu//czKYDW83sCeBDwFPufqeZfRL4JHCrmZ0HXAMsBuYCT5rZue6eAe4FVgHPkA0WlwOPp9h3kbqUJJBOxC7otEaAoX0PPaEwdGl2LS3lTltqOQt33+fuP49uHyG7ZXQesBx4ILrsAeDK6PZy4CF3H3T3F4CdwBIzmwPMcPdN0WhiXc59RCRQuQsFQlcUpbGsuJw9H0krD4cuza7FpdxpmpCchZktBN4MbAZmu/s+yAYUMzsrumwe2ZHDqL6obSi6Pba90O9ZRXYEwoIFCwpdIjLplTNNNHqa3WiGo1qn2ZXT97TqfU22OlWpBwszmwY8CnzM3V8tsjW/0A+8SPvJje5rgbUAXV1dymmIxAg9VvWm7m15G/z+rLt6p9ld0TmPuadO4ce/eoVLzzmDrjcWLmiYK2m9r2NjNgMeG87Ejlom21LuVJfOmlkr2UDxTXf/TtS8P5paIvr6ctTeB8zPuXsHsDdq7yjQLiLjkHSaaNO/9J+0Ezwz4rElzdP26e89y1V/+wxf2riTq/72GT69/tmKPfbYdTPF1tFMtqXcaa6GMuCrwPPuvibnRxuA64A7o6/rc9ofNLM1ZBPc5wBb3D1jZkfMbCnZaayVwJfT6reI5HslZg4+rj1NO/cfYd0zu/La1m3axcqlC8e9N6Pv4FHaW1s4MnhiY2N7a0vRaaXJtJQ7zWmoZcAHgWfNrDdq+xTZINFtZtcDu4CrAdx9h5l1A8+RXUl1Y7QSCuAGTiydfRythBKZMG9bdEZQe5qKlRIZb7Aod1ppsizlTi1YuPvTxB+k+66Y+9wB3FGgvQc4v3K9E5GkRndZr9t04hN9JXdZhyinPEhS5STPJxPt4BaRkm5ffgFXXDg3KKmchkWzp/P2RbPyqtq+fdGsigWucqaVtClPRCSSu/ls7U9+nWjzWRpvov0Dg/zspYN5bT976SD9A4MVPXgq6WNNpk15ChYiUlR4iY303kRraW9DOX+XeqaqsyJSVOg5ImnubK6lvQ3lnq9Sr2drKFiISFGhb9BpHlI1EXsbkr6ZlxO41vfuYdldG7n2vs0su2sjG3r3VKTPE0HTUCJSVOgqobQ//ae5tyFk+iz071Lv01YKFiJSUsgb9EQsQU1jb0M5b+Yhf5dayreUQ8FCRBIJeYOux53N5b6ZJ/271FK+pRzKWYhIKtIoUZ6mtN/M672WlEYWIiJMzPRZPY64RilYiIhEJuLNvF5rSWkaSkREStLIQkQkMpnKd4TSyEJEhMY4UzvN3eEaWYiIUP/7INIeFWlkISJCfe+DmIhRkYKFiAj1vQ8izXpcozQNJSISqdd9EBMxKtLIQkQkR73tPIeJGRWlNrIws68B7wFedvfzo7bPAn8KHIgu+5S7Pxb97DbgeiADfNTdfxi1XwzcD7QDjwGr3d3T6reISD1Ke1SU5sjifuDyAu1fdPfO6N9ooDgPuAZYHN3nb8ysObr+XmAVcE70r9BjiohMemmOilILFu7+Y+D/Jbx8OfCQuw+6+wvATmCJmc0BZrj7pmg0sQ64MpUOi4hIrGrkLD5iZtvN7GtmNjNqmwfszrmmL2qbF90e216Qma0ysx4z6zlw4EDcZSIiEmiig8W9wJuATmAfcE/UbgWu9SLtBbn7WnfvcveuM888c5xdFRGRURMaLNx9v7tn3H0E+AqwJPpRHzA/59IOYG/U3lGgXUREJtCEBosoBzHqvcAvotsbgGvMrM3M3kg2kb3F3fcBR8xsqZkZsBJYP5F9FhGRdJfOfgt4B3CGmfUBnwHeYWadZKeSXgQ+DODuO8ysG3gOGAZudPdM9FA3cGLp7OPRv5K2bt36ipm9VKGnU8oZwCsT9LuqabI8T5g8z3WyPE+YPM91vM/z7EKNpi0L42dmPe7eVe1+pG2yPE+YPM91sjxPmDzPNa3nqR3cIiJSkoKFiIiUpGBRGWur3YEJMlmeJ0ye5zpZnidMnueayvNUzkJERErSyEJEREpSsBARkZIULAJE9axeNrNf5LSdbmZPmNmvoq8ziz1GvYh5rp81sz1m1hv9e3c1+1gJZjbfzP7RzJ43sx1mtjpqb7jXtchzbajX1cymmNkWM9sWPc8/j9ob8TWNe64Vf02VswhgZpcCA8C6nDM67gb+n7vfaWafBGa6+63V7GclxDzXzwID7v6FavatkqKqAnPc/edmNh3YSray8YdosNe1yHNdQQO9rlG1h6nuPmBmrcDTwGrgfTTeaxr3XC+nwq+pRhYBYsquLwceiG4/QIOUUA8sMV+33H2fu/88un0EeJ5sZeOGe12LPNeG4lkD0bet0T+nMV/TuOdacQoW4zc7qmFF9PWsKvcnbYVKzDcEM1sIvBnYTIO/rmOeKzTY62pmzWbWC7wMPOHuDfuaxjxXqPBrqmAhIeJKzNc9M5sGPAp8zN1frXZ/0lTguTbc6xpVt+4kW6l6iZmdX+UupSbmuVb8NVWwGL/9o9V0o68vV7k/qSlSYr6uRXO9jwLfdPfvRM0N+boWeq6N+roCuPsh4Edk5/Ab8jUdlftc03hNFSzGbwNwXXT7Ohq4hHqREvN1K0oQfhV43t3X5Pyo4V7XuOfaaK+rmZ1pZqdFt9uB3wX+mcZ8TQs+1zReU62GCpBbdh3YT7bs+veAbmABsAu42t3rPjEc81zfQXZY+3qJ+dE54HplZm8DfgI8C4xEzZ8iO5ffUK9rkef6RzTQ62pmF5JNYDeT/UDc7e63m9ksGu81jXuuX6fCr6mChYiIlKRpKBERKUnBQkRESlKwEBGRkhQsRESkJAULEREpScFCRERKUrAQEZGSWqrdAZFGERXn+wHZMtFLgW3A3wF/TrZo3X8G3k22Zs88YD5wt7t/xcyagL8G/j3wAtkPcl9z90cm+GmIFKRgIVJZi4CrgVXAz4APAG8DriC7W7oXuJBsMJkK/JOZ/T3wVmAhcAHZwPI88LWJ7bpIPE1DiVTWC+7+bFTAbQfwlGfLJDxLNhgArHf3o+7+CvCPZIu8vQ142N1H3P1fo3aRmqFgIVJZgzm3R3K+H+HESH5sjR0HLOV+iYyLgoXIxFsenZ08i2xxxp+RzXO838yazGx21C5SM5SzEJl4W4C/J1v99HPuvtfMHgXeRbaU9P8lW/X2cPW6KJJPVWdFJpCZfRYYcPcvFPjZNHcfiEYcW4BlUf5CpOo0shCpHd+PDrI5heyIQ4FCaoZGFiIiUpIS3CIiUpKChYiIlKRgISIiJSlYiIhISQoWIiJS0v8HXD+c/sWPZXUAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "ax = df.plot(x='mpg', y='weight', kind='scatter')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Of course, we can customize our subplot with the `ax` object it returned, as usual."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAEICAYAAABYoZ8gAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAWwElEQVR4nO3dfbRddX3n8feHiAWEFpALRiBGLdVaR4O9Yru0UxS0CKPATFFpq8GlBp06lVVnxsg4Q+yq00zHx1GXGivLCOqIgyD1YWxkRIdWwYARcYILq1EgmSQ8DcQ6UOA7f+x9p6fJvbknyd3n5N79fq111t2PZ39/bPK5+/72U6oKSVJ/HDDuAiRJo2XwS1LPGPyS1DMGvyT1jMEvST1j8EtSzxj82u8luSbJa8ddx/4gyflJ3juG7R6TZGOSXxj1tjX3DH7NqSSbkvw8yY6Bz+PHXddCkOTRwNuA/zzqbVfVVuBrwIpRb1tzz+BXF15SVYcOfDaPu6AF4kzglqq6Y0zb/yRw/pi2rTlk8Gsk2r8ETh0YX5Xk0oHx30jyN0nuTfLdJCcP+b0HJFmZ5G+T3JXksiRHDsx/3sD33pbkvHb6LyX5RJLtSX6S5G1JDmjnnZfk2iTvTHJPkh8nefHAdz4+yVVJ7k7ywySv26ldn01yaZL7k3wvya8keWuSbW0NL2qXPSfJDTu1581JrpyhuS8Gvj6w7NIkleTV7ffek+T1SZ6d5Ka2zR8YWP68JH+d5P1J/k+SW5KcMjD/iUm+0db91SQfHNxHwHXAk5I8YZh9o/2Xwa+xS3Is8EXgT4EjgX8NXJ5kYojV/wg4C/ht4PHAPcAH2+9dAnwZeD8wASwDNrTrvR/4JeBJ7bqvAl498L3PAX4AHAX8OfCxJGnnfRq4vd3e7wL/cTBAgZcAlwBHAN8BvkLzb+1Y4E+Aj7TLXQU8McmvDqz7B+260/knbU07ew5wAvBy4L3AvwNOBX4NeFmS395p2R+17boI+NzAL8pPAdcDjwVWAa8c3EhVPQT8EHjmDPVpvqgqP37m7ANsAnYA97afKwemnzqw3Crg0nb4LcAlO33PV4Dl7fA1wGtn2N5G4JSB8cXA3wOPAt4KXDHNOouAB4CnDUw7H7imHT4P+OHAvEOAAh4HHA88DBw2MP/PgI8PtGvdwLyXtP89FrXjh7XfdXg7/iHgHe3wr9H84vqFGdp6K3DawPjS9ruOHZh2F/DygfHLgQsG2rUZyMD862kCfgnwEHDIwLxLp/bRwLS/Bl417v/P/OzbxyN+deGsqjq8/Zw1xPJPAM5puybuTXIv8DyaEB9m3SsG1ttIE8zH0IT0306zzlHAo4GfDEz7Cc0R+ZT/PTVQVX/XDh5Kc5R/d1Xdv5t1tw4M/xy4s6oeHhif+i6AtcDvtX9NvBK4rKoemKGt99D84tjZztvbefzQgfE7qk3wgdofP9CuvxuYd9s02zqM5he65jGDX6PyM5oj5ymPGxi+jeaI//CBz2OqavUQ33sb8OKd1j2omhOgtwFPnmadO2n+Khjsq14CDHPSdDNwZJLBAB523V1U1beAB4HfAn6Pmbt5AG4CfmVvtjPg2IEuK2hq3wxsoWnX4D46fnDFJI8Cfhn47j7WoDEz+DUqG4BXJDkwySRN3/iUS4GXJPmdJIuSHJTk5CTHDfG9HwbeMXXCMclEkjPbeZ8ETk3ysiSPSvLYJMvao+/L2vUOa9f947aO3aqq24C/Af6srfMZwGvabe2tTwAfAB6qqmt3s9yXaM5H7IujgT9q98M5wK8CX6qqnwDrgVVJHp3kN2m6qQadBGxql9U8ZvBrVP49zdH3PcDbaU4kAv8/TM8ELgS20xyp/xuG+//zfTQnSf8qyf3At2hOYFJVPwVOB94M3E3zy2fqxOS/ovkr5EfAtW09Fw/ZlnNp+tc3A1cAF1XVuiHXnc4lwNPZ/dE+wF8CT93H+yKuozkRfCfwDuB3q+qudt7vA79Jc57gT4HP0JwLYWD+h/dh29pP5B9390katSQHA9uAZ1XVrbMsu4LmpPQFe7Gd82hOkj9vyOU/Q3PfwEVJjqa5lPTEqvq/e7pt7V8eNe4CJPEG4NuzhT5AVa3pqogkz6b5y+jHwIto/gpb3W53G023kBYAg18aoySbgNDcizBujwM+R3Md/+3AG6rqO+MtSV2wq0eSesaTu5LUM/Oiq+eoo46qpUuXjrsMSZpXbrjhhjurapdHn8yL4F+6dCnr168fdxmSNK8kmfaeC7t6JKlnDH5J6hmDX5J6xuCXpJ4x+CWpZwx+SeoZg1+Sesbgl6SeMfglqWfmxZ2789XSlV8cy3Y3rT5jLNuVND94xC9JPdN58LfvUP1Oki+040cmWZfk1vbnEV3XIEn6B6M44n8TsHFgfCVwdVWdAFzdjkuSRqTT4E9yHHAG8BcDk88E1rbDa9k/3jwkSb3R9RH/e4F/CzwyMO2YqtoC0P48eroVk6xIsj7J+u3bt3dcpiT1R2fBn+SfAduq6oa9Wb+q1lTVZFVNTkzs8h4BSdJe6vJyzucCL01yOnAQ8ItJLgW2JllcVVuSLAa2dViDJGknnR3xV9Vbq+q4qloKvAL4H1X1B8BVwPJ2seXA57uqQZK0q3Fcx78aeGGSW4EXtuOSpBEZyZ27VXUNcE07fBdwyii2K0nalXfuSlLPGPyS1DMGvyT1jMEvST1j8EtSzxj8ktQzBr8k9YzBL0k9Y/BLUs8Y/JLUMwa/JPWMwS9JPWPwS1LPGPyS1DMGvyT1jMEvST3T5cvWD0pyfZLvJvl+kre301cluSPJhvZzelc1SJJ21eUbuB4AXlBVO5IcCFyb5MvtvPdU1Ts73LYkaQadBX9VFbCjHT2w/VRX25MkDafTPv4ki5JsALYB66rqunbWG5PclOTiJEfMsO6KJOuTrN++fXuXZUpSr3Qa/FX1cFUtA44DTkrydOBDwJOBZcAW4F0zrLumqiaranJiYqLLMiWpV0ZyVU9V3QtcA5xWVVvbXwiPAB8FThpFDZKkRpdX9UwkObwdPhg4FbglyeKBxc4Gbu6qBknSrrq8qmcxsDbJIppfMJdV1ReSXJJkGc2J3k3A+R3WIEnaSZdX9dwEnDjN9Fd2tU1J0uy8c1eSesbgl6SeMfglqWcMfknqGYNfknrG4JeknjH4JalnDH5J6hmDX5J6xuCXpJ4x+CWpZwx+SeoZg1+Sesbgl6SeMfglqWcMfknqmS5fvXhQkuuTfDfJ95O8vZ1+ZJJ1SW5tfx7RVQ2SpF11ecT/APCCqnomsAw4LclvACuBq6vqBODqdlySNCKdBX81drSjB7afAs4E1rbT1wJndVWDJGlXXb5snfZF6zcAvwx8sKquS3JMVW0BqKotSY6eYd0VwAqAJUuWdFmm5sjSlV8c27Y3rT5jbNuW5ptOT+5W1cNVtQw4DjgpydP3YN01VTVZVZMTExOd1ShJfTOSq3qq6l7gGuA0YGuSxQDtz22jqEGS1Ojyqp6JJIe3wwcDpwK3AFcBy9vFlgOf76oGSdKuuuzjXwysbfv5DwAuq6ovJPkmcFmS1wA/Bc7psAZJ0k46C/6qugk4cZrpdwGndLVdSdLueeeuJPWMwS9JPWPwS1LPGPyS1DOd3rmr8RjnHbR9493Kmo884peknjH4JalnDH5J6hmDX5J6xpO70jw1rhPLnlSe/zzil6SeMfglqWcMfknqGYNfknpmqODfk1cmSpL2b8Me8X84yfVJ/uXUW7UkSfPTUMFfVc8Dfh84Hlif5FNJXri7dZIcn+RrSTYm+X6SN7XTVyW5I8mG9nP6PrdCkjS0oa/jr6pbk7wNWA/8F+DEJAEurKrPTbPKQ8Cbq+rGJIcBNyRZ1857T1W9c1+LlyTtuaGCP8kzgFcDZwDrgJe0gf544JvALsFfVVuALe3w/Uk2AsfOVeGSpL0zbB//B4AbgWdW1R9W1Y0AVbUZeNtsKydZSvP+3evaSW9MclOSi5McsedlS5L21rDBfzrwqar6OUCSA5IcAlBVl+xuxSSHApcDF1TVfcCHgCcDy2j+InjXDOutSLI+yfrt27cPWaYkaTbDBv9XgYMHxg9pp+1WkgNpQv+TU+cBqmprVT1cVY8AHwVOmm7dqlpTVZNVNTkxMTFkmZKk2Qwb/AdV1Y6pkXb4kN2t0J74/RiwsarePTB98cBiZwM3D1+uJGlfDXtVz8+SPGuqbz/JrwM/n2Wd5wKvBL6XZEM77ULg3CTLgAI2AefvYc2SpH0wbPBfAHw2yeZ2fDHw8t2tUFXXAplm1peGrk6SNOeGCv6q+naSpwJPoQnzW6rq7zutTJLUiT15EcuzgaXtOicmoao+0UlVkqTODHsD1yU0l2BuAB5uJxdg8EvSPDPsEf8k8LSqqi6LkSR1b9jLOW8GHtdlIZKk0Rj2iP8o4H8luR54YGpiVb20k6okSZ0ZNvhXdVmEJGl0hr2c8+tJngCcUFVfbZ/Ts6jb0iRJXRj21YuvA/4b8JF20rHAlR3VJEnq0LAnd/+Q5hEM90HzUhbg6K6KkiR1Z9jgf6CqHpwaSfIomuv4JUnzzLDB//UkFwIHt+/a/Szwl92VJUnqyrDBvxLYDnyP5mmaX2KIN29JkvY/w17VM/XSlI92W44kqWvDPqvnx0zTp19VT5rziiRJndqTZ/VMOQg4Bzhy7suRJHVtqD7+qrpr4HNHVb0XeEG3pUmSujBsV8+zBkYPoPkL4LBZ1jme5rHNjwMeAdZU1fuSHAl8hubZ/puAl1XVPXtcuSRprwzb1fOugeGHaAN7lnUeAt5cVTcmOQy4Ick64Dzg6qpanWQlzRVDb9mjqiVJe23Yq3qev6dfXFVbgC3t8P1JNtI86uFM4OR2sbXANRj8kjQyw3b1/PHu5lfVu2dZfylwInAdcEz7S4Gq2pJk2kc/JFkBrABYsmTJMGVKkoYw7A1ck8AbaI7YjwVeDzyNpp9/tr7+Q4HLgQuq6r5hC6uqNVU1WVWTExMTw64mSZrFnryI5VlVdT9AklXAZ6vqtbtbKcmBNKH/yar6XDt5a5LF7dH+YmDb3pUuSdobwx7xLwEeHBh/kOaqnBklCfAxYONOXUFXAcvb4eXA54esQZI0B4Y94r8EuD7JFTR38J5Nc6nm7jwXeCXwvSQb2mkXAquBy5K8Bvgpzc1gkqQRGfaqnnck+TLwW+2kV1fVd2ZZ51ogM8w+ZfgSJUlzadiuHoBDgPuq6n3A7Ume2FFNkqQODfvqxYtorrV/azvpQODSroqSJHVn2CP+s4GXAj8DqKrNzHIZpyRp/zRs8D9YVUX7aOYkj+muJElSl4YN/suSfAQ4PMnrgK/iS1kkaV6a9aqe9nr8zwBPBe4DngL8h6pa13FtkqQOzBr8VVVJrqyqXwcMe0ma54bt6vlWkmd3WokkaSSGvXP3+cDrk2yiubInNH8MPKOrwiRJ3dht8CdZUlU/BV48onokSR2b7Yj/Spqncv4kyeVV9S9GUJMkqUOz9fEPPmvnSV0WIkkajdmCv2YYliTNU7N19TwzyX00R/4Ht8PwDyd3f7HT6iRJc263wV9Vi0ZViCRpNPbkscySpAWgs+BPcnGSbUluHpi2KskdSTa0n9O72r4kaXpdHvF/HDhtmunvqapl7edLHW5fkjSNzoK/qr4B3N3V90uS9s6wj2yYS29M8ipgPfDmqrpnuoWSrABWACxZsmSE5Wk+Wrryi+MuQZo3Rn1y90PAk4FlwBbgXTMtWFVrqmqyqiYnJiZGVJ4kLXwjDf6q2lpVD1fVIzQvcjlplNuXJI04+JMsHhg9G7h5pmUlSd3orI8/yaeBk4GjktwOXAScnGQZzeMfNgHnd7V9SdL0Ogv+qjp3mskf62p7kqTheOeuJPWMwS9JPWPwS1LPGPyS1DMGvyT1jMEvST1j8EtSzxj8ktQzBr8k9cw4HsssaR4b5yOwN60+Y2zbXkg84peknjH4JalnDH5J6hmDX5J6xuCXpJ4x+CWpZzoL/iQXJ9mW5OaBaUcmWZfk1vbnEV1tX5I0vS6P+D8OnLbTtJXA1VV1AnB1Oy5JGqHOgr+qvgHcvdPkM4G17fBa4Kyuti9Jmt6o+/iPqaotAO3Po2daMMmKJOuTrN++ffvICpSkhW6/PblbVWuqarKqJicmJsZdjiQtGKMO/q1JFgO0P7eNePuS1HujDv6rgOXt8HLg8yPeviT1XpeXc34a+CbwlCS3J3kNsBp4YZJbgRe245KkEersscxVde4Ms07papuSpNnttyd3JUndMPglqWcMfknqGYNfknrGd+5KmjfG9b7fhfauX4/4JalnDH5J6hmDX5J6xuCXpJ5Z8Cd3x3UySJL2Vx7xS1LPGPyS1DMGvyT1jMEvST1j8EtSzxj8ktQzY7mcM8km4H7gYeChqpocRx2S1EfjvI7/+VV15xi3L0m9ZFePJPXMuIK/gL9KckOSFWOqQZJ6aVxdPc+tqs1JjgbWJbmlqr4xuED7C2EFwJIlS8ZRoyQtSGM54q+qze3PbcAVwEnTLLOmqiaranJiYmLUJUrSgjXy4E/ymCSHTQ0DLwJuHnUdktRX4+jqOQa4IsnU9j9VVf99DHVIUi+NPPir6kfAM0e9XUlSw8s5JalnDH5J6hmDX5J6xuCXpJ4x+CWpZwx+SeoZg1+Sesbgl6SeMfglqWcMfknqmXG+gUuS5oWlK784tm1vWn3GnH+nR/yS1DMGvyT1jMEvST1j8EtSzxj8ktQzBr8k9cxYgj/JaUl+kOSHSVaOowZJ6qtxvGx9EfBB4MXA04Bzkzxt1HVIUl+N44j/JOCHVfWjqnoQ+K/AmWOoQ5J6aRx37h4L3DYwfjvwnJ0XSrICWNGO7kjygxHUBnAUcOeItjVOfWkn9KetfWkn9Kit+U/71NYnTDdxHMGfaabVLhOq1gBrui/nH0uyvqomR73dUetLO6E/be1LO8G27qtxdPXcDhw/MH4csHkMdUhSL40j+L8NnJDkiUkeDbwCuGoMdUhSL428q6eqHkryRuArwCLg4qr6/qjr2I2Rdy+NSV/aCf1pa1/aCbZ1n6Rql+51SdIC5p27ktQzBr8k9Uyvgz/JxUm2Jbl5YNqRSdYlubX9ecQ4a5wLM7RzVZI7kmxoP6ePs8a5kOT4JF9LsjHJ95O8qZ2+EPfpTG1dUPs1yUFJrk/y3badb2+nL8R9OlNb53yf9rqPP8k/BXYAn6iqp7fT/hy4u6pWt88ROqKq3jLOOvfVDO1cBeyoqneOs7a5lGQxsLiqbkxyGHADcBZwHgtvn87U1pexgPZrkgCPqaodSQ4ErgXeBPxzFt4+namtpzHH+7TXR/xV9Q3g7p0mnwmsbYfX0vxjmtdmaOeCU1VbqurGdvh+YCPNneILcZ/O1NYFpRo72tED20+xMPfpTG2dc70O/hkcU1VboPnHBRw95nq69MYkN7VdQfP+T+VBSZYCJwLXscD36U5thQW2X5MsSrIB2Aasq6oFu09naCvM8T41+PvrQ8CTgWXAFuBdY61mDiU5FLgcuKCq7ht3PV2apq0Lbr9W1cNVtYzmLv+Tkjx9zCV1Zoa2zvk+Nfh3tbXtP53qR9025no6UVVb2//JHgE+SvPU1Hmv7Ru9HPhkVX2unbwg9+l0bV2o+xWgqu4FrqHp816Q+3TKYFu72KcG/66uApa3w8uBz4+xls5M/aNpnQ3cPNOy80V7cuxjwMaqevfArAW3T2dq60Lbr0kmkhzeDh8MnArcwsLcp9O2tYt92verej4NnEzziNetwEXAlcBlwBLgp8A5VTWvT4zO0M6Taf50LGATcP5Un+l8leR5wP8Evgc80k6+kKbve6Ht05naei4LaL8meQbNydtFNAeql1XVnyR5LAtvn87U1kuY433a6+CXpD6yq0eSesbgl6SeMfglqWcMfknqGYNfknrG4JeknjH4Jaln/h93rpX/KuFTqwAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "ax = df.mpg.plot(kind='hist')\n",
    "ax.set_title('Fuel economy (mpg)')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## [Assignment 2]\n",
    "\n",
    "In this assignment you will plot a Hertzprung-Russell (H-R) diagram which shows stellar magnitude over color and can tell us about the populations of stars we are observing (see image below).\n",
    "\n",
    "In the data we imported in assignment 1 (choose any import link / method you'd like), you will find DataFrame columns for G-band magnitude (`Gmag`), and color as magnitude of the blue part (`BPmag`), and magnitude of the red part (`RPmag`).\n",
    "\n",
    "On H-R diagrams (see image), we want absolute magnitude on the y-axis with the brightest stars at the top (**note**: low stellar magnitudes are brighter). We also want to plot color as `BPmag` - `RPmag` on the x-axis.\n",
    "\n",
    "In our DataFrame, we only have an apparent magnitude, so we will need to derive the absolute magnitude before we can make our plot.\n",
    "\n",
    "We can compute *Absolute magnitude* ($M$)  if we know the *apparent magnitude* ($m$) and the *distance modulus* ($\\mu$) by:\n",
    "\n",
    "$$\\mu = m - M$$\n",
    "\n",
    "The *distance modulus* can be calculated from the distance to a star in *parsecs* (pc):\n",
    "\n",
    "$$\\mu = 5log_{10}\\left(\\frac{d_{pc}}{10 \\ pc}\\right)$$\n",
    "\n",
    "Finally, we can determine a star's distance in parsecs from the measured parallax in milliarcseconds (mas) by:\n",
    "\n",
    "$$d_{pc} = \\frac{1000}{parallax}$$\n",
    "\n",
    "Note parallax is `Plx` in our DataFrame. Now we know all we need to make an H-R diagram and identify some stars!\n",
    "\n",
    "![HR diagram](https://upload.wikimedia.org/wikipedia/commons/6/6b/HRDiagram.png)\n",
    "\n",
    "\n",
    "### 2a) Convert apparent magnitude to absolute magnitude\n",
    "\n",
    "Using the equations in the question, derive the absolute G magnitude ($M$) from the apparent G magnitude for each star in the DataFrame. Hint: you will need columns `Gmag`, `BPmag`, `RPmag`, and `Plx`.\n",
    "\n",
    "In each step, save the result as a new column in your DataFrame, for example:\n",
    "\n",
    "`df['distance_modulus'] = # calculation for distance modulus here`\n",
    "\n",
    "Your final DataFrame should have columns for `distance_modulus`, `distance_parsecs`, and `absolute_magnitude`.\n",
    "\n",
    "\n",
    "### 2b) Calculate the color\n",
    "\n",
    "With the data provided, we can find color using blue part (`BPmag`) - red part (`RPmag`). Save this `color` difference as a column of your DataFrame.\n",
    "\n",
    "\n",
    "### 2c) Plot the HR diagram\n",
    "\n",
    "Using pandas `DataFrame.plot(x='color', y='absolute_magnitude')`, plot the H-R diagram. Make sure to label your axes!\n",
    "\n",
    "### 2d) Identify and select the white dwarfs\n",
    "\n",
    "White dwarfs on the H-R diagram are outliers in the lower left. \n",
    "\n",
    "Using a **conditional** statement (or two) on absolute magnitude and color, select all white dwarfs and save them to their own DataFrame. Call `.describe()` on this new DataFrame to get summary statistics of the white dwarfs.\n",
    "\n",
    "\n",
    "### 2e) Identify and select the giants\n",
    "\n",
    "Giants on the H-R diagram are outliers in the top middle to right section of the diagram.\n",
    "\n",
    "Using a **conditional** statement (or two) on absolute magnitude and color, select all giants and save them to their own DataFrame. Call `.describe()` on this new DataFrame to get summary statistics of the giants.\n",
    "\n",
    "### 2f) Plot the white dwarfs and giants separately\n",
    "\n",
    "Using your white dwarf and supergiant DataFrames, plot each on the same axes as your H-R diagram with distinct markers and colors (hint: try changing the `markerstyle`, `color`, `markersize` options of `.plot()`).\n",
    "\n",
    "To plot all on the same axes, make a subplot and pass in ax to each `.plot` call, e.g.\n",
    "\n",
    "```Python\n",
    "fig, ax = plt.subplots()\n",
    "df.plot(ax=ax)\n",
    "df2.plot(ax=ax)\n",
    "...\n",
    "```\n",
    "\n",
    "### 2g) Write the white dwarfs and giants to CSVs\n",
    "\n",
    "Write your white dwarf and giants DataFrames to CSV files with `.to_csv()`, using the `index=False` option. Include these CSV files in the assignment 5 directory and in your commit / PR.\n",
    "\n",
    "\n",
    "### 2h) BONUS: Label the white dwarfs and giants\n",
    "\n",
    "Research the `ax.annotate()` method of matplotlib axes and label the white dwarf and supergiant points on your plot as \"white dwarf\" and \"supergiant\" with arrows pointing to each. As always, bonus points for any other plot aesthetics you add!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# [your code here]\n"
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "33800beb71e4bcd6bcba11240441ea568cc35964f729435babc1b5d8f2854e0f"
  },
  "kernelspec": {
   "display_name": "Python 3.9.7 64-bit ('astropy': conda)",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}