Assignment 2 (Sep 28)

Today’s topics will include:

  1. Python dictionary (dict) objects

  2. Python for loops

  3. Plotting in matplotlib using Axes objects

  4. Simple polynomial fitting using np.polyfit

Readings (optional)

If you find this week’s material new or challenging, you may want to read through some or all the following resources while working on your assignment:

Python dictionary (dict) objects

In week 1, we talked about passing the g value into the free_fall_h_grav function so one can get the free fall distance on different planets in our Solar System.

However, what if we want to let the user supply the name of the planet instead of its g value?

In this section, we will use the Python dictionary (dict or in the form of {}) to do the job.

WTD?!

A dictionary is Python’s version of a “key-value” store data type, meaning we can access a value in a dict by supplying its key (or “name”).

The following cell defines a new dictionary where each planet name is a key and each g value is… well… a value!

Notice the new syntax:

  • Curly braces {}

  • Key-value pairs separated by :

  • Different items separated by , (new line is optional but makes it easier to read)

planet_g_dict = {
    # key    : value
    'mercury': 3.61,
    'venus'  : 8.83,
    'earth'  : 9.80665,
    'mars'   : 3.75,
    'jupiter': 26.0,
    'saturn' : 11.2,
    'uranus' : 10.5,
    'neptune': 13.3,
}

We can access a value by giving its associated key in square brackets []:

print(planet_g_dict['mars'])
3.75

Now you don’t have to memorize or copy/paste every g-value! All you need is your planet_g_dict and the name of a planet!

Dictionaries have some useful methods like the .keys() and .values() methods shown below:

# Return all keys or values in a dict
print(planet_g_dict.keys())

print(planet_g_dict.values())
dict_keys(['mercury', 'venus', 'earth', 'mars', 'jupiter', 'saturn', 'uranus', 'neptune'])
dict_values([3.61, 8.83, 9.80665, 3.75, 26.0, 11.2, 10.5, 13.3])

Note that these aren’t regular lists, they are special dict_keys() and dict_values() objects. But it’s pretty straightforward to convert them to the lists we know and love:

# Use `list` function to get a list of keys
planets = list(planet_g_dict.keys())
print(planets)
['mercury', 'venus', 'earth', 'mars', 'jupiter', 'saturn', 'uranus', 'neptune']

Keys are case sensitive!

Be forewarned: dictionary keys are case-sensitive!

planet_g_dict['Earth']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_22015/2417589539.py in <module>
----> 1 planet_g_dict['Earth']

KeyError: 'Earth'

To make sure we don’t get an error if a user tries to pass in a Capitalized (or, heaven-forbid. an ALL CAPS) key, we can convert input strings to lowercase with the str.lower() method:

planet_name = 'Mars'
planet_name_lower = planet_name.lower()  # Convert to lowercase
print(planet_name_lower)
print(planet_g_dict[planet_name_lower])  # All good now :)
mars
3.75

[Shortcut] Zip up two lists into a dict

Say we start with two lists one with keys and the other with values that we want to turn into a dict.

The long (but completely valid) way of doing this would be by adding each key-value pair to an empty dict using a loop (we’ll cover loops in the next section!):

planets = ['mercury', 'venus', 'earth', 'mars', 'jupiter', 'saturn', 'uranus', 'neptune']
planet_g = [3.61, 8.83, 9.80665, 3.75, 26.0, 11.2, 10.5, 13.3]
planet_g_dict = {}
for i in range(len(planets)):
    planet_g_dict[planets[i]] = planet_g[i]
planet_g_dict
{'mercury': 3.61,
 'venus': 8.83,
 'earth': 9.80665,
 'mars': 3.75,
 'jupiter': 26.0,
 'saturn': 11.2,
 'uranus': 10.5,
 'neptune': 13.3}

To simplify this process, we can use a Python function called zip. Let’s see what happens when we zip our two lists:

zip(planets, planet_g)
<zip at 0x7f80b45979c0>

Ok we got a zip object… but what’s in there? Like before, let’s convert this weird object to a list:

list(zip(planets, planet_g))
[('mercury', 3.61),
 ('venus', 8.83),
 ('earth', 9.80665),
 ('mars', 3.75),
 ('jupiter', 26.0),
 ('saturn', 11.2),
 ('uranus', 10.5),
 ('neptune', 13.3)]

Now we can see it paired up each planet with the associated g-value from the two lists!

Now what happens if we use the dict function instead of list?

planet_g_dict = dict(zip(planets, planet_g))
planet_g_dict
{'mercury': 3.61,
 'venus': 8.83,
 'earth': 9.80665,
 'mars': 3.75,
 'jupiter': 26.0,
 'saturn': 11.2,
 'uranus': 10.5,
 'neptune': 13.3}

Now we have a really quick way to make a dictionary from two associated lists!

Global Variables

In Python, any variable that is outside of a function is called a global variable. While it’s good practice to avoid global variables wherever possible, there is an important style guide to remember if you use globals:

  • Always name global variables in ALL_CAPS.

Using the ALL_CAPS convention helps us remember the global variable was defined somewhere different from our local variables.

Let’s define our planet_g_dict as a global variable and use it in our free-fall distance function from last week:

# Global variables should always be ALL CAPS
PLANETS = ['mercury', 'venus', 'earth', 'mars', 'jupiter', 'saturn', 'uranus', 'neptune']
PLANET_G = [3.61, 8.83, 9.80665, 3.75, 26.0, 11.2, 10.5, 13.3]
PLANET_G_DICT = dict(zip(planets, planet_g))


# Define our free-fall function with planet name as a parameter
def free_fall_dist_on_planet(t, planet='mars'):
    """
    Free fall distance given time t on given planet.

    Args:
        t (float or np.array): 
            Time in [s].
        planet (str, optional): 
            Planet to calculate the free-fall height for. Default: 'mars'.

    Returns:
        float or np.array: Free fall distance in [m].
    """
    planet = planet.lower()  # convert planet name to lower case
    g = PLANET_G_DICT[planet]  # get g from global dict
    H = 0.5 * g * t**2
    return H

# Call the function
print(free_fall_dist_on_planet(10, 'earth'))
print(free_fall_dist_on_planet(10, 'Neptune'))  # Works with caps now!
print(free_fall_dist_on_planet(10, 'VENUS'))  # Even all caps!
490.3325
665.0
441.5

Python for loops

A for loop cycles through each element in a list, array, tuple, or other iterable and runs some commands for each one.

The anatomy of a for loop is:

  • The for (it’s in the name after all)

  • The loop variable (this is what will change each cycle)

  • The iterable (the thing we’re looping through)

  • An indented block (the commands we want to run multiple times in the loop)

See more on Ch. 3.7 Loops

# Recall our planets list
planets = ['mercury', 'venus', 'earth', 'mars', 'jupiter', 'saturn', 'uranus', 'neptune']

# Below, "planet" is the loop variable and will be replaced by each planet in the list
for planet in planets:
    # Everything indented here will be run each time in the loop
    print('Hello', planet)
Hello mercury
Hello venus
Hello earth
Hello mars
Hello jupiter
Hello saturn
Hello uranus
Hello neptune

Sometimes it’s more useful to have the index that we are on in the loop, rather than the actual value in the list.

E.g. 0, then 1, then 2… rather than mercury, venus, earth

We often use the function range() with for loops to get at the current index (generally called i).

range(10)
range(0, 10)

To see every element in this range object, we can again convert it to a list:

list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The main uses of range are:

  • range(val): all integers from 0 to val-1

  • range(start_val, stop_val): all integers from start_val to stop_val-1

  • range(start, stop, step): all integers from start to stop-1 counting by step

# Print all odd numbers from 1 to 20
print('Without range:')
odd_list = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
for i in odd_list:
    print(i, end=' ')  # end=' ' prevents newline

print('\nWith range:')
for i in range(1, 20, 2):
    print(i, end=' ')
Without range:
1 3 5 7 9 11 13 15 17 19 
With range:
1 3 5 7 9 11 13 15 17 19 

Be mindful of whether you are using the index (e.g. with range) or looping through items directly when you make a for loop. Often we can get at the same info both ways

For example:

# Loop over planet names in the planets list
print('Loop over names')
planets = list(planet_g_dict.keys())
for planet in planets:
    g = planet_g_dict[planet]  # get the g value for each planet
    print(f'The g value for {planet:>8} is about {g:5.2f} m/s^2')


# Loop over the index of the planets list
print('\nLoop over indices:')
for i in range(len(planets)):
    planet = planets[i]  # get the planet name
    g = planet_g_dict[planet]  # get the g value for each planet
    print(f'The g value for {planet:>8} is about {g:5.2f} m/s^2')
Loop over names
The g value for  mercury is about  3.61 m/s^2
The g value for    venus is about  8.83 m/s^2
The g value for    earth is about  9.81 m/s^2
The g value for     mars is about  3.75 m/s^2
The g value for  jupiter is about 26.00 m/s^2
The g value for   saturn is about 11.20 m/s^2
The g value for   uranus is about 10.50 m/s^2
The g value for  neptune is about 13.30 m/s^2

Loop over indices:
The g value for  mercury is about  3.61 m/s^2
The g value for    venus is about  8.83 m/s^2
The g value for    earth is about  9.81 m/s^2
The g value for     mars is about  3.75 m/s^2
The g value for  jupiter is about 26.00 m/s^2
The g value for   saturn is about 11.20 m/s^2
The g value for   uranus is about 10.50 m/s^2
The g value for  neptune is about 13.30 m/s^2

Plotting using axis objects

Here we will introduce the concept of axes in matplotlib. These are useful for making subplots and formatting specific portions of your plot area.

See matplotlib color palette sheet for a list of colors we can use in matplotlib.

import numpy as np
import matplotlib.pyplot as plt

Below is a simple example of adding axes to a matplotlib Figure object.

  1. Make the Figure with plt.figure

  2. Add subplot axes to the Figure with plt.subplot

  3. Plot different data on each subplot axis (called ax1 and ax2)

  4. Label each subplot axis also through ax1 and ax2

# Make the main Figure and color it gray
f = plt.figure(facecolor='tab:gray', figsize=(8,4))
f.subplots_adjust(wspace=0.4)

# Add in subplot axes (give them different colors so we can find them)
ax1 = plt.subplot(121, facecolor='tab:red')
ax2 = plt.subplot(122, facecolor='tab:blue')

x1 = np.linspace(0, 100, 1000)

# Plot on subplot 1
ax1.plot(x1, x1**2, lw=2, color='k')
ax1.set_xlabel('x')
ax1.set_ylabel('$y=x^2$')

# Plot on subplot 2
ax2.plot(x1, -x1**2, lw=2, color='k')
ax2.set_xlabel('x')
ax2.set_ylabel('$y=-x^2$')
plt.show()
_images/f21_a2_35_0.png

Below we show an example using a for loop to plot different free-fall curves for each of our planets.

Note: we only need to set the axis labels once and put the .plot() part in the loop.

x = np.arange(0, 100, 5) # falling time [s]

# --- plotting setup ----
f = plt.figure(facecolor='white', figsize=(4,3), dpi=250)
ax1 = plt.subplot(111)


# --- plotting all free-fall curves for SS objects (planets) ----
for target in planets:
    y = free_fall_dist_on_planet(x, target)
    g = planet_g_dict[target]
    ax1.plot(x, y/1e3, '-', lw=1, label = f'g={g:05.2f} ({target})')
    

# --- plotting adjustments ---   
ax1.tick_params(axis='both', which='both', labelsize='small', right=True, top=True, direction='in')   
ax1.set_xlabel('Time [sec]', size='medium', fontname='Helvetica')
ax1.set_ylabel('Distance [km]', size='medium', fontname='Helvetica')
ax1.legend(fontsize=6, ncol=2)
plt.show()
findfont: Font family ['Helvetica'] not found. Falling back to DejaVu Sans.
_images/f21_a2_37_1.png

Polynomial fitting using np.polyfit

We will explore simple fitting in Python using NumPy polyfit.

For a more in depth video tutorial, see NumPy Polyfit Tutorial (7 mins).

Note: We use np.polyfit() because it is a simple function, but NumPy recommends using the more advanced numpy.polynomial.polynomial library instead. You can read more about it here.

Like with any function, we can read the documentation right from Python with help():

help(np.polyfit)

Below we define some arrays that you can think of as x and y measurements from an experiment.

x = np.array([0, 3.448, 6.897, 10.345, 13.793, 17.241, 20.69, 24.138, 27.586, 31.034, 34.483, 37.931, 41.379, 44.828, 48.276, 51.724, 55.172, 58.621, 62.069, 65.517, 68.966, 72.414, 75.862, 79.31, 82.759, 86.207,  89.655,  93.103, 96.552, 100])
y = np.array([3.403, 12.761, 22.422, 34, 43.701, 58.525, 67.867, 71.998, 87.14, 86.822, 130.066, 79.842, 104.269, 119.987, 153.925, 86.582, 273.286, 119.237, 211.643, 270.858, 313.429, 253.465, 375.543,  65.478, 353.615, 346.736, 434.706, 343.462, 358.991, 477.203])
y_err = np.array([0.3, 1.334, 2.369, 3.403, 4.438, 5.472, 6.507, 7.541, 8.576, 9.61, 35.483, 38.931, 42.379, 45.828, 49.276, 52.724, 56.172, 59.621, 63.069,  66.517, 104.948, 110.121, 115.293, 120.466, 125.638, 130.81 , 135.983, 141.155, 146.328, 151.5])

We can plot x, y, and the y-errors y_err using the ax.errorbar() function.

f = plt.figure(facecolor='white', figsize=(4,3), dpi=200 )
ax1 = plt.subplot(111)

ax1.errorbar(x, y, yerr=y_err, fmt='.', ms=5, lw=0.5)
      
ax1.tick_params(axis='both', which='both', labelsize='small', right=True, top=True, direction='in')   
ax1.set_xlabel('x', size='medium', style='normal', family='sans-serif', fontname='Helvetica')
ax1.set_ylabel('y', size='medium', style='normal', family='sans-serif', fontname='Helvetica')
plt.show()
_images/f21_a2_43_0.png

To fit a line to our data, we can use np.polyfit(x, y, order), where order is the highest exponent in the polynomial we’re fitting.

# Fit a 1st order (linear) polynomial to our x, y data
fit_coeffs = np.polyfit(x, y, 1)

print(fit_coeffs)
[  4.1371426  -28.15839687]

NumPy polyfit returns the coefficients of the of the fit from highest to lowest order. For example, our linear fit gave us the following equation of a line:

print(f'y = {fit_coeffs[0]:1.2f}x + {fit_coeffs[1]:1.2f}')
y = 4.14x + -28.16

Convert polyfit coefficients to a function using np.poly1d

Another convenient NumPy function is np.poly1d which takes our polyfit coefficients and turns them into a function. This lets us create a new array from the equation of the line with any x values we choose.

# --- plotting setup ----
f = plt.figure(facecolor='white', figsize=(4,3), dpi=200)
ax1 = plt.subplot(111)

# --- plot the data with error ----
ax1.errorbar(x, y, yerr=y_err, fmt='.', ms=5, lw=0.5)

# --- fitting ---
coeffs = np.polyfit(x, y, 1)  # These are the linear coefficients
fit_linear = np.poly1d(coeffs)  # This returns a function object

# --- plot the fitted line ---
ax1.plot(x, fit_linear(x), '-', lw=1)  # We can call our function to draw the line
    
# --- plotting adjustments ---   
ax1.tick_params(axis='both', which='both', labelsize='small', right=True, top=True, direction='in')   
ax1.set_xlabel('x', size='medium', style='normal', family='sans-serif', fontname='Helvetica')
ax1.set_ylabel('y', size='medium', style='normal', family='sans-serif', fontname='Helvetica')
plt.show()
_images/f21_a2_49_0.png

Weighting a fit by its errors

A final useful parameter in np.polyfit is the weights parameter, w. This allows us to weight a fit by its errors.

In our case, we want to weight points with the smallest errors the highest in our fit so we supply w=1/y_err to np.polyfit().

# --- plotting setup ----
f = plt.figure(facecolor='white', figsize=(4,3), dpi=250 )
ax1 = plt.subplot(111)

# --- plot the data with error ----
ax1.errorbar(x, y, yerr=y_err, fmt='.', ms=5, lw=0.5, label='Data')

# --- fitting ---
coeffs = np.polyfit(x, y, 1)
fit_linear = np.poly1d(coeffs)

# --- fitting with error as weight ---
coeffs_w = np.polyfit(x, y, 1, w=1/y_err)
fit_linear_w = np.poly1d(coeffs_w)

ax1.plot(x, fit_linear(x), '-', lw=1, label='fitting w/o error considered')
ax1.plot(x, fit_linear_w(x), '-', lw=1, label='fitting w/  error considered')
    
    
ax1.tick_params(axis='both', which='both', labelsize='small', right=True, top=True, direction='in')   
ax1.set_xlabel('x', size='medium', style='normal', family='sans-serif', fontname='Helvetica')
ax1.set_ylabel('y', size='medium', style='normal', family='sans-serif', fontname='Helvetica')
ax1.legend(fontsize = 'xx-small', loc = 'upper left')
plt.show()
_images/f21_a2_51_0.png

[Assignment] Determine the planet based on free-fall observations

Below are 30 data points taken from an object falling on a unknown Solar System planet by measuring the free-fall distance [m] and time [s]. Each data of the falling distance is an average value from 10 different measurements and their standard deviation is given as error, fall_distance_err [m].

  1. Use np.polyfit() to fit the data with a quadratic (2nd order polynomial) function with no weights (no w= parameter)

  2. Obtain the g value from your fit coefficients.

  3. Print the name of the most likely planet given g with the unweighted fit.

  4. Plot the data with errorbars as well as your fit.

  5. Make a new np.polyfit() fit this time weighted by y_err (use the w= parameter).

  6. Obtain the g value from your weighted fit.

  7. Print the name of the most likely planet given g with the weighted fit.

  8. Did you get a different answer?

Make sure to label your plot axes and units. For extra practice, customize your plot to your liking and include the free fall equation using the ax.annotate() function.

Free-fall equation:

\[H = \frac{1}{2} g t^2\]
time_steps = np.array([20, 27.966, 35.931, 43.897, 51.862, 59.828, 67.793, 75.759, 83.724, 91.69, 99.655, 107.621, 115.586, 123.552, 131.517, 139.483, 147.448, 155.414, 163.379, 171.345, 179.31, 187.276, 195.241, 203.207, 211.172, 219.138, 227.103, 235.069, 243.034, 251])
fall_distance = np.array([2798.322, 4543.523, 5459.432, 11554.559, 15958.431, 20023.776, 19146.256, 22765.371, 47183.159, 47167.289, 22978.494, 66253.599, 63625.642, 91050.12, 116941.625, 143460.073, 106462.323, 142584.887, 199564.683, 83593.839, 158030.907, 205442.175, 206733.665, 241555.039, 236078.303, 240508.665, 311193.102, 298704.903, 339502.307, 438338.605])
fall_distance_err = np.array([448, 875.919, 1445.964, 2158.136, 3012.435, 4008.861, 5147.413,  6428.093,  7850.899,  9415.832, 18538.153, 21620.131, 24938.986, 28494.72 , 32287.332, 36316.821, 40583.189, 45086.435, 49826.558, 54803.56 , 18005.232, 19640.459, 21346.75 , 23124.104, 24972.521, 26892.002, 28882.547, 30944.154, 33076.825, 35280.56])
# [Your answer and plots here]