In [None]:
"""RUN THIS CELL"""
import matplotlib.pyplot as plt
from math import sqrt
import numpy as np

# Activity 15 - Applications of Least Squares

### Least Squares

In this problem we will be finding polynomial curves of best fit.  Consider the following data.

|$x$ | $y$|
|----|----|
|-3  | 56 |
|-2  | 40 |
|-1  | 21 |
|0   | 14 |
|1   | 8  |
|5   | -9 |
|6   | -1 |

Let's begin by plotting the data.

In [None]:
x = [-3,-2,-1,0,1,5,6]
y = [56,40,21,14,8,-9,-1]
plt.plot(x,y,'ro')
plt.show()

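To find the line of best fit $ y=mx+b $, the first step is to set up the coefficient matrix. Each data point $(x_i, y_i)$ contributes the equation $mx_i + b = y_i$, so each row of the coefficient matrix is $[x_i \;\; 1]$. It is entered here.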

In [None]:
A = np.matrix([[-3,1],[-2,1],[-1,1],[0,1],[1,1],[5,1],[6,1]])

Instead of the projection method, we will use a more specialized method called the pseudoinverse method.  This method requires two things:
1. The matrix must be tall and narrow (more rows than columns),
2. The columns must be linearly independent.
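
Both requirements can be checked numerically before going any further. The following is a minimal sketch, assuming the same coefficient matrix as above; the names `A_check`, `is_tall`, and `has_independent_columns` are illustrative and are chosen so that they do not overwrite anything else in this notebook.

```python
import numpy as np

# Same x-values as the data table; each row of the matrix is [x_i, 1]
x_data = [-3, -2, -1, 0, 1, 5, 6]
A_check = np.array([[xi, 1] for xi in x_data], dtype=float)

rows, cols = A_check.shape

# Requirement 1: more rows than columns
is_tall = rows > cols

# Requirement 2: the columns are linearly independent exactly when
# the rank of the matrix equals the number of columns
has_independent_columns = np.linalg.matrix_rank(A_check) == cols

print(is_tall, has_independent_columns)
```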

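The first step is to multiply both sides of the equation $ Ax = b $ by $ A^T $, where the unknown vector holds the slope and intercept and the right-hand side is the column of $y$-values. The original system is usually inconsistent, but this converts it into a consistent square system $$ A^TAx = A^Tb $$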

In [None]:
b = np.matrix(y).T
ATA = A.T*A
ATb = A.T*b
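
For a line fit, the entries of $ A^TA $ and $ A^Tb $ are simply sums over the data, which gives a quick way to sanity-check the cell above. This is a sketch using plain NumPy arrays and the `@` operator rather than `np.matrix`; all names ending in `_arr` or `_sums` are illustrative.

```python
import numpy as np

x_data = np.array([-3, -2, -1, 0, 1, 5, 6], dtype=float)
y_data = np.array([56, 40, 21, 14, 8, -9, -1], dtype=float)

# Coefficient matrix with rows [x_i, 1]
A_arr = np.column_stack([x_data, np.ones_like(x_data)])

# Normal-equation matrices computed by matrix multiplication
ATA_arr = A_arr.T @ A_arr
ATb_arr = A_arr.T @ y_data

# The same quantities written directly as sums over the data
ATA_sums = np.array([[np.sum(x_data**2), np.sum(x_data)],
                     [np.sum(x_data),    len(x_data)]])
ATb_sums = np.array([np.sum(x_data * y_data), np.sum(y_data)])

print(ATA_arr)
print(ATb_arr)
print(np.allclose(ATA_arr, ATA_sums), np.allclose(ATb_arr, ATb_sums))
```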

Next, we will solve this system by multiplying both sides by the inverse of $ A^TA $.  If the columns of $ A $ are not linearly independent (which is always the case when $ A $ has more columns than rows), then $ A^TA $ will not be invertible and we would have to solve the system using row operations instead.

In [None]:
x_ls = np.linalg.inv(ATA)*ATb
print(x_ls)
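
As a cross-check, NumPy can reach the same least-squares solution by other routes. The sketch below assumes the same data and compares three of them: solving the normal equations with `np.linalg.solve` (which avoids forming the inverse explicitly), applying the pseudoinverse from `np.linalg.pinv`, and calling `np.linalg.lstsq`. The variable names are illustrative.

```python
import numpy as np

x_data = np.array([-3, -2, -1, 0, 1, 5, 6], dtype=float)
y_data = np.array([56, 40, 21, 14, 8, -9, -1], dtype=float)
A_arr = np.column_stack([x_data, np.ones_like(x_data)])

# Route 1: solve the normal equations A^T A x = A^T b directly
coef_normal = np.linalg.solve(A_arr.T @ A_arr, A_arr.T @ y_data)

# Route 2: multiply b by the pseudoinverse of A
coef_pinv = np.linalg.pinv(A_arr) @ y_data

# Route 3: NumPy's built-in least-squares solver
coef_lstsq, *_ = np.linalg.lstsq(A_arr, y_data, rcond=None)

print(coef_normal)
print(np.allclose(coef_normal, coef_pinv), np.allclose(coef_normal, coef_lstsq))
```

All three should agree with `x_ls` up to rounding: a slope of about -5.96 and an intercept of about 23.54.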

Now let's add this line to our graph.

In [None]:
# x and y values for the line of best fit: 91 points from x = -3 to x = 6
line_x = [0.1*t-3 for t in range(91)]
# x_ls[0,0] is the slope m and x_ls[1,0] is the intercept b
line_y = [x_ls[0,0]*t+x_ls[1,0] for t in line_x]
# original data
x = [-3,-2,-1,0,1,5,6]
y = [56,40,21,14,8,-9,-1]
# both graphed together
plt.plot(x,y,'ro',line_x,line_y)
plt.show()
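
One way to confirm that the plotted line really is the least-squares solution: rearranging $ A^TAx = A^Tb $ gives $ A^T(b - Ax) = 0 $, so the residual $ b - Ax $ must be orthogonal to every column of $ A $. The following is a short numerical check, assuming the same data; the names are illustrative.

```python
import numpy as np

x_data = np.array([-3, -2, -1, 0, 1, 5, 6], dtype=float)
y_data = np.array([56, 40, 21, 14, 8, -9, -1], dtype=float)
A_arr = np.column_stack([x_data, np.ones_like(x_data)])

# Least-squares coefficients (slope, intercept) from the normal equations
coef = np.linalg.solve(A_arr.T @ A_arr, A_arr.T @ y_data)

# Residual: vertical distances between the data and the fitted line
residual = y_data - A_arr @ coef

# A^T times the residual should be zero up to floating-point rounding
orthogonal = np.allclose(A_arr.T @ residual, 0.0)

# Sum of squared residuals, the quantity that least squares minimizes
sum_sq = np.sum(residual**2)

print(orthogonal, sum_sq)
```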

**Exercise:**  Now use this method to find the parabola of best fit for the same data.