Question: In this programming assignment, please write code from scratch (do not use preexisting libraries from
anywhere) for (1) linear models and (2) the gradient descent algorithm.
Task 1. Classifying MNIST data
Use the MNIST data provided along with this assignment. There are two files: MNIST_training_HW1.csv
and MNIST_test_HW1.csv. In those two datasets, there are 95 and 50 samples respectively for each
label of 0 and 1. Consequently, we will solve a binary classification problem.
You will train a linear regression model using the training data (MNIST_training_HW1.csv) and will
compute accuracy with the test data (MNIST_test_HW1.csv).
For Task 1, please follow the procedure:
1. Train a linear regression model
Find the optimal parameters (\theta) by the Normal Equation: \theta = (X^T X)^{-1} X^T y. Computing this
directly will cause an error/warning because X^T X is not invertible here.
You can use numpy.linalg.pinv (or pinv from MATLAB) for the matrix inverse. The function
computes the Moore-Penrose pseudoinverse X^+, which will further be used to find valid optimal
parameters using \theta^* = X^+ y. This will be a solution to the rank-deficient (degenerate) system.
2. Display the optimal coefficients (denoted by \theta^*)
3. Classify the (binary) test data (MNIST_test_HW1.csv) with a threshold of 0.5 as described below:
y_pred = X_test \theta^*
if y_pred > 0.5, predict class 1; otherwise class 0
4. Display the % accuracy. (A minimal code sketch for steps 1-4 appears after this list.)
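As a concrete illustration of steps 1-4, here is a minimal Python sketch. It assumes the CSV files have no header row and store the 0/1 label in the first column with the pixel values after it; adjust the loading code if the layout differs.

import numpy as np

def load_mnist_csv(path):
    # Assumption: no header row, label in column 0, pixel features afterwards.
    data = np.loadtxt(path, delimiter=",")
    y = data[:, 0]
    X = data[:, 1:]
    # Prepend a bias (intercept) column of ones.
    return np.hstack([np.ones((X.shape[0], 1)), X]), y

X_train, y_train = load_mnist_csv("MNIST_training_HW1.csv")
X_test, y_test = load_mnist_csv("MNIST_test_HW1.csv")

# Step 1: normal equation via the Moore-Penrose pseudoinverse,
# theta* = X^+ y (a plain inverse of X^T X would fail: rank-deficient).
theta_star = np.linalg.pinv(X_train) @ y_train

# Step 2: display the optimal coefficients.
print("theta*:", theta_star)

# Steps 3-4: classify with the 0.5 threshold and report % accuracy.
y_pred = (X_test @ theta_star > 0.5).astype(int)
print("Accuracy: {:.2f}%".format(np.mean(y_pred == y_test) * 100))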
Task 2. Implementation of Gradient Descent with MNIST data
For Task 2, we will use the same data as Task 1. However, we will find the optimal coefficients by using the
"Gradient Descent" algorithm and then compare them with the solution that we found in Task 1.
The procedure of Task 2 is almost the same as Task 1, but you need to implement the "Gradient Descent"
algorithm, instead of a least-squares single-line solution such as (X^T X)^{-1} X^T y or,
more appropriately, pinv.
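For reference, assuming the standard squared-error cost (the assignment does not spell out the cost function; this is what the learning curve in step 3 would plot), each gradient-descent iteration uses:

J(\theta) = \frac{1}{2m} \lVert X\theta - y \rVert^2, \qquad \nabla J(\theta) = \frac{1}{m} X^T (X\theta - y), \qquad \theta \leftarrow \theta - \alpha \nabla J(\theta)

where m is the number of training samples.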
For the Gradient Descent algorithm, please follow the procedure:
1. Set the initial coefficients to zeros (they can be any random values, though)
Think about what the dimension of the coefficient vector should be.
2. Determine hyperparameters such as the learning rate (\alpha) and the number of iterations (k).
3. Run "gradient descent" algorithm with the hyperparameters and check "Learning Curve" as
shown:
* Learning curve shows whether it converges or not. Xaxis shows the number of iterations,
while yaxis shows cost (J).
* Learning curve must be showing as "converged", otherwise the solution may not be good.
4. Display the estimated coefficients (denoted by \hat{\theta})
5. Classify the test data (MNIST_test_HW1.csv) with a threshold of 0.5 as described below:
y_pred = X_test \hat{\theta}
if y_pred > 0.5, predict class 1; otherwise class 0
6. Display the % accuracy.
7. Display the aggregate difference between \theta^* and \hat{\theta}, i.e., the sum of the absolute
element-wise differences \sum_j |\theta^*_j - \hat{\theta}_j|. (A gradient-descent code sketch covering these
steps appears after this list.)
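Here is a minimal Python sketch of the gradient-descent procedure, reusing X_train, y_train, X_test, y_test, and theta_star from the Task 1 sketch. The squared-error cost above is assumed, and alpha = 1e-6 with k = 1000 iterations are only placeholder hyperparameters; with raw 0-255 pixel values, a small learning rate or feature scaling is usually needed for the learning curve to converge.

import numpy as np
import matplotlib.pyplot as plt

def gradient_descent(X, y, alpha, k):
    m, n = X.shape
    theta = np.zeros(n)            # step 1: start from all-zero coefficients
    costs = []
    for _ in range(k):
        error = X @ theta - y
        costs.append(error @ error / (2 * m))      # cost J(theta) before the update
        theta = theta - alpha * (X.T @ error) / m  # gradient step
    return theta, costs

# Step 2: placeholder hyperparameters -- tune them until the learning curve flattens.
alpha, k = 1e-6, 1000
theta_hat, costs = gradient_descent(X_train, y_train, alpha, k)

# Step 3: learning curve (cost J versus iteration number).
plt.plot(costs)
plt.xlabel("iteration")
plt.ylabel("cost J")
plt.title("Learning curve")
plt.show()

# Step 4: estimated coefficients.
print("theta_hat:", theta_hat)

# Steps 5-6: classify the test data and report % accuracy.
y_pred = (X_test @ theta_hat > 0.5).astype(int)
print("Accuracy: {:.2f}%".format(np.mean(y_pred == y_test) * 100))

# Step 7: aggregate difference between theta* and theta_hat.
print("Aggregate difference:", np.sum(np.abs(theta_star - theta_hat)))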
Submission:
Please submit the following to the D2L by the stipulated deadline therein:
1. Summary: an MS Word or PDF file summarizing your results and discussion only.
Describe what you did, and how, as well as your important results.
2. Code: one original, executable Python/MATLAB/Octave/R/Java/C++ source code file, or a zip file if
there are multiple code files, with a clear ReadMe explaining how to run it.
The code must be well organized (comments, indentation, ...).
3. Code PDF: also upload the PDF version of the original source code file(s), in case we would just like
to have a look at the code but don't want to run it.
Please submit these THREE files SEPARATELY. DO NOT compress them into a ZIP file.
