Question: Homework 2 In this homework you are going to implement PCA from scratch We are going to use the following dataset for our examples *****

Homework 2

In this homework you are going to implement PCA from scratch

We are going to use the following dataset for our examples

*****

Dataset

import numpy as np import matplotlib.pyplot as plt np.random.seed(0) X = np.random.multivariate_normal(mean = np.array([2,3]), cov = np.array([[2,1],[1,1]]),size = 100) plt.plot(X[:,0],X[:,1],"*b") plt.grid()

****

Problem 1

First normalize the each variable (column) by subtracting its mean and dividing by it standart deviation as follows:

Homework 2 In this homework you are going to implement PCA from

your function should take input matrix X as input and should return normalized matrix, mean and standart deviation vector as output

def normalize_data(X): pass

# Problem 1 example X_norm, mu, sigma = normalize_data(X)

print(X_norm[0]) print(mu) print(sigma) plt.plot(X_norm[:,0],X_norm[:,1],"*b") plt.grid()

[-1.74865957 -1.32485349] [1.95492653 3.07587784] [1.43707517 1.03112934]

Problem 2

Find the eiegnvalues and eiegnvectors of the covariance matrix. You can use np.cov() function for covariance and numpy.linalg.eig() function to get eigen values.

def eigen(X): pass

# Problem 2 example

eigen_values, eigen_vectors = eigen(X)

print(eigen_vectors) print(eigen_values) plt.plot(X_norm[:,0],X_norm[:,1],"*b") plt.grid() plt.plot([0,eigen_vectors[0,0]],[0,eigen_vectors[1,0]],"r") plt.plot([0,eigen_vectors[0,1]],[0,eigen_vectors[1,1]],"r") plt.axis('square') plt.show()

[[ 0.84500497 -0.53475846] [ 0.53475846 0.84500497]] [2.76215622 0.39785666]

Problem 3

Using the following formulation calculate the transformed values which are coordinates on the new basis as follows

scratch We are going to use the following dataset for our examples

where B consists of eigenvectors corresponds the largest mm eigenvalues.

In the formulation above we assumed that columns corresponds to observations and rows corresponds to variables. So if your input matrix has rows as observations and columns as variables you may want to take the transpose of input matrix X as follows:

***** Dataset import numpy as np import matplotlib.pyplot as plt np.random.seed(0) X

def transform(X, eigen_values, eigen_vectors, m): pass

#Problem 3 example

X_transformed = transform(X_norm, eigen_values, eigen_vectors, 2) print(X_transformed[0])

[-2.18610263 -0.18439728]

Problem 4

Put all the steps together in a function. You will use matrix X and the number of components m as inputs and the transformed matrix as output

def pca(X, m): pass

# Problem 4 example from sklearn.datasets import load_iris

X = load_iris()["data"]

X_transformed = pca(X, 2) print(X_transformed[0])

[-2.26470281 -0.4800266 ]

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Homework In this homework you are going to implement functions necessary for linear regression problem. Consider the following dataset. ******* Dataset: import numpy as np import matplotlib.pyplot as...

Jupiter Notebook We have covered some of the limitations of single layer neural networks in class, but they are still powerful learning systems that provide a good way to begin learning about how to...

I need some serious help with this decision tree algorithm for my machine learning class. Be sure to read the code requirements below, because i need to write some parts from scratch. Any and all...

Hi, Can you please help me with assignment, I am failing to create the train_nn function. Please advise how I can get data to you, my previous efforts have failed. Tensorflow_NeuralNetworkspdf May 1,...

Write Python code to solve this homework in detail with comments. eg of csv file contain: AREA Description AGR The course aims to introduce Rules and Regulations that are designated for undergraduate...

****HW2 (question and solution) for the dataset. In HW2 the dataset used is from HW1 both the questions are listed below. (question and solution). All the HWs are linked. there is no missing...

Answer using sci-kit learn here is the dataset https://www.kaggle.com/rush4ratio/video-game-sales-with-ratings In this part, you will work with scikit-learn, an industry standard package for machine...

We can use the NLTK stopwords as below: [ ] import nltk from nltk.corpus import stopwords nltk.download('stopwords') sw_list = stopwords.words('english') sw_list[:10] # show some examples [nltk_data]...

Code: Nearest neighbor for handwritten digit recognition In this notebook we will build a classifier that takes an image of a handwritten digit and outputs a label 0-9. We will look at a particularly...

Suppose you are tasked with creating a program to provide the data for a housing market using machine learning methods. The data submitted to you are grouped in the two text files "test.txt" and...

Open the Excel file Store and Regional Sales database. a. Sort the data by units sold, high to low b. Sort the units sold using an icon set, where green corresponds to high sales levels, yellow to...

For the salary numbers, separated according to gender: a. Construct a histogram for just the males. b. Construct a histogram for just the females using the same scale as in part a to facilitate...

Every member of the File class is static - meaning methods are called _ _ _ _ . a . using the static keywordb.with the class namec.with the file named.with an object name

SIMAD UNIVERSITY Class: BACC25 Subject: Islamic Accounting Instructions: a) Follow The Instructions. Midterm Exam Instructor: All Ibrahim Date: 6-4-2022 b) You Have 1.5 Hrs. To Complete This Test. c)...

Analyse and critically Evaluate Dorian executives decisions concerning the selection of crew, by examining the cultural compatibility.

What expatriate behavior is best for Japanese managers who are sent to Germany?

Identify the difficulties in communication between bidder managers and target employees that are apparent in the case study?