Question: Problem 2.1. While we can formalize the likelihood function, there is no closed-form expression for the coefficients $\beta_0, \beta_1$ maximizing the above log-likelihood in Problem 1. Hence, we will use an iterative algorithm to solve for the coefficients. We can see that

$$\max_{\beta_0,\beta_1} \left( -\sum_{i=1}^{m} \ln\left(1 + e^{-y_i(\beta_0 + \beta_1 x_i)}\right) \right) = \min_{\beta_0,\beta_1} \sum_{i=1}^{m} \ln\left(1 + e^{-y_i(\beta_0 + \beta_1 x_i)}\right)$$

We will describe our loss function as $L = \frac{1}{m} \sum_{i=1}^{m} \ln\left(1 + e^{-y_i(\beta_0 + \beta_1 x_i)}\right)$. Our objective is to iteratively decrease this loss as we keep computing the optimal coefficients. Here $x_i \in \mathbb{R}$.

In this problem we will be working with real image data, where the goal is to classify whether the image is a 0 or a 1 using logistic regression. The input $X \in \mathbb{R}^{m \times d}$ is a matrix with dimensions $[m \times d]$, where a single data point $x_i \in \mathbb{R}^d$ with $d = 784$. The labels matrix is $Y \in \mathbb{R}^m$, where each label $y_i \in \{0, 1\}$.

- Load the data into memory and visualize one input as an image for each of label 0 and label 1. (The data should be reshaped back to $[28 \times 28]$ to be able to visualize it.)
- The data is between 0 and 255. Normalize the data to $[0, 1]$.
- Set $y_i = 1$ for images labeled 0 and $y_i = -1$ for images labeled 1. Split the data randomly into train and test with a ratio of 80:20. Why is random splitting better than sequential splitting in our case?
- Initialize the coefficients using a univariate normal (Gaussian) distribution with mean 0 and variance 1. (Remember that the coefficients are a vector $[\beta_0, \beta_1, \ldots, \beta_d]$, where $d$ is the dimension of the input.)
- Compute the loss using the above-mentioned loss $L$. (The loss can be written as $L = \frac{1}{m} \sum_{i=1}^{m} \ln\left(1 + e^{-y_i\left(\beta_0 + \sum_{j=0}^{d-1} \beta_{j+1} x_i^{(j)}\right)}\right)$, where $(i, j)$ index the $i$-th data point, $i \in \{1, 2, \ldots, m\}$, and the $j$-th dimension of the data point $x_i$, $j \in \{0, \ldots, d-1\}$.)
- To minimize the loss function, a widely known algorithm is to move in the direction opposite to the gradients of the loss function. (It is helpful to write the coefficients $[\beta_1, \ldots, \beta_d]$ as a vector $\beta$, and $\beta_0$ as a scalar. Now $\beta \in \mathbb{R}^d$ and $\beta_0 \in \mathbb{R}$.) We can write the gradients of the loss function as a matrix operation:

$$\frac{\partial L}{\partial \beta} = -\frac{1}{m} \sum_{i=1}^{m} \frac{e^{-y_i(\beta_0 + \beta^T x_i)}}{1 + e^{-y_i(\beta_0 + \beta^T x_i)}} \, y_i x_i$$

$$\frac{\partial L}{\partial \beta_0} = -\frac{1}{m} \sum_{i=1}^{m} \frac{e^{-y_i(\beta_0 + \beta^T x_i)}}{1 + e^{-y_i(\beta_0 + \beta^T x_i)}} \, y_i$$

Write a function to compute the gradients. (A vectorized sketch is given after the code template below.)
- Update the parameters as $\beta = \beta - 0.05 \cdot \frac{\partial L}{\partial \beta}$ and $\beta_0 = \beta_0 - 0.05 \cdot \frac{\partial L}{\partial \beta_0}$. (Gradient updates should be computed based on the train set.)
- Repeat the process for 50 iterations and report the loss after the 50th epoch.
- Plot the loss for each iteration for the train and test sets.
- Logistic regression is a classification problem. We classify as $+1$ if $P(Y = 1 \mid X) \ge 0.5$. Derive the classification rule for the threshold 0.5. (Not a programming question; a derivation sketch follows the code template below.)
- For the classification rule derived, compute the accuracy on the test set for each iteration and plot the accuracy.

The final code should be along this format:

import numpy as np
from matplotlib import pyplot as plt

def compute_loss(data, labels, B, B_0):
    return logloss

def compute_gradients(data, labels, B, B_0):
    return dB, dB_0

if __name__ == '__main__':
    x = np.load(data)
    y = np.load(label)
    ## Split the data into train and test
    x_train, y_train, x_test, y_test = ...  # split_data: 80:20 random split
    B = np.random.randn(1, x.shape[1])
    B_0 = np.random.randn(1)
    lr = 0.05
    for _ in range(50):
        ## Compute loss
        loss = compute_loss(x_train, y_train, B, B_0)
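For reference, here is a short sketch of the classification-rule derivation requested above. It assumes the standard logistic model $P(Y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta^T x)}}$, which is consistent with the loss defined above:

$$P(Y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta^T x)}} \ge \frac{1}{2} \iff 1 + e^{-(\beta_0 + \beta^T x)} \le 2 \iff e^{-(\beta_0 + \beta^T x)} \le 1 \iff \beta_0 + \beta^T x \ge 0$$

So the threshold-0.5 rule reduces to a sign test on the linear score: predict $\hat{y} = +1$ when $\beta_0 + \beta^T x \ge 0$, and $\hat{y} = -1$ otherwise.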
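Below is a minimal vectorized sketch that fills in the template, implementing the loss and gradient equations stated above. The file names mnist_x.npy and mnist_y.npy are assumptions (the assignment does not fix them), and this is one possible implementation, not the official solution.

import numpy as np
from matplotlib import pyplot as plt

def compute_loss(data, labels, B, B_0):
    # L = (1/m) * sum_i ln(1 + exp(-y_i * (B_0 + B . x_i)))
    margins = labels * (data @ B.ravel() + B_0)            # shape (m,)
    return np.logaddexp(0.0, -margins).mean()              # numerically stable ln(1 + e^{-margin})

def compute_gradients(data, labels, B, B_0):
    # dL/dB = -(1/m) sum_i [e^{-m_i} / (1 + e^{-m_i})] y_i x_i, with margin m_i;
    # note e^{-m_i} / (1 + e^{-m_i}) = 1 / (1 + e^{m_i})
    margins = labels * (data @ B.ravel() + B_0)
    w = labels / (1.0 + np.exp(margins))                   # y_i / (1 + e^{m_i}), shape (m,)
    dB = -(data * w[:, None]).mean(axis=0)[None, :]        # shape (1, d)
    dB_0 = -w.mean(keepdims=True)                          # shape (1,)
    return dB, dB_0

if __name__ == '__main__':
    x = np.load('mnist_x.npy').astype(np.float64) / 255.0  # hypothetical file name; normalize to [0, 1]
    y = np.load('mnist_y.npy')                             # hypothetical file name
    y = np.where(y == 0, 1.0, -1.0)                        # label 0 -> +1, label 1 -> -1

    # Random (not sequential) 80:20 split, in case the file is ordered by class.
    rng = np.random.default_rng(0)
    perm = rng.permutation(x.shape[0])
    cut = int(0.8 * x.shape[0])
    x_train, y_train = x[perm[:cut]], y[perm[:cut]]
    x_test, y_test = x[perm[cut:]], y[perm[cut:]]

    B = np.random.randn(1, x.shape[1])
    B_0 = np.random.randn(1)
    lr = 0.05

    train_loss, test_loss, test_acc = [], [], []
    for _ in range(50):
        train_loss.append(compute_loss(x_train, y_train, B, B_0))
        test_loss.append(compute_loss(x_test, y_test, B, B_0))
        # Classify as +1 when B_0 + B . x >= 0 (the rule derived above).
        preds = np.where(x_test @ B.ravel() + B_0 >= 0, 1.0, -1.0)
        test_acc.append(np.mean(preds == y_test))
        dB, dB_0 = compute_gradients(x_train, y_train, B, B_0)
        B = B - lr * dB
        B_0 = B_0 - lr * dB_0

    print('train loss after 50 iterations:', train_loss[-1])
    plt.plot(train_loss, label='train loss')
    plt.plot(test_loss, label='test loss')
    plt.plot(test_acc, label='test accuracy')
    plt.xlabel('iteration')
    plt.legend()
    plt.show()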
