Write down the updating equation in SGD for w and b, for both unregularized logistic regression...

Question:

Transcribed Image Text:

Write down the updating equation in SGD for $w$ and $b$, for both unregularized logistic regression ([5 points]) and regularized logistic regression ([5 points]). In particular, at iteration $t$, using one random data point $(x_i, y_i)$, where $x_i = [x_{i1}, \ldots, x_{id}]$ and $y_i \in \{0, 1\}$ is the label, how do we compute $w^{t+1}$ and $b^{t+1}$ from $w^t$ and $b^t$?

(Q2) [20 points] For step sizes $\eta = \{0.001, 0.01, 0.05, 0.1, 0.5\}$ and without regularization, implement Stochastic Gradient Descent (without cross-validation).

a. [5 points] Report the number of iterations (epochs) needed until convergence for every step size $\eta_i$ (fill in the table below in your report).
b. [5 points] Report the L2 norm of vector $w$ after 100 iterations (it may converge faster) for each step size $\eta_i$.
c. [5 points] Report the accuracy score in prediction of the training data after 100 iterations (it may converge faster) for each step size $\eta_i$.
d. [5 points] Plot the cross-entropy function value with respect to the number of steps ($T = [1, \ldots, 100]$) for the training data for each step size. Remember: it may converge before 100 iterations.

Fill in the table below in your report.

| $\eta$ (no regularization) | # of epochs (iterations) until convergence | L2 norm of weights | Accuracy score on training data |
|---|---|---|---|
| 0.001 | | | |
| 0.01 | | | |
| 0.05 | | | |
| 0.1 | | | |
| 0.5 | | | |

(Q3) [45 points] For step sizes $\eta = \{0.001, 0.01, 0.05, 0.1, 0.5\}$ and with regularization coefficients $\lambda = \{0, 0.05, 0.1, 0.15, \ldots, 0.5\}$, do the following:

a. [5 points] Given $\lambda = 0.1$, plot the cross-entropy function value with respect to the number of steps ($T = [1, \ldots, 100]$) for the training data for each step size $\eta_i$. Remember: it may converge before 100 iterations.
b. [5 points] Given $\eta = 0.01$, report the number of iterations (epochs) needed until convergence for every regularization coefficient $\lambda_i$ (fill in the table below in your report).
c. [5 points] Given $\eta = 0.01$, report the L2 norm of vector $w$ after convergence for each regularization coefficient $\lambda_i$ (fill in the table below).
d. [5 points] Given $\eta = 0.01$, report the accuracy score in prediction of the training data after 100 iterations (it may converge faster) for every regularization coefficient $\lambda_i$ (fill in the table below).
e. [25 points] Plot the cross-entropy function value at $T = 100$ (or at convergence) for different regularization coefficients $\lambda_i$, for both the training and test data. The x-axis will be the regularization coefficient and the y-axis will be the cross-entropy function value after 100 iterations (it may converge faster than 100). Each plot should contain two curves, and you should make 5 plots, one for each of the five step sizes $\eta_i$ [5 points each].

| $\lambda_i$ (with regularization, $\eta = 0.01$) | # of epochs (iterations) until convergence | L2 norm of weights | Accuracy score on training data |
|---|---|---|---|
| 0 | | | |
| 0.05 | | | |
| 0.1 | | | |
| 0.15 | | | |
| 0.2 | | | |
| 0.25 | | | |
| 0.3 | | | |
| 0.35 | | | |
| 0.4 | | | |
| 0.45 | | | |
| 0.5 | | | |

Analysis and Comparison of Gradient Descent for Logistic Regression

(Q4) [10 points] Briefly (in no more than 4 sentences) explain your results from (Q2) and (Q3). You should discuss the rate of convergence as a function of $\eta$, the change in magnitude of $w$ as a function of $\lambda$, the value of the cross-entropy function for different values of $\lambda$ and $\eta$, and any other interesting trends you have observed.
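The per-sample update that the first question asks about, and the training loop described in (Q2) and (Q3), can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not an official solution. It assumes the regularized per-sample objective is the cross-entropy plus $(\lambda/2)\lVert w \rVert^2$ with the bias left unpenalized, which gives $w^{t+1} = w^t - \eta\,[(\hat{y}_i - y_i)\,x_i + \lambda w^t]$ and $b^{t+1} = b^t - \eta\,(\hat{y}_i - y_i)$, where $\hat{y}_i = \sigma(w^{t\top} x_i + b^t)$; setting $\lambda = 0$ recovers the unregularized case. The function names, the reading of "iteration" as one shuffled pass over the data, and the stopping tolerance `tol` are all hypothetical choices, since the assignment does not fix a convergence criterion.

```python
import numpy as np


def sigmoid(z):
    # Logistic function written via tanh, which avoids overflow for large |z|.
    return 0.5 * (1.0 + np.tanh(0.5 * np.asarray(z, dtype=float)))


def cross_entropy(X, y, w, b):
    # Mean cross-entropy; log(1 + exp(z)) is computed stably with logaddexp.
    z = X @ w + b
    return float(np.mean(np.logaddexp(0.0, z) - y * z))


def accuracy(X, y, w, b):
    # Fraction of points whose thresholded prediction matches the 0/1 label.
    return float(np.mean((sigmoid(X @ w + b) >= 0.5) == (y == 1)))


def sgd_logistic(X, y, eta, lam=0.0, epochs=100, tol=1e-4, seed=0):
    """SGD for (optionally L2-regularized) logistic regression.

    Assumptions (not specified in the assignment): one epoch is one
    shuffled pass over the data, and training stops once the training
    cross-entropy changes by less than `tol` between consecutive epochs.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    losses = []
    for epoch in range(epochs):
        for i in rng.permutation(n):
            err = sigmoid(X[i] @ w + b) - y[i]     # y_hat_i - y_i
            w -= eta * (err * X[i] + lam * w)      # lam = 0: unregularized
            b -= eta * err                         # bias is not penalized
        losses.append(cross_entropy(X, y, w, b))
        if epoch > 0 and abs(losses[-2] - losses[-1]) < tol:
            break                                  # assumed stopping rule
    return w, b, losses
```

With this sketch, `len(losses)` gives the number of epochs run, `np.linalg.norm(w)` the L2 norm of the weights, and `accuracy(X, y, w, b)` the training accuracy, so one call per $\eta$ (or per $\lambda$) would fill one row of the tables above; `losses` is the curve to plot against the epoch index.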

Posted Date: Dec 16, 2023 12:03 AM
