write down the updating equation in SGD for w and b, for both unregularized logistic regression...
Question:
Write down the updating equation in SGD for $w$ and $b$, for both unregularized logistic regression ([15 points]) and regularized logistic regression ([5 points]). In particular, at iteration $t$ using one random data point $(x_i, y_i)$, where $x_i = [x_{i1}, \ldots, x_{id}]$ and $y_i \in \{0, 1\}$ is the label, how do we compute $w^{t+1}$ and $b^{t+1}$ from $w^t$ and $b^t$?

(Q2) [20 points] For step sizes $\eta = \{0.001, 0.01, 0.05, 0.1, 0.5\}$ and without regularization, implement Stochastic Gradient Descent (without cross-validation).
a. [5 points] Report the number of iterations (epochs) needed until convergence for every step size $\eta_i$ (fill in the table below in your report).
b. [5 points] Report the L2 norm of vector $w$ after 100 iterations (it may converge faster) for each step size $\eta_i$.
c. [5 points] Report the accuracy score in prediction of the training data after 100 iterations (it may converge faster) for each step size $\eta_i$.
d. [5 points] Plot the cross-entropy function value with respect to the number of steps ($T = [1, \ldots, 100]$) for the training data for each step size. Remember: it may converge before 100 iterations.

Fill in the table below in your report.

$\eta$ (no regularization) | # of epochs (iterations) till convergence | L2 norm of weights | Accuracy score on training data
0.001 | | |
0.01 | | |
0.05 | | |
0.1 | | |
0.5 | | |

(Q3) [45 points] For step sizes $\eta = \{0.001, 0.01, 0.05, 0.1, 0.5\}$ and with regularization coefficients $\lambda = \{0, 0.05, 0.1, 0.15, \ldots, 0.5\}$, do the following:
a. [5 points] Given $\lambda = 0.1$, plot the cross-entropy function value with respect to the number of steps ($T = [1, \ldots, 100]$) for the training data for each step size $\eta_i$. Remember: it may converge before 100 iterations.
b. [5 points] Given $\eta = 0.01$, report the number of iterations (epochs) needed until convergence for every regularization coefficient $\lambda_i$ (fill in the table below in your report).
c. [5 points] Given $\eta = 0.01$, report the L2 norm of vector $w$ after convergence for each regularization coefficient $\lambda_i$ (fill in the table below).
d. [5 points] Given $\eta = 0.01$, report the accuracy score in prediction of the training data after 100 iterations (it may converge faster) for every regularization coefficient $\lambda_i$ (fill in the table below).
e. [25 points] Plot the cross-entropy function value at $T = 100$ (or at convergence) for different regularization coefficients $\lambda_i$, for both the training and test data. The x-axis will be the regularization coefficient and the y-axis will be the cross-entropy function value after 100 iterations (it may converge faster than 100). Each plot should contain two curves, and you should make 5 plots (five different step sizes $\eta$) [each 5 points].

$\lambda_i$ (with regularization, $\eta = 0.01$) | # of epochs (iterations) till convergence | L2 norm of weights | Accuracy score on training data
0 | | |
0.05 | | |
0.1 | | |
0.15 | | |
0.2 | | |
0.25 | | |
0.3 | | |
0.35 | | |
0.4 | | |
0.45 | | |
0.5 | | |

Analysis and Comparison of Gradient Descent for Logistic Regression

(Q4) [10 points] Briefly (in no more than 4 sentences) explain your results from (Q2) and (Q3). You should discuss the rate of convergence as a function of $\eta$, the change in magnitude of $w$ as a function of $\lambda$, the value of the cross-entropy function for different values of $\lambda$ and $\eta$, and any other interesting trends you have observed.
Expert Answer:
Updating Equation in SGD for Unregularized Logistic Regression. Given one random data point (x, y) where ...
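Since the answer above is cut off, here is a minimal sketch of what the standard updates usually look like; it is not the posted answer. Assuming the per-sample cross-entropy loss with prediction $p = \sigma(w^t \cdot x + b^t)$, the unregularized step is $w^{t+1} = w^t - \eta (p - y) x$ and $b^{t+1} = b^t - \eta (p - y)$. Assuming further that the regularizer is $\frac{\lambda}{2} \lVert w \rVert^2$ applied to $w$ only (the question does not state its exact form), the weight step becomes $w^{t+1} = w^t - \eta [(p - y) x + \lambda w^t]$ while the bias step is unchanged. The function names, the convergence tolerance `tol`, and the stopping rule below are illustrative assumptions, not part of the assignment.

```python
import numpy as np


def sigmoid(z):
    # Logistic function; clipping keeps np.exp from overflowing.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))


def sgd_step(w, b, x, y, eta, lam=0.0):
    """One SGD update from a single data point (x, y).

    lam=0.0 gives the unregularized update; lam>0 adds the gradient
    of the assumed penalty (lam / 2) * ||w||^2 (bias is not penalized).
    """
    err = sigmoid(np.dot(w, x) + b) - y      # p - y
    w_new = w - eta * (err * x + lam * w)
    b_new = b - eta * err
    return w_new, b_new


def cross_entropy(w, b, X, y, lam=0.0):
    # Mean cross-entropy over the data plus the assumed L2 penalty.
    p = np.clip(sigmoid(X @ w + b), 1e-12, 1 - 1e-12)
    data_loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    return data_loss + 0.5 * lam * np.dot(w, w)


def train(X, y, eta, lam=0.0, epochs=100, tol=1e-4, seed=0):
    """Run SGD for up to `epochs` passes over the data.

    Stops early when the loss changes by less than `tol` between
    epochs (a hypothetical convergence rule chosen for illustration).
    """
    rng = np.random.default_rng(seed)
    w, b = np.zeros(X.shape[1]), 0.0
    losses = []
    for epoch in range(epochs):
        for i in rng.permutation(len(y)):    # one random point per step
            w, b = sgd_step(w, b, X[i], y[i], eta, lam)
        losses.append(cross_entropy(w, b, X, y, lam))
        if epoch > 0 and abs(losses[-2] - losses[-1]) < tol:
            break
    return w, b, losses
```

With helpers like these, the quantities the tables ask for could be read off per run: `len(losses)` as the epoch count, `np.linalg.norm(w)` as the L2 norm, and the mean of `(sigmoid(X @ w + b) >= 0.5) == y` as the training accuracy.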