(a) Consider minimizing the function f(w) = w² - 5w + 3 using gradient descent (graph of f shown below)...
Question:
Transcribed Image Text:
(a) Consider minimizing the function f(w) = w² - 5w + 3 using gradient descent (graph of f shown below). Say the current value of w is equal to 5. What is the derivative of f at this point? (Show your work.) If our step size is α = -0.1, what is the new value of w after this single gradient descent update?

(b) Say we are partway through SGD for a linear regression problem with p = 1. The current weights are w₀ and w₁ = - [values not legible in the transcribed image]. Next we are analyzing (xᵢ, yᵢ) = (-1, 2). If α = 0.1, what are the new weights after considering this point? Show work for full credit. Explain your result geometrically (include a sketch of the model and a discussion of cost).

(c) Assuming n training examples, p features, and T iterations needed for convergence, how long does it take to compute the stochastic gradient descent solution (i.e. w) for linear regression? Answer using big-O notation and briefly explain your reasoning for full credit.

In this question we will analyze SGD for a simple (meaning p = 1) linear regression problem. We have n = 2, where (x₁, y₁) = (2, 1) and (x₂, y₂) = (1, -1) (plotted below), and wish to fit a linear model to this data. (Note the different scales on x and y below.)

[Plot: the two data points (x₁, y₁) = (2, 1) and (x₂, y₂) = (1, -1)]

(d) Before we begin SGD, we will set w₀ = w₁ = 0. At this point, what is the numerical value of the cost function for linear regression? Our cost function for linear regression is: J(w) = [constant factor not legible in the transcribed image] · Σ (h_w(xᵢ) - yᵢ)², summed over i = 1, ..., n.

(e) For SGD we will use α = 1 (learning rate). Using (x₁, y₁), compute the SGD updates (show your work) to find new values for w₀ and w₁. Use these new values to draw the current linear model on the plot above and label it M₁ (model 1).
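For reference, the kind of single-example SGD update that parts (b) and (e) ask about can be sketched in Python. This is only an illustration under stated assumptions: the model is taken to be h_w(x) = w₀ + w₁x, and the per-example loss is assumed to be ½(h_w(x) - y)², which gives the update w_j ← w_j - α(h_w(x) - y)x_j. The constant factor in J(w) is not legible in the transcription, so a different normalization would rescale the step. The function names `predict` and `sgd_step` are hypothetical.

```python
def predict(w0, w1, x):
    """Linear model with one feature: h_w(x) = w0 + w1 * x."""
    return w0 + w1 * x


def sgd_step(w0, w1, x, y, alpha):
    """One SGD update on a single example (x, y).

    ASSUMPTION: per-example loss 0.5 * (h_w(x) - y)^2, so the gradient is
    (h_w(x) - y) for w0 and (h_w(x) - y) * x for w1.
    """
    error = predict(w0, w1, x) - y
    return w0 - alpha * error, w1 - alpha * error * x


# Data from the question: (x1, y1) = (2, 1) and (x2, y2) = (1, -1).
data = [(2.0, 1.0), (1.0, -1.0)]

# Part (d) setup: w0 = w1 = 0. Only the unnormalized sum of squared
# residuals is computed here, because the constant factor in J(w) is unknown.
w0, w1 = 0.0, 0.0
sum_squared_residuals = sum((predict(w0, w1, x) - y) ** 2 for x, y in data)

# Part (e) setup: one update with alpha = 1 using (x1, y1).
new_w0, new_w1 = sgd_step(w0, w1, *data[0], alpha=1.0)

print(sum_squared_residuals)  # 2.0
print(new_w0, new_w1)         # 1.0 2.0
```

Under these assumptions the model after one update would be h_w(x) = 1 + 2x. Each such update touches one example and p + 1 weights, which is the per-step cost that part (c) asks you to reason about.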
Expert Answer:
(a) To find the derivative of the function f(w) = w² - 5w + 3 at the point where w = 5, we differentiate it with respect to w: f′(w) = 2w - 5. Now let's calculate the derivative at w = 5: f′(5) = 2(5) - 5 = 10 - 5 = 5. So the derivative of ...
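The derivative and the single update from part (a) can be checked numerically. The following is a minimal sketch, not the full expert answer (which is truncated above). It assumes the standard rule w_new = w - α·f′(w) and a step size of α = 0.1; the function names are hypothetical.

```python
def f(w):
    """Objective from part (a): f(w) = w^2 - 5w + 3."""
    return w**2 - 5 * w + 3


def f_prime(w):
    """Analytic derivative: f'(w) = 2w - 5."""
    return 2 * w - 5


def gradient_descent_step(w, alpha):
    """One gradient descent update: w_new = w - alpha * f'(w)."""
    return w - alpha * f_prime(w)


w = 5.0
grad = f_prime(w)                             # 2*5 - 5 = 5
# ASSUMPTION: alpha = 0.1 (the transcribed question shows "-0.1").
w_new = gradient_descent_step(w, alpha=0.1)   # 5 - 0.1*5 = 4.5

print(grad, w_new)  # 5.0 4.5
```

With α = 0.1 the update moves w from 5 to 4.5, toward the minimizer at w = 2.5, and f decreases from 3 to 0.75. If the step size is taken literally as -0.1, the same formula gives 5.5, which moves away from the minimum.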