Question:

3. (10 points) Recall from calculus that, given some function $g(x)$, the $x$ you get from solving $\frac{d}{dx} g(x) = 0$ is called a critical point of $g$. This means it could be a minimizer or a maximizer for $g$. In this question, we will explore some basic properties and build some intuition on why, for certain loss functions such as squared L2 loss, the critical point of the empirical risk function (defined as the average loss on the observed data) will always be the minimizer.

Given some linear model $f(x) = \theta x$ for some real scalar $\theta$, we can write the empirical risk of the model $f$ given the observed data $\{x_i, y_i\}$, for $i \in \{1, \dots, n\}$, as the average L2 loss, also known as Mean Squared Error (MSE):

$$\frac{1}{n} \sum_{i=1}^{n} (y_i - f(x_i))^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \theta x_i)^2$$

(c) (3 points) Now that we have shown that each term in the summation of the MSE is a convex function, one might wonder if the entire summation is convex, given that it is a sum of convex functions. Let's look at the formal definition of a convex function. Algebraically speaking, a function $g(\theta)$ is convex if, for any two points $(\theta_i, g(\theta_i))$ and $(\theta_j, g(\theta_j))$ on the function,

$$g(c \times \theta_i + (1 - c) \times \theta_j) \le c \times g(\theta_i) + (1 - c) \times g(\theta_j)$$

for any real constant $0 \le c \le 1$. Intuitively, the above definition says that, given the plot of a convex function $g(\theta)$, if you connect 2 randomly chosen points on the function, the line segment will always lie on or above $g(\theta)$ (try this with the graph of $g(\theta) = \theta^2$).

i. (2 points) Using the definition above, show that if $g(\theta)$ and $h(\theta)$ are both convex functions, their sum $g(\theta) + h(\theta)$ will also be a convex function.

ii. (1 point) Based on what you have shown in the previous part, explain intuitively why a (finite) sum of n convex functions is still a convex function when n > 2.

(d) (2 points) Finally, explain why, in our case, when we solve for the critical point of the MSE by taking the gradient with respect to the parameter and setting the expression to 0, it is guaranteed that the solution we find will minimize the MSE.
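As a sanity check on the claims in parts (c) and (d), the convexity inequality and the critical point can be tried numerically. The sketch below is illustrative only and is not a proof or part of the original question: the data values and the helper names `mse`, `critical_point`, and `convexity_holds` are made up for this example. The closed form used for the critical point comes from setting the derivative of the MSE with respect to $\theta$ to 0, which gives $\theta = \sum_i x_i y_i / \sum_i x_i^2$ (assuming at least one $x_i$ is nonzero).

```python
import random


def mse(theta, xs, ys):
    """Empirical risk (1/n) * sum_i (y_i - theta * x_i)^2 for f(x) = theta * x."""
    return sum((y - theta * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def critical_point(xs, ys):
    """Solve d(MSE)/d(theta) = 0: theta = sum(x_i * y_i) / sum(x_i^2)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def convexity_holds(g, a, b, c, tol=1e-9):
    """Check g(c*a + (1-c)*b) <= c*g(a) + (1-c)*g(b), up to float tolerance."""
    return g(c * a + (1 - c) * b) <= c * g(a) + (1 - c) * g(b) + tol


# Hypothetical observations, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def risk(theta):
    return mse(theta, xs, ys)


# Test the convexity inequality on many random (theta_i, theta_j, c) triples.
rng = random.Random(0)
all_convex = all(
    convexity_holds(risk, rng.uniform(-10, 10), rng.uniform(-10, 10), rng.random())
    for _ in range(1000)
)

# The critical point should give a risk no larger than nearby values of theta.
theta_star = critical_point(xs, ys)
is_minimizer = all(
    risk(theta_star) <= risk(theta_star + delta)
    for delta in (-1.0, -0.1, 0.1, 1.0)
)

print(all_convex, is_minimizer)
```

A random check like this can only fail to find a counterexample; the algebraic argument asked for in parts (c) and (d) is still needed to establish the result for every pair of points.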
