Question:

Problem 2. Variations of the Mean Squared Error

Suppose you have a data set $y_1, y_2, \ldots, y_n$ with at least three values, $n \geq 3$, and the values are arranged such that $y_1 \leq y_2 \leq \cdots \leq y_n$. We know from class that the mean of the data minimizes mean squared error,

$$R_{sq}(h) = \frac{1}{n} \sum_{i=1}^{n} (h - y_i)^2.$$

In this problem, we'll consider some variations of this risk function.

a) Define a new function that considers only the two extreme points:

$$E(h) = (h - y_1)^2 + (h - y_n)^2.$$

What value of $h$ minimizes $E(h)$? We'll call the value of $h$ that minimizes $E(h)$ the extreme mean, since it's based on the extreme data values.

b) Define a new function that weights larger data points less heavily:

$$S(h) = \left( \sum_{i=1}^{n-2} (h - y_i)^2 \right) + 0.5 \cdot (h - y_{n-1})^2 + 0.1 \cdot (h - y_n)^2.$$

What value of $h$ minimizes $S(h)$? We'll call the value of $h$ that minimizes $S(h)$ the sloped mean, since the coefficients of the data values decrease for larger data.

c) Which do you think is a better hypothesis, the mean or the sloped mean? Is your answer always the same, or does it depend on some property of the data set? Give an example of when you might prefer to use the sloped mean, and when you might prefer the (regular) mean.

d) Find a function $P(h)$, a variant of the mean squared error $R_{sq}(h)$, such that $P(h)$ is minimized at

$$h = \frac{0.7 \cdot y_1 + 0.8 \cdot y_2 + \sum_{i=3}^{n} y_i}{n - 0.5}.$$

Hint: Look closely at the work you did in part (b).
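
One way to sanity-check answers to parts (a), (b), and (d) is to treat each function as a weighted sum of squared errors. For positive weights $w_i$ that sum to a positive number, setting the derivative of $\sum_i w_i (h - y_i)^2$ to zero gives the weighted mean $h = \sum_i w_i y_i / \sum_i w_i$. The Python sketch below compares that closed form against a brute-force grid search. The data values and the helper names (`weighted_minimizer`, `weighted_risk`, `grid_minimizer`) are made up for illustration and are not part of the problem, and the weights used for part (d) are one candidate choice, not necessarily the intended answer.

```python
def weighted_minimizer(ys, ws):
    """Closed-form minimizer of sum(w * (h - y)**2): the weighted mean."""
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)


def weighted_risk(h, ys, ws):
    """Weighted sum of squared errors for a constant prediction h."""
    return sum(w * (h - y) ** 2 for w, y in zip(ws, ys))


def grid_minimizer(ys, ws, steps=20_000):
    """Brute-force check: best h on an evenly spaced grid over [min(ys), max(ys)]."""
    lo, hi = min(ys), max(ys)
    grid = (lo + (hi - lo) * k / steps for k in range(steps + 1))
    return min(grid, key=lambda h: weighted_risk(h, ys, ws))


# Hypothetical sorted data set with n = 5 (assumption, not from the problem).
ys = [1.0, 2.0, 4.0, 7.0, 20.0]
n = len(ys)

# E(h): only the two extreme points carry weight.
w_extreme = [1.0] + [0.0] * (n - 2) + [1.0]
# S(h): weight 1 on the first n-2 points, then 0.5 and 0.1 on the two largest.
w_sloped = [1.0] * (n - 2) + [0.5, 0.1]
# Candidate P(h): weights 0.7 and 0.8 on the two smallest points, 1 elsewhere.
w_part_d = [0.7, 0.8] + [1.0] * (n - 2)

extreme_mean = weighted_minimizer(ys, w_extreme)
sloped_mean = weighted_minimizer(ys, w_sloped)
part_d_min = weighted_minimizer(ys, w_part_d)

for name, ws, h_star in [
    ("extreme mean", w_extreme, extreme_mean),
    ("sloped mean", w_sloped, sloped_mean),
    ("part (d) minimizer", w_part_d, part_d_min),
]:
    print(name, h_star, "grid search:", grid_minimizer(ys, ws))
```

On this sample the sloped mean lands well below the regular mean because the largest value, 20, receives weight 0.1. That sensitivity to large values is the property part (c) asks you to reason about.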
