You're given a dataset (X_1, Y_1), ..., (X_n, Y_n) of i.i.d. pairs with unknown distribution, but you do know that X and Y are positively (but not perfectly) correlated. In linear regression, we fit a line y = a + bx and find the (a, b) that minimizes some loss function. The usual choice of loss function is the mean squared vertical distance from the line of best fit, i.e. the mean of (Y_i - a - bX_i)^2; let the resulting parameters be (a_1, b_1). Another possible choice of loss function is the mean squared horizontal distance from the line of best fit, i.e. the mean of (X_i - (Y_i - a)/b)^2; let the resulting parameters be (a_2, b_2).

3(a) Suppose we observe the data points (1,1), (2,5), (3,4), (4,7), (5,9). What are the values of b_1 and b_2 for this dataset? (Hint: for b_2, consider performing LSE on (Y_1, X_1), ..., (Y_n, X_n); the slope obtained here is the reciprocal of b_2.)
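The arithmetic for part 3(a) can be checked with a short script. This is a minimal sketch, not an official solution: it assumes the standard closed-form least-squares slope S_xy / S_xx for the vertical loss and, following the hint, takes b_2 as the reciprocal of the slope S_xy / S_yy obtained by regressing X on Y. The function name `slopes` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def slopes(points):
    """Return (b_1, b_2) for a list of (x, y) pairs.

    b_1: slope minimizing squared vertical distances (regress Y on X).
    b_2: slope minimizing squared horizontal distances, computed as the
         reciprocal of the slope from regressing X on Y (per the hint).
    """
    n = len(points)
    xs = [Fraction(x) for x, _ in points]
    ys = [Fraction(y) for _, y in points]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Centered sums of squares and cross-products.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    b1 = s_xy / s_xx  # usual least-squares slope
    b2 = s_yy / s_xy  # reciprocal of the X-on-Y slope s_xy / s_yy
    return b1, b2


data = [(1, 1), (2, 5), (3, 4), (4, 7), (5, 9)]
b1, b2 = slopes(data)
print(b1, b2)  # 9/5 92/45
```

For this dataset S_xx = 10, S_yy = 36.8 and S_xy = 18, so b_1 = 18/10 = 1.8 and b_2 = 36.8/18 = 92/45, roughly 2.044. The horizontal-distance slope comes out steeper than the vertical-distance one, as expected when the correlation is positive but not perfect.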
