Question: Hi! I really need help with this machine learning question. Any fast help would be appreciated, on any of the parts. We derived a cost

Hi! I really need help with this machine learning question.

Any fast help would be appreciated, on any of the parts.

We derived a cost function for regression problems by assuming that sample points and their labels arise from the following process, and applying maximum likelihood estimation (MLE). Sample points come from an unknown distribution, $X_i \sim D$. Labels $y_i$ are the sum of a deterministic function $g$ plus random noise: $\forall i,\ y_i = g(X_i) + \epsilon_i$, where $\epsilon_i \sim N(0, \sigma^2)$.

For this problem, we will assume that $\epsilon_i \sim N(0, \sigma_i^2)$ (that is, the variance of the noise is different for each sample point), and we will examine how our cost function changes as a result. We assume that (magically) we know the value of each $\sigma_i^2$. You are given an $n \times d$ design matrix $X$, an $n$-vector $y$ of labels, such that the label $y_i$ of sample point $X_i$ is generated as described above, and a list of the noise variances $\sigma_i^2$.

(a) [8 pts] Apply MLE to derive the optimization problem that will yield the maximum likelihood estimate of the distribution parameter $g$. (Note: $g$ is a function, but we can still treat it as the parameter of an optimization problem.) Express your cost function as a summation of loss functions (where you decide what the loss function is), one per sample point.

(b) [4 pts] We decide to do linear regression, so we parameterize $g(X_i)$ as $g(X_i) = w \cdot X_i$, where $w$ is a $d$-vector of weights. Write an equivalent optimization problem where your optimization variable is $w$ and the cost function is a function of $X$, $y$, $w$, and the variances $\sigma_i^2$. Find a way to express your cost function in matrix notation, with no summations. (This may entail defining a new matrix.)

(c) [4 pts] Write the solution to your optimization problem as the solution of a linear system of equations. (Again, in matrix notation, with no summations.)

(d) [2 pts] Does your solution resemble that of a similar method you know? What is its name?

(e) [2 pts] Compare your solution to the case in which we assume that every sample point has the same noise distribution. In simple terms, how does the amount of noise affect the optimization, and why does this seem like the intuitively right thing to do? Answer in 3 sentences or fewer.
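For anyone who wants to sanity-check their answers numerically, here is a rough sketch in Python with NumPy. It rests on an assumption that is not given in the question: that the per-point loss from the MLE works out to the squared residual divided by that point's noise variance, so the matrix form uses a diagonal matrix Omega = diag(1 / sigma_i^2) and the weights solve (X^T Omega X) w = X^T Omega y (the method usually called weighted least squares). The function names and the synthetic data are made up for illustration.

```python
import numpy as np


def weighted_least_squares(X, y, variances):
    """Solve (X^T Omega X) w = X^T Omega y, with Omega = diag(1 / sigma_i^2).

    Assumed form of the solution, not taken from the question text.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    inv_var = 1.0 / np.asarray(variances, dtype=float)
    # Scale the columns of X^T by 1 / sigma_i^2; this equals X^T Omega
    # without building the n x n diagonal matrix explicitly.
    Xt_omega = X.T * inv_var
    return np.linalg.solve(Xt_omega @ X, Xt_omega @ y)


def weighted_cost(X, y, variances, w):
    """Sum over sample points of (y_i - w . X_i)^2 / sigma_i^2."""
    residuals = y - X @ w
    return np.sum(residuals**2 / variances)


# Synthetic data (hypothetical): each point has its own known noise variance.
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
variances = rng.uniform(0.1, 4.0, size=n)
y = X @ w_true + rng.normal(size=n) * np.sqrt(variances)

w_hat = weighted_least_squares(X, y, variances)   # uses the variances
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]      # ignores the variances

print("weighted estimate:", w_hat)
print("ordinary estimate:", w_ols)
```

If every variance is set to the same value, the weights cancel and the result should match ordinary least squares, which is one way to check part (e). Points with large variance get small weights, so they pull on the fit less.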
