

Question:

Curve Fitting with Polynomials as a Pattern Learning problem

•In the polynomial basis, the function can be given as a linear combination of the polynomial basis functions:

$y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M$

•$M$ is the order of the polynomial, and the coefficients $w_0, w_1, w_2, \dots, w_M$ are collectively denoted by the vector $\mathbf{w}$, called the weight vector.

$y(x, \mathbf{w}) = \mathbf{w}^T \mathbf{x}$, where $\mathbf{x}^T = (1, x, x^2, \dots, x^M)$ is the augmented vector.

•Functions that are linear in the weight vector, such as this linear combination of polynomial basis functions, have important properties and are called Linear Models.
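
As a minimal sketch of the model above, the snippet below builds the augmented vector (1, x, x^2, ..., x^M) and evaluates y(x, w) as its dot product with the weights. The function names `augment` and `predict` and the sample weights are illustrative assumptions, not part of the lecture.

```python
def augment(x, M):
    """Return the augmented vector (1, x, x^2, ..., x^M) for a scalar x."""
    return [x ** j for j in range(M + 1)]


def predict(x, w):
    """Evaluate y(x, w) = w^T x_aug; the order M is len(w) - 1."""
    phi = augment(x, len(w) - 1)
    return sum(w_j * phi_j for w_j, phi_j in zip(w, phi))


# Hypothetical weights for y = 1 + 2x + 3x^2.
w = [1.0, 2.0, 3.0]
print(augment(2.0, 2))  # [1.0, 2.0, 4.0]
print(predict(2.0, w))  # 17.0
```

Note that the model is nonlinear in x but linear in w: doubling every weight doubles the prediction.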

Linear Model Determination

•The question now is: how do we determine the weight vector for this Linear Model?

•The weight vector w can be determined by finding the value that minimizes the error between the function y(x, w) and the target values t of the training data points.

•One simple choice of the error function, which is widely used, is the sum of squares of the errors between the predictions y(x_n, w) for each data point x_n and the corresponding training values t_n.

• The error function that is minimized is given by:

$E(\mathbf{w}) = \sum_n \{ y(x_n, \mathbf{w}) - t_n \}^2$
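
The error function can be sketched in a few lines. The helper names and the three-point training set below are hypothetical, chosen only so that the sums are easy to check by hand.

```python
def predict(x, w):
    """Polynomial y(x, w) = w_0 + w_1 x + ... + w_M x^M."""
    return sum(w_j * x ** j for j, w_j in enumerate(w))


def sum_squared_error(w, xs, ts):
    """E(w) = sum over n of (y(x_n, w) - t_n)^2."""
    return sum((predict(x_n, w) - t_n) ** 2 for x_n, t_n in zip(xs, ts))


# Hypothetical training set lying exactly on t = 2x.
xs = [0.0, 1.0, 2.0]
ts = [0.0, 2.0, 4.0]
print(sum_squared_error([0.0, 2.0], xs, ts))  # 0.0: the line passes through every point
print(sum_squared_error([0.0, 1.0], xs, ts))  # 5.0 = 0^2 + 1^2 + 2^2
```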

Weight Vector Determination

•The sum-of-squares error function

$E(\mathbf{w}) = \sum_n \{ y(x_n, \mathbf{w}) - t_n \}^2$

is a nonnegative quantity that is zero if and only if the function y(x, w) passes exactly through every data point.

•In general, the minimum is found at the value of w where the partial derivatives of E(w) with respect to all of the weights are equal to zero.

•Because E(w) is a quadratic function of w, its partial derivatives with respect to the individual weights are linear in w. Setting them to zero gives a set of linear equations whose solution is denoted w*.

•The optimal function that fits the data is therefore y(x, w*).

•There remains the problem of choosing the dimension M of the subspace generated by the polynomial basis, which is an important problem called model selection.
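
The steps above can be sketched as follows. Setting each partial derivative to zero gives the linear system $\sum_j \left( \sum_n x_n^{i+j} \right) w_j = \sum_n x_n^i t_n$ for $i = 0, \dots, M$. The function name `fit_polynomial`, the singularity threshold, and the sample data are assumptions made for illustration; this is one possible way to solve the system, not the lecture's own code.

```python
def fit_polynomial(xs, ts, M):
    """Least-squares weights w* for an order-M polynomial.

    Builds the normal equations A w = b with
    A[i][j] = sum_n x_n^(i+j) and b[i] = sum_n x_n^i * t_n,
    then solves them by Gaussian elimination with partial pivoting.
    """
    size = M + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(x ** i * t for x, t in zip(xs, ts)) for i in range(size)]

    # Forward elimination on the augmented matrix [A | b].
    aug = [row + [rhs] for row, rhs in zip(A, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # assumed tolerance for "singular"
            raise ValueError("singular system: need more distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]

    # Back substitution.
    w = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(aug[i][j] * w[j] for j in range(i + 1, size))
        w[i] = (aug[i][size] - tail) / aug[i][i]
    return w


# Hypothetical data generated from t = 1 + 2x, so a straight-line fit
# should recover weights close to [1, 2].
xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]
w_star = fit_polynomial(xs, ts, M=1)
print([round(w_j, 6) for w_j in w_star])
```

In practice a library routine such as NumPy's `numpy.linalg.lstsq` is the usual choice; the hand-written elimination only keeps the sketch free of dependencies.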

Above is the lecture from the PowerPoint that was given.

Assume the following patterns are given:

x_1 = (0, 0)^T, x_2 = (1, 0.8)^T, x_3 = (2, 2.2)^T, x_4 = (3, 2.8)^T, x_5 = (4, 4.5)^T

(a) Find the best-fitting straight line to the data that minimizes the mean square error. (Derive the equations.)
(b) Find the best-fitting polynomial of degree 2 to the data that minimizes the mean square error. (Derive the equations.)
(c) What happens to (b) if the squared error is regularized using the norm squared of x?
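
A worked sketch of the equations the question asks for, under two assumptions: each pattern is read as a pair $(x_n, t_n)$, and the fifth pattern is read as $(4, 4.5)$. Setting $\partial E / \partial w_i = 0$ for the lecture's error function gives the systems below. The sums were computed by hand from the five patterns and are worth re-checking.

```latex
% Sums over the five patterns (x_n, t_n), assuming the fifth is (4, 4.5):
%   N = 5,  \sum x = 10,  \sum x^2 = 30,  \sum x^3 = 100,  \sum x^4 = 354,
%   \sum t = 10.3,  \sum x t = 31.6,  \sum x^2 t = 106.8

% Stationarity condition for each weight w_i:
\frac{\partial E}{\partial w_i}
  = 2 \sum_n \{ y(x_n, \mathbf{w}) - t_n \}\, x_n^i = 0

% (a) Straight line, y = w_0 + w_1 x:
\begin{aligned}
 5 w_0 + 10 w_1 &= 10.3 \\
10 w_0 + 30 w_1 &= 31.6
\end{aligned}
\quad\Rightarrow\quad
w_0^* = -0.14, \qquad w_1^* = 1.1

% (b) Degree-2 polynomial, y = w_0 + w_1 x + w_2 x^2:
\begin{aligned}
 5 w_0 +  10 w_1 +  30 w_2 &= 10.3 \\
10 w_0 +  30 w_1 + 100 w_2 &= 31.6 \\
30 w_0 + 100 w_1 + 354 w_2 &= 106.8
\end{aligned}
\quad\Rightarrow\quad
w_0^* = \tfrac{1}{350} \approx 0.0029, \quad
w_1^* = \tfrac{57}{70} \approx 0.8143, \quad
w_2^* = \tfrac{1}{14} \approx 0.0714
```

Dividing the error by the number of patterns to obtain a mean does not change the minimizing weights in (a) or (b). For (c), a penalty that depends only on x is constant in w and leaves w* unchanged. If the penalty is instead read as $\lambda \lVert \mathbf{w} \rVert^2$ (an assumption; the question as written says the norm of x), the same derivation adds $\lambda$ to each diagonal coefficient of the system in (b), giving $5 + \lambda$, $30 + \lambda$ and $354 + \lambda$, which shrinks the weights toward zero.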