Question: I have the first algorithm, but I need help with the gradient descent. Please help and explain.

In this problem, we will compare the performance of three different types of algorithms on a synthetic training set. First, to generate the training set, pick a weight vector w ∈ R^10 of dimension 10 at random, normalized so that its Euclidean norm ||w|| = sqrt(Σ_i w_i^2) equals 1; i.e., pick w at random (e.g., each entry from N(0, 1)) and then take w/||w||. Then generate a training set of size m of the form {(x^1, y^1), ..., (x^m, y^m)}, where each x^i ∈ R^10 is a random vector of dimension 10 whose entries are each drawn from a Gaussian N(0, 1) (you may use built-in methods for this). The label y^i is 0 or 1 and should be randomly chosen such that y^i = 1 with probability exactly σ(w · x^i), where σ is the sigmoid function (aka the standard logistic function), and y^i = 0 otherwise. The goal is to learn w.

Algorithm 2 is gradient descent, where you train a model of the form σ(w' · x) (with parameter w') with respect to square loss; i.e., the loss function is (σ(w' · x) − y)^2, averaged over the points in the training set (code this up yourself, including calculating the gradient). Algorithm 3 is stochastic gradient descent, again with respect to square loss, where during each iteration we use the gradient at one random point from the training set.
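The data-generation step described above can be sketched in NumPy as follows. This is a minimal sketch, not a definitive implementation; the function name `make_dataset` and the use of `np.random.default_rng` are my own choices, not part of the problem statement.

```python
import numpy as np

def sigmoid(z):
    """Standard logistic function sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def make_dataset(m, d=10, rng=None):
    """Generate the synthetic training set: a random unit vector w,
    m Gaussian feature vectors, and labels y_i = 1 w.p. sigma(w . x_i).
    (Function and parameter names are illustrative, not from the problem.)"""
    rng = np.random.default_rng(rng)
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)            # normalize so ||w|| = 1
    X = rng.standard_normal((m, d))   # each entry drawn from N(0, 1)
    # y_i = 1 with probability exactly sigmoid(w . x_i), else 0
    y = (rng.random(m) < sigmoid(X @ w)).astype(float)
    return w, X, y
```

The comparison `rng.random(m) < sigmoid(X @ w)` is the standard trick for sampling a Bernoulli variable with a given success probability per example.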

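For the gradient you asked about: with average square loss L(w') = (1/m) Σ_i (σ(w' · x^i) − y^i)^2, the chain rule (using σ'(z) = σ(z)(1 − σ(z))) gives ∇L(w') = (2/m) Σ_i (σ(w' · x^i) − y^i) · σ(w' · x^i) · (1 − σ(w' · x^i)) · x^i. Algorithms 2 and 3 then differ only in whether each step uses the full-batch gradient or the gradient at one random training point. A hedged sketch (my own function names, learning rates, and iteration counts, chosen for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def square_loss(wp, X, y):
    """Average square loss (sigma(w'.x) - y)^2 over the training set."""
    return np.mean((sigmoid(X @ wp) - y) ** 2)

def grad_square_loss(wp, X, y):
    """Gradient of the average square loss, via the chain rule:
    (2/m) sum_i (p_i - y_i) * p_i * (1 - p_i) * x_i, with p_i = sigma(w'.x_i)."""
    p = sigmoid(X @ wp)
    return X.T @ (2.0 * (p - y) * p * (1.0 - p)) / len(y)

def gradient_descent(X, y, lr=0.5, iters=2000):
    """Algorithm 2: full-batch gradient descent on the square loss."""
    wp = np.zeros(X.shape[1])
    for _ in range(iters):
        wp -= lr * grad_square_loss(wp, X, y)
    return wp

def sgd(X, y, lr=0.5, iters=20000, rng=None):
    """Algorithm 3: each step uses the gradient at one random point."""
    rng = np.random.default_rng(rng)
    wp = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iters):
        i = rng.integers(m)
        wp -= lr * grad_square_loss(wp, X[i:i+1], y[i:i+1])
    return wp
```

Note that the extra σ·(1 − σ) factor (absent from the usual logistic-loss gradient) shrinks the gradient when predictions saturate near 0 or 1, which is why square loss on a sigmoid can converge more slowly than cross-entropy; the learning rates and iteration counts above are assumptions you would tune for the comparison.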