The goal of this problem is to implement the Stochastic Gradient Descent algorithm to build a...

Question:

Transcribed Image Text:

The goal of this problem is to implement the Stochastic Gradient Descent algorithm to build a Latent Factor Recommendation system. We can use it to recommend movies to users. Suppose that we have a matrix $R$ of ratings where the element $R_{i,u}$ is the rating given to item $i$ by user $u$. The size of $R$ is $m \times n$, where $m$ is the number of movies and $n$ is the number of users. Note that most elements of the matrix are unknown/empty, since each user can only rate/view a small proportion of all of the movies. Our goal is to find two matrices $P$ and $Q$ so that $R \approx QP^T$, where $Q$ is $m \times k$ and $P$ is $n \times k$, and $k$ is a parameter of our algorithm.

The error metric we will use is:

$$E = \left( \sum_{i \sim u} \left( R_{i,u} - q_i \cdot p_u^T \right)^2 \right) + \lambda \left( \sum_{u} \|p_u\|_2^2 + \sum_{i} \|q_i\|_2^2 \right)$$

where $i \sim u$ means that we only sum over entries where the user actually rated that item¹, $q_i$ is the $i$th row of $Q$, corresponding to an item, and $p_u$ is the $u$th row of $P$, corresponding to a user, so these are both vectors of size $k$. The regularization parameter is $\lambda$, and $\|\cdot\|_2^2$ is the sum of the squares of the vector entries.

Complete the following steps:

1. If $\varepsilon_{i,u}$ denotes the derivative of $E$ with respect to $R_{i,u}$, then $\varepsilon_{i,u} = 2(R_{i,u} - q_i \cdot p_u^T)$ and the update equations for $q_i$ and $p_u$ in stochastic gradient descent are:

$$q_i \leftarrow q_i + \eta \left( \varepsilon_{i,u} \, p_u - 2\lambda q_i \right)$$

$$p_u \leftarrow p_u + \eta \left( \varepsilon_{i,u} \, q_i - 2\lambda p_u \right)$$

2. Implement the algorithm using the updates described in the previous part. Read each entry of $R$ from disk and update $\varepsilon_{i,u}$, $q_i$, and $p_u$ for each entry.²

3. Set $k = 20$, $\lambda = 1$, and the number of iterations to 40. Find a reasonable value for the learning rate, starting with $\eta = 1$. The error on the training set should be below 70,000 after 40 iterations, and $q_i$ and $p_u$ should have converged.

• If $\eta$ is too large, the error value can converge to something too large or may not monotonically decrease (it can fail to converge).
• If $\eta$ is too small, the error function doesn't have time to decrease within 40 steps.

4. Use the dataset ratings.train.txt included with the assignment, which is formatted as a matrix $R$ as described above. Plot the value of $E$ as a function of the number of iterations for your value of $\eta$.

Hints:

• You might try to initialize $P$ and $Q$ to random values in $[0, \sqrt{5/k}]$ so that $q_i \cdot p_u^T \in [0, 5]$.
• In the update step, $q_i$ and $p_u$ depend on each other. Compute the new values for each depending on all of the old values, and then update both vectors at once.
• $E$ should be computed at the end of the full iteration, not elementwise while the matrices are being updated.

¹ That is, the entries in $R$ that are known.
² This means that you should not store $R$ in memory. Instead, you should read each element sequentially and apply the update equations to each element at each iteration. Thus, each iteration will read the whole file.
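The steps and hints above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the assignment's reference solution. In particular, the file layout (one whitespace-separated `user item rating` triple per line) is an assumption, since the problem only says the file represents $R$. The function names (`read_ratings`, `sgd_epoch`, `total_error`, `train`) are hypothetical, and the factor vectors are created lazily the first time a user or item is seen, which is one design choice among several.

```python
import math
import random


def read_ratings(path):
    """Stream (user, item, rating) triples from disk, one line at a time.

    ASSUMPTION: each line is "user item rating", whitespace-separated.
    The file is never loaded into memory as a whole.
    """
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 3:
                yield parts[0], parts[1], float(parts[2])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def init_factor(k, rng):
    # Entries in [0, sqrt(5/k)] keep the initial q_i . p_u inside [0, 5].
    bound = math.sqrt(5.0 / k)
    return [rng.uniform(0.0, bound) for _ in range(k)]


def sgd_epoch(path, Q, P, k, lam, eta, rng):
    """One full pass over the file, applying the update equations per rating."""
    for user, item, r in read_ratings(path):
        if item not in Q:
            Q[item] = init_factor(k, rng)
        if user not in P:
            P[user] = init_factor(k, rng)
        q, p = Q[item], P[user]
        eps = 2.0 * (r - dot(q, p))
        # Both new vectors are computed from the OLD q and p, then stored together.
        new_q = [qj + eta * (eps * pj - 2.0 * lam * qj) for qj, pj in zip(q, p)]
        new_p = [pj + eta * (eps * qj - 2.0 * lam * pj) for qj, pj in zip(q, p)]
        Q[item], P[user] = new_q, new_p


def total_error(path, Q, P, lam):
    """E, evaluated after a full iteration by re-reading the file."""
    squared = sum((r - dot(Q[i], P[u])) ** 2 for u, i, r in read_ratings(path))
    reg = sum(dot(p, p) for p in P.values()) + sum(dot(q, q) for q in Q.values())
    return squared + lam * reg


def train(path, lam, eta, k=20, iterations=40, seed=0):
    """Run SGD and return the factors plus E after each iteration."""
    rng = random.Random(seed)
    Q, P, errors = {}, {}, []
    for _ in range(iterations):
        sgd_epoch(path, Q, P, k, lam, eta, rng)
        errors.append(total_error(path, Q, P, lam))
    return Q, P, errors
```

Calling `train("ratings.train.txt", lam=..., eta=...)` with the values chosen in step 3 returns a list `errors` whose entries can be plotted against the iteration number for step 4. If the values grow or oscillate, a smaller `eta` is the first thing to try.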

Posted Date: Jan 19, 2024 04:56 AM