The goal of this problem is to implement the Stochastic Gradient Descent algorithm to build a...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
The goal of this problem is to implement the Stochastic Gradient Descent algorithm to build a Latent Factor Recommendation system. We can use it to recommend movie to users. Suppose that we have a matrix R of ratings where the element Ri,u is the rating given to item i by user u. The size of R is m x n, where m is the number of movies and n is the number of users. Note that most elements of the matrix are unknown/empty, since each user can only rate/view a small proportion of all of the movies. Our goal is to find two matrices P and Q so that R~ QPT where Q is mx k and P is n x k, where k will be parameter of our algorithm. The error metric we will use is: E = (ER Σ(Riu - qip!)3) + 1 Σ Pall3 + PT ² ) + x inu Σ|la|13) Where i~ j means that we only sum over entries where the user actually rated that item¹, q, is the ith row of Q,corresponding to an item, and pu is the uth row of P, corresponding to a user, so rhese are both vectors of size k. The regularization parameter is A and || - || is the sum of the squares of the vector entries. Complete the following steps: 1. If &i,u denotes the derivative of E with respect to Ri,u then Ei,u = 2(Ri,uqi pu) and the update equations for qi and pu in stochastic gradient descent are: Q1 = qi +n(i,upu - 2Xqi) Pu = Pu + n(Ei,uli - 2λpu) 2. Implement the algorithm using the updates described in the previous part. Read each entry of R from disk and update &i,u, qi, and pu for each entry.² 3. Set k = 20, A = , and the number of iterations to 40. Find a reasonable value for the learning rate, starting with n=. The error on the training set should be below 70,000 after 40 iterations and qi and p; should have converged. 3. Set k = 20, λ = 1, and the number of iterations to 40. Find a reasonable value for the learning rate, starting with n=1 The error on the training set should be below 70,000 after 40 iterations and q and p; should have converged. ¹that is, the entries in R that are known 2This means that you should not store R in memory. Instead, you should read each element sequentially and apply the update equations to each element at each iteration. Thus, each iteration will read the whole file. • If n is too large, the error value can converge to something too large or may not monotonically decrease (it can fail to converge) • If n is too small, the error function doesn't have time to decrease within 40 steps. 4. Use the dataset ratings.train.txt included with the assignment, which is formatted as a matrix R as described above. Plot the value of E as a function of the number of iterations for your value of n. Hints: • You might try to initialize P and Q to random values in [0,√√] so that qi · p² = [0,5]. V . In the update step q; and pu depend on each other. Compute the new values for each depending on all of the old values and then update both vectors at once. . E should be computed at the end of the full iteration, not elementwise while the matrices are being updated. The goal of this problem is to implement the Stochastic Gradient Descent algorithm to build a Latent Factor Recommendation system. We can use it to recommend movie to users. Suppose that we have a matrix R of ratings where the element Ri,u is the rating given to item i by user u. The size of R is m x n, where m is the number of movies and n is the number of users. Note that most elements of the matrix are unknown/empty, since each user can only rate/view a small proportion of all of the movies. Our goal is to find two matrices P and Q so that R~ QPT where Q is mx k and P is n x k, where k will be parameter of our algorithm. The error metric we will use is: E = (ER Σ(Riu - qip!)3) + 1 Σ Pall3 + PT ² ) + x inu Σ|la|13) Where i~ j means that we only sum over entries where the user actually rated that item¹, q, is the ith row of Q,corresponding to an item, and pu is the uth row of P, corresponding to a user, so rhese are both vectors of size k. The regularization parameter is A and || - || is the sum of the squares of the vector entries. Complete the following steps: 1. If &i,u denotes the derivative of E with respect to Ri,u then Ei,u = 2(Ri,uqi pu) and the update equations for qi and pu in stochastic gradient descent are: Q1 = qi +n(i,upu - 2Xqi) Pu = Pu + n(Ei,uli - 2λpu) 2. Implement the algorithm using the updates described in the previous part. Read each entry of R from disk and update &i,u, qi, and pu for each entry.² 3. Set k = 20, A = , and the number of iterations to 40. Find a reasonable value for the learning rate, starting with n=. The error on the training set should be below 70,000 after 40 iterations and qi and p; should have converged. 3. Set k = 20, λ = 1, and the number of iterations to 40. Find a reasonable value for the learning rate, starting with n=1 The error on the training set should be below 70,000 after 40 iterations and q and p; should have converged. ¹that is, the entries in R that are known 2This means that you should not store R in memory. Instead, you should read each element sequentially and apply the update equations to each element at each iteration. Thus, each iteration will read the whole file. • If n is too large, the error value can converge to something too large or may not monotonically decrease (it can fail to converge) • If n is too small, the error function doesn't have time to decrease within 40 steps. 4. Use the dataset ratings.train.txt included with the assignment, which is formatted as a matrix R as described above. Plot the value of E as a function of the number of iterations for your value of n. Hints: • You might try to initialize P and Q to random values in [0,√√] so that qi · p² = [0,5]. V . In the update step q; and pu depend on each other. Compute the new values for each depending on all of the old values and then update both vectors at once. . E should be computed at the end of the full iteration, not elementwise while the matrices are being updated.
Expert Answer:
Answer rating: 100% (QA)
This is a machine learning problem using the Stochastic Gradient Descent SGD algorithm to factorize a ratings matrix R using latent factor models for a recommendation system The task is to find two ma... View the full answer
Related Book For
Numerical Methods With Chemical Engineering Applications
ISBN: 9781107135116
1st Edition
Authors: Kevin D. Dorfman, Prodromos Daoutidis
Posted Date:
Students also viewed these programming questions
-
7. In reference to selection of activities (sequencing and progression) and overall safety guidelines, what would you recommend? 8. What are the differences between constructive and actual notice? 9....
-
Do laws provide a complete guide to ethical behavior? Can an activity be legal but not ethical?
-
Describe how Uber's business model works and the role technology has played in its success. What are the arguments for banning these types of businesses? What are the arguments for defending them?
-
The Moncton division of Glencoe Corporation, operating at capacity, has been asked by the Antigonish division of Glencoe to supply it with Electrical Fitting No. LX29. Moncton sells this part to its...
-
The following table shows data from a dehumidification process using aqueous solution of lithium chloride ( \(\mathrm{LiCl}\) ) in a packed tower. The desiccant leaving the dehumidifier is passed...
-
Quake Corporation paid $1,680,000 for a 30 percent interest in Tremor Corporation's outstanding voting stock on January 1, 2011. The book values and fair values of Tremor's assets and liabilities on...
-
Assume you are a staff associate with GKN Associates assigned to the Mystery, Inc. engagement. John Cole, the senior associate on the job, has asked for your assistance in performing journal entry...
-
Consider the following activities and their durations. The original project schedule, using early activity starts, is shown in Figure 11.20. Reconfigure the network using critical chain project...
-
A vertical 150 mm pipe leaves a water tank of surface elevation 24. Between the tank and elevation 12, on the line, 2.4 m of the head is lost when 56 Fs flow through the line. If an open piezometer...
-
You are requested to design a local road system in a residential area located in a mountainous terrain in New Nablus Suburban area. Area is restricted but need to have two traffic lanes and a parking...
-
1. (a) Let A = [ 0 Find all 2x2 matrices B which satisfy the relation AB = BA. (b) Use row reduction and back-substitution to solve the following system of equa- tions: 1 -3 1 2x+y+z x + 2y + z 2x +...
-
What are the contract specification of a German Bund future traded on Eurex (Bloomberg code RX1 Comdty): Nominal value of 1 contract Value of 1 point Tick size Tick value Price Contract Value...
-
The figure below shows the 60F water flow rates from the branches of a main supply line. Find the total pressure drop (PAPE) for soldered copper pipe. The LAB = 70 ft, LBC = 160 ft, and LcD = 100 ft....
-
Two identical dipoles A and B are oriented on the x-axis as shown. Each dipole consists of two charges +q and-q (q> 0) whose center is a distance d from the origin, held apart by a rod of length...
-
1.) A lap wound long shunt generator has 4 field magnets, which supplies a load of power 34KW, the armature resistance of the motor is given as 0.04 ohm and field resistance in series with armature...
-
Figure displays a 12.0 V battery 3 four uncharged capacitors of capacitances C1 = 4.00F, C2 = 6.00F, and C3 = 3.00F. The switch is thrown to the left side until capacitor 1 is fully charged. Then the...
-
We would like to use centered finite differences and the method of lines to solve the unsteady diffusionreaction problem subject to no-flux on the left boundaries, c/x = 0 at x = 0, a constant...
-
Use the 1-norm to determine the condition number of A: || 1 2 1 1 12 402 1 22
-
Develop a MATLAB code to solve for an arbitrary number of nodes. dy dx = -2y(1-2y), y(-1)=-1, y(1) = 1 (6.3.15)
-
The W14 \(\times 26\) structural A-36 steel member is used as a 20 -ft-long column that is assumed to be fixed at its top and fixed at its bottom. If the 15-kip load is applied at an eccentric...
-
The W14 \(\times 26\) structural A-36 steel member is used as a column that is assumed to be fixed at its top and pinned at its bottom. If the 15-kip load is applied at an eccentric distance of 10...
-
Determine the maximum eccentric load \(P\) the 2014-T6aluminum-alloy strut can support without causing it either to buckle or yield. The ends of the strut are pin connected. a $150 mm 150 mm 100 mm...
Study smarter with the SolutionInn App