Download the following data set. The file contains a matrix of user movie reviews. Each row...
Question:
![image text in transcribed](https://s3.amazonaws.com/si.experts.images/answers/2024/05/6647fe1f0c181_1426647fe1ee63ea.jpg)
![image text in transcribed](https://s3.amazonaws.com/si.experts.images/answers/2024/05/6647fe1f6b75f_1436647fe1f51fba.jpg)
Transcribed Image Text:
Download the following data set. The file contains a matrix of user movie reviews. Each row represents an individual user and each column represents a different movie. The value in the cell in the ith row and jth column represents user i's review of the jth movie. The matrix is sparse, as not every reviewer reviewed every movie in the data. Thus, there is missing data. The first row of the matrix is different from the rest of the rows in that it represents the genre of the movie. You may wish to remove the first row from the rest of the data, but keep it stored separately.

a. Examine the data in the matrix. What is the highest review given? What is the lowest review? What is the overall average review?

b. Examine the preferences of individual 1462 (indexed by 1460 after removing the rows representing the movie names and genres; this individual's first five reviews should be 4 for Toy Story, 3 for Jumanji, 2.5 for Grumpier Old Men, missing, 3 for Father of the Bride). To what genre of movie does this individual tend to give review scores of 5? How does this individual differ from individual 45 (indexed by 43; this individual's first six reviews should be 2.5 for Toy Story, missing, missing, missing, missing, 4 for Heat)? What type of movie does this individual rate highly? For each individual, list at least two films to which the user gave the highest possible score.

c. Without performing any computations or using the data, estimate how many "types" of individuals you would expect to find in this data in terms of genre preferences. Justify your answer. (This question will be graded, but any answer given proper justification will be accepted. The correctness of your future answers will be based on your response to this question.)

d. Replace the missing values in the matrix with zeros. Using SVD, perform matrix completion on the review matrix using a value of K equal to the number of "types" of individuals you identified in part c. Report the estimate of user 45's (index 43) review of the movie Jumanji (should be the second column, indexed by 1). Additionally, compute and report the average difference between the true (non-missing) values in the review matrix and the reconstructed matrix. Calculate this as the square root of the mean squared distance between true and reconstructed values, $\sqrt{\frac{1}{N}\sum (x_{\text{true}} - x_{\text{recon}})^2}$, where the sum runs over the non-missing entries and $N$ is their count.

e. Repeat the same process, but now replace the missing values in the matrix with the column average (referred to as column mean padding). Using SVD, perform matrix completion on the review matrix using a value of K equal to the number of "types" of individuals you identified in part c. Report the estimate of user 45's (index 43) review of the movie Jumanji. Additionally, compute and report the average difference between the true (non-missing) values in the review matrix and the reconstructed matrix.

You will continue working on parts f through k of this question in the staff-graded portion of this week's homework.
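The steps in parts a, d, and e can be sketched in Python with NumPy. This is a minimal illustration, not a solution to the assignment: the data file is not available here, so the code runs on a small made-up matrix, missing reviews are assumed to be stored as NaN, and the value K = 2 is an arbitrary placeholder for whatever number of "types" is chosen in part c. The helper names (`fill_zeros`, `fill_column_means`, `svd_complete`, `observed_rmse`) are hypothetical.

```python
import numpy as np

def fill_zeros(ratings):
    """Part d padding: replace missing (NaN) reviews with 0."""
    return np.where(np.isnan(ratings), 0.0, ratings)

def fill_column_means(ratings):
    """Part e padding: replace missing reviews with that movie's mean review.

    Assumes every column has at least one observed review; an all-NaN
    column would produce NaN here.
    """
    col_means = np.nanmean(ratings, axis=0)
    return np.where(np.isnan(ratings), col_means, ratings)

def svd_complete(filled, k):
    """Rank-k reconstruction of an already-padded matrix via truncated SVD."""
    u, s, vt = np.linalg.svd(filled, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]

def observed_rmse(ratings, recon):
    """Root mean squared difference over the non-missing entries only."""
    mask = ~np.isnan(ratings)
    return float(np.sqrt(np.mean((ratings[mask] - recon[mask]) ** 2)))

# Made-up toy data (rows = users, columns = movies); NOT the assignment data.
ratings = np.array([
    [4.0, 3.0, 2.5, np.nan, 3.0],
    [2.5, np.nan, np.nan, 4.0, np.nan],
    [5.0, 3.5, np.nan, 4.5, 2.0],
    [np.nan, 3.0, 2.0, np.nan, 3.5],
])

# Part a: summary statistics that ignore missing values.
print("max:", np.nanmax(ratings), "min:", np.nanmin(ratings),
      "mean:", np.nanmean(ratings))

k = 2  # placeholder for the number of "types" from part c
recon_zero = svd_complete(fill_zeros(ratings), k)
recon_mean = svd_complete(fill_column_means(ratings), k)

# Estimated review for a missing cell (row 1, column 1 in the toy matrix).
print("zero padding estimate:", recon_zero[1, 1])
print("mean padding estimate:", recon_mean[1, 1])
print("RMSE, zero padding:", observed_rmse(ratings, recon_zero))
print("RMSE, mean padding:", observed_rmse(ratings, recon_mean))
```

On the real data, the same functions would be applied after separating the movie-name and genre rows from the numeric matrix, and the estimate for user 45 and Jumanji would be read from index `[43, 1]` of the reconstruction.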