Question

Download the following data set. The file contains a matrix of user movie reviews. Each row represents an individual user and each column represents a different movie. The value in the cell in the ith row and jth column represents user i's review of the jth movie. The matrix is sparse, as not every reviewer reviewed every movie in the data. Thus, there is missing data. The first row of the matrix is different from the rest of the rows in that it represents the genre of the movie. You may wish to remove the first row from the rest of the data, but keep it stored separately.

a. Examine the data in the matrix. What is the highest review given? What is the lowest review? What is the overall average review?

b. Examine the preferences of individual 1462 (indexed by 1460 after removing the rows representing the movie names and genres; this individual's first five reviews should be 4 - Toy Story, 3 - Jumanji, 2.5 - Grumpier Old Men, missing, 3 - Father of the Bride). To which genre of movie does this individual tend to give review scores of 5? How does this individual differ from individual 45 (indexed by 43; this individual's first six reviews should be 2.5 - Toy Story, missing, missing, missing, missing, 4 - Heat)? What type of movie does individual 45 rate highly? For each individual, list at least two films to which the user gave the highest possible score.

c. Without performing any computations or using the data, estimate how many "types" of individuals you would expect to find in this data in terms of genre preferences. Justify your answer. (This question will be graded, but any answer given proper justification will be accepted. The correctness of your future answers will be based on your response to this question.)

d. Replace the missing values in the matrix with zeros. Using SVD, perform matrix completion on the review matrix using a value of K equal to the number of "types" of individuals you identified in part c. Report the estimate of user 45's (index 43) review of the movie Jumanji (should be the second column, indexed by 1). Additionally, compute and report the average difference between the true (non-missing) values in the review matrix and the reconstructed matrix. Calculate this as the square root of the mean squared distance between true and reconstructed values, where N is the number of non-missing values:

\sqrt{\frac{1}{N} \sum (x_{\text{true}} - x_{\text{recon}})^2}

e. Repeat the same process, but now replace the missing values in the matrix with the column average (referred to as column mean padding). Using SVD, perform matrix completion on the review matrix using a value of K equal to the number of "types" of individuals you identified in part c. Report the estimate of user 45's (index 43) review of the movie Jumanji. Additionally, compute and report the average difference between the true (non-missing) values in the review matrix and the reconstructed matrix.

You will continue working on parts f through k of this question in the staff-graded portion of this week's homework.
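The workflow in parts a, d, and e can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the course's reference solution: the data file is not shown here, so the code uses a small made-up matrix `toy` in which missing reviews are stored as NaN. The helper names (`fill_zeros`, `fill_column_means`, `rank_k_reconstruction`, `rmse_on_observed`) are hypothetical, and `k = 2` is only a placeholder for whatever number of "types" you justify in part c.

```python
import numpy as np

# Toy stand-in for the review matrix (rows = users, columns = movies).
# These values are invented for illustration; NaN marks a missing review.
toy = np.array([
    [4.0, 3.0, np.nan, 5.0],
    [2.5, np.nan, np.nan, 4.0],
    [np.nan, 1.0, 4.5, np.nan],
    [5.0, 3.5, 2.0, np.nan],
])

# Part a: summary statistics that ignore the missing entries.
highest = np.nanmax(toy)
lowest = np.nanmin(toy)
average = np.nanmean(toy)

def fill_zeros(ratings):
    """Part d padding: replace every missing value with 0."""
    return np.nan_to_num(ratings, nan=0.0)

def fill_column_means(ratings):
    """Part e padding: replace missing values with that movie's mean review.

    Assumes every column has at least one observed review; an all-NaN
    column would produce NaN here and needs separate handling.
    """
    col_means = np.nanmean(ratings, axis=0)
    return np.where(np.isnan(ratings), col_means, ratings)

def rank_k_reconstruction(filled, k):
    """Keep the k largest singular values and rebuild the matrix."""
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def rmse_on_observed(ratings, recon):
    """Root mean squared difference over the non-missing entries only."""
    observed = ~np.isnan(ratings)
    diff = ratings[observed] - recon[observed]
    return float(np.sqrt(np.mean(diff ** 2)))

k = 2  # placeholder: use the number of "types" chosen in part c

recon_zero = rank_k_reconstruction(fill_zeros(toy), k)
recon_mean = rank_k_reconstruction(fill_column_means(toy), k)

# Estimated review for the user in row 1 and the movie in column 1,
# which is missing in the toy matrix.
estimate_zero = recon_zero[1, 1]
estimate_mean = recon_mean[1, 1]

print("highest, lowest, average:", highest, lowest, average)
print("zero padding: estimate, RMSE:", estimate_zero, rmse_on_observed(toy, recon_zero))
print("mean padding: estimate, RMSE:", estimate_mean, rmse_on_observed(toy, recon_mean))
```

On the real data, the same functions would be applied to the review matrix after the genre row has been removed and stored separately, with the estimate read from row 43, column 1. Zero padding tends to pull estimates for unreviewed movies toward 0, which is one reason the exercise contrasts it with column mean padding.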
