Question: This data set contains 200,000 records of users, book ISBN numbers, and ratings.
(1) Create a bar plot of the counts of book ratings.
(2) Remove rarely rated books (rated by fewer than 50 users) and rarely rating users (those who rated fewer than 20 books) from the data.
(3) Create a training data set and a test data set such that the test data set contains a random 20% of the data. Set the random seed to 40.
(4) Fit user-based KNN and item-based KNN on the training data and compare their performance on the test data.
(5) Check whether tuning the parameters improves the performance.
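
The question does not name a language or library, so the following is only a minimal sketch of one way the five parts could be approached, written in Python with pandas and NumPy rather than a dedicated recommender package. The column names `user`, `isbn`, and `rating`, all function names, and the choice of mean-centered cosine similarity are assumptions made for illustration; they do not come from the question.

```python
import numpy as np
import pandas as pd

# Assumed column names (not given in the question): "user", "isbn", "rating".

def rating_counts(df):
    """Part 1: count of each rating value; .plot(kind="bar") draws it (needs matplotlib)."""
    return df["rating"].value_counts().sort_index()

def filter_sparse(df, min_item_users=50, min_user_items=20):
    """Part 2: drop rarely rated books, then rarely rating users (a single pass).

    Removing users can push some books back under the threshold; repeat the
    call until the row count stops changing if both limits must hold at once.
    """
    item_counts = df.groupby("isbn")["user"].transform("nunique")
    df = df[item_counts >= min_item_users]
    user_counts = df.groupby("user")["isbn"].transform("nunique")
    return df[user_counts >= min_user_items]

def split_ratings(df, test_frac=0.2, seed=40):
    """Part 3: random 20% test set with a fixed seed (requires a unique index)."""
    test = df.sample(frac=test_frac, random_state=seed)
    return df.drop(test.index), test

def knn_predict(train, test, k=20, user_based=True):
    """Part 4: mean-centered KNN with cosine similarity (one common variant).

    user_based=True compares users; user_based=False compares books.
    """
    rows, cols = ("user", "isbn") if user_based else ("isbn", "user")
    mat = train.pivot_table(index=rows, columns=cols, values="rating")
    global_mean = train["rating"].mean()
    row_means = mat.mean(axis=1)
    centered = mat.sub(row_means, axis=0).fillna(0.0).to_numpy()
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for constant rows
    unit = centered / norms
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)  # a row is never its own neighbor
    rated = mat.notna().to_numpy()
    row_pos = {r: i for i, r in enumerate(mat.index)}
    col_pos = {c: j for j, c in enumerate(mat.columns)}
    preds = []
    for r, c in zip(test[rows], test[cols]):
        i, j = row_pos.get(r), col_pos.get(c)
        if i is None or j is None:
            preds.append(global_mean)  # unseen user or book: fall back
            continue
        # Neighbors must have rated column j and be positively similar.
        cand = np.flatnonzero(rated[:, j] & (sim[i] > 0))
        if cand.size == 0:
            preds.append(row_means.iloc[i])
            continue
        top = cand[np.argsort(sim[i, cand])[-k:]]  # k most similar candidates
        w = sim[i, top]
        preds.append(row_means.iloc[i] + w @ centered[top, j] / w.sum())
    return np.array(preds)

def rmse(y_true, y_pred):
    """Root mean squared error, used here to compare the two models."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - y_pred) ** 2)))
```

With the ratings loaded into a DataFrame `df`, parts 2 to 4 would chain as `train, test = split_ratings(filter_sparse(df))`, followed by `rmse(test["rating"], knn_predict(train, test, user_based=True))` and the same call with `user_based=False`. For part 5, one simple approach is to repeat those calls over several values of `k` and keep the one with the lowest error, ideally judged on a validation split carved out of the training data so that the test set is not used for tuning.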
