Question: Download the kindle_data-v2.csv dataset from CATS.
Question 1 - (70 points)
a) Filter the dataset by taking the rows in which the number of 'reviews' is greater than [...] and less than [...]. Then save it.
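A minimal sketch of the filtering step with pandas. The two bounds are not given in the question text, so `LOWER` and `UPPER` are hypothetical placeholders, and the small DataFrame is made-up stand-in data rather than the real CSV.

```python
import pandas as pd

# Made-up stand-in rows; the real data would come from the CSV file.
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "reviews": [5, 120, 800, 15000],
})

# Hypothetical bounds: replace with the thresholds from the assignment.
LOWER, UPPER = 100, 1000

# Strict inequalities: keep rows with LOWER < reviews < UPPER.
# .copy() gives an independent frame that is safe to add columns to later.
filtered = df[(df["reviews"] > LOWER) & (df["reviews"] < UPPER)].copy()

print(filtered)
```

With the real file, `pd.read_csv("kindle_data-v2.csv")` would replace the stand-in frame, and `filtered.to_csv(...)` with a file name of your choice would save the result.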
In this new dataset, create a column named 'Popularity'. Popularity will contain information about the number of reviews and whether the book is a bestseller, an editors' pick, or a Goodreads choice. To calculate popularity points, use the formula below:
Popularity = reviews + (reviews * isEditorsPick) + (reviews * isGoodReadsChoice)
Hint: To do this, you should convert the True/False values into numerical ones. Remember, these True/False values mean something; they are not nominal categories like blue, green, purple, etc. So you should check out other converting functions.
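One way to follow the hint is to cast the boolean columns to integers (True becomes 1, False becomes 0) instead of one-hot encoding them. The sketch below is an assumption in two respects: it reads the formula as reviews plus reviews times each flag, with no extra weights, and it uses made-up rows. isBestSeller is converted as well but does not appear in the formula as written, so it is left out of the sum.

```python
import pandas as pd

# Made-up stand-in rows with the column names used in the question.
df = pd.DataFrame({
    "reviews": [100, 200, 300],
    "isBestSeller": [True, False, False],
    "isEditorsPick": [True, False, True],
    "isGoodReadsChoice": [False, False, True],
})

# Cast each True/False column to 1/0 so it can be used in arithmetic.
for col in ["isBestSeller", "isEditorsPick", "isGoodReadsChoice"]:
    df[col] = df[col].astype(int)

# Assumed reading of the formula: a sum of products, no coefficients.
df["Popularity"] = (
    df["reviews"]
    + df["reviews"] * df["isEditorsPick"]
    + df["reviews"] * df["isGoodReadsChoice"]
)

print(df[["reviews", "Popularity"]])
```

If the original formula carries weights or a bestseller term, only the last expression needs to change.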
b) Show the count, mean, std, min, and max of the Popularity column.
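A short sketch of part b, using made-up Popularity values. `Series.agg` with a list of names returns just the requested statistics; `describe()` would also work but adds the quartiles.

```python
import pandas as pd

# Made-up Popularity values for illustration only.
popularity = pd.Series([200, 200, 900, 400], name="Popularity")

# Request exactly the five statistics named in the question.
# Note: pandas computes std as the sample standard deviation (ddof=1).
summary = popularity.agg(["count", "mean", "std", "min", "max"])

print(summary)
```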
c) Now that you have found the mean of 'Popularity', create a loop to count the books that have a popularity greater than the mean.
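Part c asks for an explicit loop, so the sketch below iterates over the column and increments a counter. The data is made up; the vectorized comparison at the end is only a cross-check, not a replacement for the loop the question requires.

```python
import pandas as pd

# Made-up Popularity values for illustration only.
df = pd.DataFrame({"Popularity": [200, 200, 900, 400]})

mean_popularity = df["Popularity"].mean()

# Explicit loop: count the values strictly above the mean.
count_above = 0
for value in df["Popularity"]:
    if value > mean_popularity:
        count_above += 1

# Cross-check with a vectorized comparison.
vectorized_count = int((df["Popularity"] > mean_popularity).sum())

print(count_above, vectorized_count)
```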
Question 2
Create a correlation matrix to find the relationship between stars, reviews, price, Popularity, isBestSeller, isEditorsPick, isGoodReadsChoice columns.
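A hedged sketch of the correlation matrix, again on made-up rows. It assumes the flag columns have been cast to 0/1 and that Popularity was computed as reviews plus reviews times each of the two flags, which is an assumed reading of the formula. `DataFrame.corr()` uses Pearson correlation by default.

```python
import pandas as pd

# Made-up stand-in rows; every column varies so no correlation is undefined.
df = pd.DataFrame({
    "stars": [4.5, 3.9, 4.8, 4.1],
    "reviews": [100, 200, 300, 400],
    "price": [9.99, 4.99, 12.99, 0.99],
    "isBestSeller": [True, False, True, False],
    "isEditorsPick": [True, False, False, False],
    "isGoodReadsChoice": [False, False, True, True],
})

# Convert the True/False flags to 1/0 before doing arithmetic on them.
for col in ["isBestSeller", "isEditorsPick", "isGoodReadsChoice"]:
    df[col] = df[col].astype(int)

# Assumed formula (sum of products, no weights).
df["Popularity"] = (
    df["reviews"]
    + df["reviews"] * df["isEditorsPick"]
    + df["reviews"] * df["isGoodReadsChoice"]
)

cols = ["stars", "reviews", "price", "Popularity",
        "isBestSeller", "isEditorsPick", "isGoodReadsChoice"]

# Pairwise Pearson correlations between the selected columns.
corr = df[cols].corr()

print(corr.round(2))
```

Values close to 1 or -1 indicate a strong linear relationship; because Popularity is built from reviews, a high correlation between those two columns is expected.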
