Question: This must be done in Python. I have included what the bestsellers.csv file looks like.



CSV file: bestsellers.csv

Part 1 - Import pandas
Make sure that you are set up so that you can import pandas and use it in your project. Remember that you can give it an alias such as "pd" by using "as pd" in your import statement.

Part 2 - Data Summary
Get a summary of the overall data and display the following characteristics:
- How many entries are there?
- What columns does the data contain?
- What kind of data is stored in each column?
- Are there any missing entries in the data?

Part 3 - Display some data
Use the .head() method to display the first few rows of data in the dataset. Can you display more or fewer rows by passing an argument here?
Try displaying the first few rows of data, but only for a specific column. For example, display the first 5 authors in the dataset, without the other information.

Part 4 - Column Statistics
Next, you will try to get some important data or patterns out of the dataset. Try using sort_values(), max() or other methods to manipulate the data and show some important information. For example:
- What is the maximum price of a book on the list?
- What is the lowest user rating for a book on this list?
- How many years of data does this dataset cover?
- What is the average price of a book on this list?
- How many unique authors are on this list?
Try looking at the pandas DataFrame documentation to get some good ideas!
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html

Part 5 - Advanced methods (Extra)
Let's do some more analysis.
- Define a function yearlyBest() that will take in the dataset and a year, and will display the Top 5 Best Sellers by the highest number of reviews.
- Next, try to find the top 5 authors by the number of books they have on this list.
- Next, try to find the top 5 authors by the total number of reviews for all of their books on the list.
- Lastly, try to find the top 5 authors by the highest average review score for their books on the list.
Note: One neat tool you may want to use for some of these is .value_counts() on a column or the whole dataframe.
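
The import, data summary, display, and column statistics parts could be approached roughly as in the sketch below. The contents of bestsellers.csv are not shown in the question, so the column names used here (Name, Author, User Rating, Reviews, Price, Year, Genre) are assumptions, and the sample rows are made up purely so that the code runs on its own. With the real file, the read_csv call would point at "bestsellers.csv" and the column names would need to match its actual header.

```python
import io

import pandas as pd

# Made-up sample rows for illustration only. The column names are an
# assumption about the real file's header and may need adjusting.
SAMPLE_CSV = """Name,Author,User Rating,Reviews,Price,Year,Genre
Book A,Author X,4.7,1200,8,2018,Fiction
Book B,Author Y,4.3,300,15,2018,Non Fiction
Book C,Author X,4.8,2500,11,2019,Fiction
Book D,Author Z,3.9,50,22,2019,Non Fiction
Book E,Author Y,4.5,900,6,2020,Fiction
Book F,Author X,4.6,700,9,2020,Fiction
"""

# With the real file this would be: df = pd.read_csv("bestsellers.csv")
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Data summary: entry count, column names, column types, missing values.
n_entries = len(df)
columns = list(df.columns)
missing_per_column = df.isna().sum()
print("Entries:", n_entries)
print("Columns:", columns)
print(df.dtypes)
print(missing_per_column)
df.info()  # prints a combined overview of the same information

# Display some data: head() takes the number of rows as an argument.
print(df.head())             # first 5 rows by default
print(df.head(3))            # fewer rows
print(df["Author"].head(5))  # first 5 values of a single column

# Column statistics.
max_price = df["Price"].max()
min_rating = df["User Rating"].min()
years_covered = df["Year"].nunique()
mean_price = df["Price"].mean()
unique_authors = df["Author"].nunique()
print("Maximum price:", max_price)
print("Lowest user rating:", min_rating)
print("Years covered:", years_covered)
print("Average price:", round(mean_price, 2))
print("Unique authors:", unique_authors)
```

Counting distinct values with nunique() answers "how many years" and "how many unique authors" directly. If the year range matters more than the count, df["Year"].min() and df["Year"].max() would give the first and last year instead.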
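
For the extra "Advanced methods" tasks, one possible approach is sketched below. It again assumes the column names Name, Author, User Rating, Reviews and Year, which are not confirmed by the question, and it uses made-up sample rows so that it runs without the real file. The optional parameter n on yearlyBest() is an addition for illustration; the assignment only asks for the top 5.

```python
import io

import pandas as pd

# Made-up sample rows for illustration only. The column names are an
# assumption about the real file's header and may need adjusting.
SAMPLE_CSV = """Name,Author,User Rating,Reviews,Price,Year,Genre
Book A,Author X,4.7,1200,8,2018,Fiction
Book B,Author Y,4.3,300,15,2018,Non Fiction
Book C,Author X,4.8,2500,11,2019,Fiction
Book D,Author Z,3.9,50,22,2019,Non Fiction
Book E,Author Y,4.5,900,6,2020,Fiction
Book F,Author X,4.6,700,9,2020,Fiction
"""

# With the real file this would be: df = pd.read_csv("bestsellers.csv")
df = pd.read_csv(io.StringIO(SAMPLE_CSV))


def yearlyBest(data, year, n=5):
    """Display and return the top-n rows for `year`, ranked by review count."""
    in_year = data[data["Year"] == year]
    best = in_year.sort_values("Reviews", ascending=False).head(n)
    print(best)
    return best


best_2018 = yearlyBest(df, 2018)

# Top 5 authors by number of entries on the list.
top_by_books = df["Author"].value_counts().head(5)

# Top 5 authors by total number of reviews across their entries.
top_by_reviews = (
    df.groupby("Author")["Reviews"].sum().sort_values(ascending=False).head(5)
)

# Top 5 authors by mean user rating across their entries.
top_by_rating = (
    df.groupby("Author")["User Rating"].mean().sort_values(ascending=False).head(5)
)

print(top_by_books)
print(top_by_reviews)
print(top_by_rating)
```

Note that value_counts() counts rows, so a title that appears in several years would be counted once per year. If each title should count only once, dropping duplicates first, for example with df.drop_duplicates(subset=["Name", "Author"]), is one option.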
