Question:
The data is available at:
https://github.com/Mcompetitions/M5-methods
https://drive.google.com/drive/folders/1wxz-TAfVE7uKGCjh405eCb2Q_pG3kAm9
File 1: "calendar.csv"
Contains information about the dates the products are sold.
date: The date in a "y-m-d" format.
wm_yr_wk: The id of the week the date belongs to.
weekday: The type of the day (Saturday, Sunday, ..., Friday).
wday: The id of the weekday, starting from Saturday.
month: The month of the date.
year: The year of the date.
event_name_1: If the date includes an event, the name of this event.
event_type_1: If the date includes an event, the type of this event.
event_name_2: If the date includes a second event, the name of this event.
event_type_2: If the date includes a second event, the type of this event.
snap_CA, snap_TX, and snap_WI: Binary variables (0 or 1) indicating whether the stores of CA, TX, or WI allow SNAP purchases on the examined date. 1 indicates that SNAP purchases are allowed.
About SNAP:
"The United States federal government provides a nutrition assistance benefit called the Supplement Nutrition Assistance Program (SNAP). SNAP provides low income families and individuals with an Electronic Benefits Transfer debit card to purchase food products. In many states, the monetary benefits are dispersed to people across 10 days of the month and on each of these days 1/10 of the people will receive the benefit on their card."
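As a rough illustration of the calendar.csv layout described above, the following sketch parses a few toy rows with Python's standard csv module and picks out the SNAP days per state and the dates that carry an event. The rows in TOY_CALENDAR and the helper names load_calendar, snap_dates, and event_dates are assumptions made for this example; only the column names come from the description.

```python
import csv
import io

# Toy rows that follow the calendar.csv column layout described above.
# The values are illustrative only and are not a copy of the real file.
TOY_CALENDAR = """date,wm_yr_wk,weekday,wday,month,year,event_name_1,event_type_1,event_name_2,event_type_2,snap_CA,snap_TX,snap_WI
2011-01-29,11101,Saturday,1,1,2011,,,,,0,0,0
2011-01-30,11101,Sunday,2,1,2011,,,,,0,0,0
2011-02-01,11101,Tuesday,4,2,2011,,,,,1,1,0
2011-02-06,11102,Sunday,2,2,2011,SuperBowl,Sporting,,,1,1,1
"""


def load_calendar(text):
    """Parse calendar rows and convert the SNAP flags to integers."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        for state in ("CA", "TX", "WI"):
            row[f"snap_{state}"] = int(row[f"snap_{state}"])
    return rows


def snap_dates(rows, state):
    """Return the dates on which SNAP purchases are allowed in a state."""
    return [row["date"] for row in rows if row[f"snap_{state}"] == 1]


def event_dates(rows):
    """Map each date that has a first event to the name of that event."""
    return {row["date"]: row["event_name_1"] for row in rows if row["event_name_1"]}


calendar_rows = load_calendar(TOY_CALENDAR)
print(snap_dates(calendar_rows, "CA"))
print(event_dates(calendar_rows))
```

For the real file, the same functions could be fed the text of calendar.csv, assuming its header matches the columns listed above.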
File 2: "sell_prices.csv"
Contains information about the price of the products sold per store and date.
- store_id: The id of the store where the product is sold.
- item_id: The id of the product.
- wm_yr_wk: The id of the week.
- sell_price: The price of the product for the given week/store. The price is provided per week (average across seven days). If not available, this means that the product was not sold during the examined week. Note that although prices are constant on a weekly basis, they may change over time (both training and test set).
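Because prices are stored per week while the calendar is daily, one way to attach a price to each day is to join the two tables on wm_yr_wk. The sketch below assumes pandas is available and uses small made-up tables; the store and item ids, week ids, and prices are hypothetical placeholders. A left join keeps every calendar day, so a week without a price row shows up as a missing value, which matches the "not sold during the examined week" case.

```python
import pandas as pd

# Toy daily calendar: three days spread over three week ids (values are made up).
calendar = pd.DataFrame({
    "date": ["2011-01-29", "2011-01-30", "2011-02-05", "2011-02-12"],
    "wm_yr_wk": [11101, 11101, 11102, 11103],
})

# Toy weekly prices for one hypothetical item in one hypothetical store.
# Week 11103 has no row, meaning the item was not sold that week.
prices = pd.DataFrame({
    "store_id": ["CA_1", "CA_1"],
    "item_id": ["FOODS_1_001", "FOODS_1_001"],
    "wm_yr_wk": [11101, 11102],
    "sell_price": [2.00, 2.24],
})

# Left join: every calendar day is kept and inherits its week's price.
daily_prices = calendar.merge(prices, on="wm_yr_wk", how="left")
print(daily_prices[["date", "wm_yr_wk", "sell_price"]])
```

With several items and stores, the join would usually be done from the sales side (on store_id, item_id, and wm_yr_wk) rather than from the calendar alone.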
File 3: "sales_train.csv"
Contains the historical daily unit sales data per product and store.
- item_id: The id of the product.
- dept_id: The id of the department the product belongs to.
- cat_id: The id of the category the product belongs to.
- store_id: The id of the store where the product is sold.
- state_id: The State where the store is located.
- d_1, d_2, ..., d_i, ..., d_1941: The number of units sold on day i, starting from 2011-01-29.
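Since day i is counted from 2011-01-29, a column name such as d_i can be turned into a calendar date by adding i - 1 days to that start date. A minimal sketch using only the standard library follows; the function name day_to_date is an assumption made for this example.

```python
from datetime import date, timedelta

# d_1 corresponds to this date, as stated in the column description.
START_DATE = date(2011, 1, 29)


def day_to_date(column_name):
    """Convert a column name like 'd_15' to its calendar date."""
    day_number = int(column_name.split("_")[1])
    return START_DATE + timedelta(days=day_number - 1)


print(day_to_date("d_1"))     # 2011-01-29
print(day_to_date("d_1941"))  # 2016-05-22
```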
Submission
Task A - Exploratory Data Analysis - please include the code that answers the following questions in your notebook submission in Task B (50 points):
- Which state has the highest sales? (5 points)
- Which department has the highest sales? (5 points)
- Which department has the highest number of products? (5 points)
- Which department has the highest mean price? (5 points)
- Which is the best-performing store? (5 points)
- Which month had the highest sales? (5 points)
- Which weekday do people prefer to do grocery shopping in different states? (3 answers = 15 points)
- Which holiday or event recorded the highest sales? (5 points)
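Most of the questions above follow one pattern: reshape the wide d_1 ... d_1941 columns into one row per product and day, then group and aggregate. The sketch below shows that pattern on a tiny made-up table, assuming pandas is available. The ids and unit counts are hypothetical, and "sales" is interpreted here as units sold; multiplying by sell_price to get revenue would be a different, equally defensible reading.

```python
import pandas as pd

# Toy table in the sales_train.csv layout (values are made up).
sales = pd.DataFrame({
    "item_id": ["A_1", "A_2", "B_1"],
    "dept_id": ["A", "A", "B"],
    "cat_id": ["X", "X", "Y"],
    "store_id": ["CA_1", "TX_1", "TX_1"],
    "state_id": ["CA", "TX", "TX"],
    "d_1": [3, 0, 1],
    "d_2": [2, 4, 5],
})

# Wide to long: one row per product and day.
id_cols = ["item_id", "dept_id", "cat_id", "store_id", "state_id"]
long_sales = sales.melt(id_vars=id_cols, var_name="d", value_name="units")

# Recover the date from the day index (d_1 is 2011-01-29), then the weekday.
day_number = long_sales["d"].str.replace("d_", "", regex=False).astype(int)
long_sales["date"] = pd.Timestamp("2011-01-29") + pd.to_timedelta(day_number - 1, unit="D")
long_sales["weekday"] = long_sales["date"].dt.day_name()

# State with the highest total units.
units_by_state = long_sales.groupby("state_id")["units"].sum()
top_state = units_by_state.idxmax()

# Weekday with the highest total units within each state.
weekday_units = long_sales.groupby(["state_id", "weekday"], as_index=False)["units"].sum()
best_rows = weekday_units.loc[weekday_units.groupby("state_id")["units"].idxmax()]
top_weekday_by_state = dict(zip(best_rows["state_id"], best_rows["weekday"]))

print(top_state)
print(top_weekday_by_state)
```

Swapping state_id for dept_id or store_id in the groupby gives the same kind of ranking for departments and stores. On the full data set the long table is large, so aggregating per group before merging with the calendar or prices may be preferable.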