Question: Individual Take-home Learning Test Univariate Analysis Description Before we look for relationships among variables in a data set, we should investigate the data themselves. How
Individual Take-home Learning Test Univariate Analysis
Description
Before we look for relationships among variables in a data set, we should investigate the data themselves. How many variables do we have? What types are they? Are the data complete, and consistent? What do the distributions of each variable look like? Using the data in Univariates BDAT1005-22W.xlsx, you will investigate a (very small) data set.
Please read through the entire assignment before you start. Lets have a good look at a data set!
Wii Games (40 marks)
The Raw Data tab in your spreadsheet includes sales, release and ratings data for twelve different Wii games, but one of them is missing data. To complete the data, you need to fill in some values in the top entry. BE CAREFUL ABOUT THIS!
- For Name, replace the question marks with your own first name e.g. if your name is Harpreet, your games name should be Wii Harpreet.
- For Genre, enter either Sports or Misc its your choice!
The nine digits of your student number can be represented as X1X2X3X4X5X6X7X8X9
- Add the last digit of your student number, X9 , to 2010 to get the Year_of_Release e.g. if the last digit of your student number is 0 then the year of release of your game is 2010.
- NA Sales for your game is the last four digits of your student number in reverse order, with a decimal point after the first two, or X9X8.X7X6, divided by two e.g. for student number 2003456789, NA Sales would be 98.76 / 2 = 49.38
- EU Sales for your game is the last four digits of your student number in forward order, with a decimal point after the first two, or X6X7.X8X9 divided by two e.g. for student number 2003456789, NA Sales would be 67.89 / 2 = 33.945
Your job is to organize the dataset and produce univariate analysis of it. Perform the following exercise steps in Excel, providing answers as requested in the UA Questions worksheet.
- Organize the dataset. (3 marks)
- Create a copy of the Raw Data worksheet; call it Arranged Data. (1 mark) DO NOT MAKE ANY CHANGES TO THE RAW DATA WORKSHEET this is your original copy of the data; keep it safe and unmolested!
- In the Arranged Data worksheet, leaving the title rows (1:4) unchanged, arrange the independent variables on the left in the following order: Name, Platform, Developer, Publisher, Rating, Year_of_Release, Genre. Then arrange the dependent variables in the following order, starting to the right of the independent variables: NA_Sales, EU_Sales, JP_Sales, Other_Sales, Global_Sales, Critic_Score, Critic_Count, User_Score, User_Count (1 mark)
- Imagine you are working with a much larger dataset, with dozens of variables (columns) and thousands of records (rows). In your Arranged Data worksheet, freeze panes in an appropriate place. (1 mark)
- Sales Figures (4 marks)
- In your Arranged Data worksheet, correctly calculate the Global sales for your game. (1 mark)
- The sales figures are all numbers with two decimal places. Are these values exact, or rounded? How can you tell? (1 mark)
- What could the sales figures represent? That is, what are the units of measurement? (Note that you may want to use this answer later, for chart and axis titles.) Is there only one possibility? (2 marks)
- From your Arranged Data, create a properly sized and fully formatted, titled and labeled pie chart of the user counts for each game. (5 marks)
- Pareto chart (6 marks)
- From your new worksheet, create a properly sized and fully formatted, titled and labeled Pareto chart of the Global sales of all twelve games. You may use either of the two methods discussed in class. (4 marks)
- In terms of Global sales, what place does your game fall in? First, second, twelfth? (1 mark)
- How many games will it take, out of twelve, to make up 50% of Global sales? (1 mark)
- Pivot tables & charts (7 marks)
- Using the data in your Arranged Data sheet, create a pivot table on a new worksheet; in that pivot table, show the sum of Global sales for each of the genres (3 marks)
- On the same worksheet as the pivot table, create a properly sized and fully formatted, titled and labeled Pivot chart of the clustered column type (4 marks)
- Descriptive Statistics (8 marks)
- For all five of the sales variables at once, use the Data Analysis tools to produce Descriptive Statistics from your Arranged Data. Make sure you show labels in the first row and include summary statistics. Accept the default confidence level of 95% (3 marks)
- For each of the five variables, do the descriptive statistics show reasonably normal univariate distribution? For each, exactly how can you tell? (5 marks)
- Box Plots & the 1.5*IQR Rule (7 marks)
- Produce a properly sized and fully formatted, titled and labeled box plot for Global sales. Ensure your box plot has the mean marker, inner and outlier points showing (3 marks)
- Calculate the lower and upper fences for suggesting outliers for this variable. (1 mark)
- Are both high and low outliers possible for this variable? How do you know? (1 mark)
- Does the 1.5*IQR rule suggest any outliers in your data? If so, what are their values, and how do you know? (1 mark)
- What are the two ways you could find the answer about outliers, above. (1 mark)
Evaluation
The assignment is marked out of 40, and worth 5% of the course, with the marks distributed as shown above.
Submissions
Your submission should include only your completed Excel spreadsheet. Submit your results in the Assignments & Tests folder in our Blackboard shell for this course.
You must submit your results as an Excel file in the Univariates Take-home assignment by the given due date and time. If your work is not submitted on time, there is a penalty of 10% of the entire assignment per week late. No submissions will be accepted for marking after end of day Sunday following the end of term (Eastern Time). Your instructors have to finish marking end-of-term work and submit final grades by the following Wednesday.
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
