Question:
General Description
The purpose of this project is to tell the story of a set of bivariate data using descriptive statistics and exploratory data analysis. You will first examine and describe each variable separately, and then describe the relationship between two variables. This will involve making different representations of each data set, calculating measures of center and variability, as well as looking at the data together using scatterplots and correlation. Your goal is to tell the story of what your data represent and how the two variables are related.
Step 1: Identify a problem that will lead to the collection and analysis of a set of bivariate data.
Both variables MUST be numeric! For example, you cannot pair people's eye color with their weight, because color is not a numerical variable.
Here are some sample questions. Be creative as you come up with your topic:
- Do you think there is a relationship between the price of a breakfast cereal and its sugar content?
- Do you think there is a relationship between the number of hours of sleep students get on average and the average number of caffeine drinks they consume each day?
- Do you think there is a relationship between the number of pages in a textbook and its price?
- Do you think there is a relationship between birth order and academic success?
- Do you think there is a relationship between team payroll and winning percentages in professional sports?
Practical Advice: It is often MUCH easier to collect accurate experimental data than accurate survey data. Nonresponse tends to be less of an issue with projects based on experiments than with those based on surveys. If you absolutely must do a survey, make every effort to get a random sample, and try to keep track of the characteristics of nonrespondents. You will have nonresponse; your project will not be penalized for it if you document it and hypothesize how it might affect your results.
Step 2: Collect and analyze your data.
Assuming your topic has been approved, collect your data if you have not done so in the past 2 weeks. You must have a minimum of 30 pairs of data.
Data Analysis: You will enter your data into a Google Sheets (or Excel) spreadsheet to generate graphs and statistics. You need to include printouts or copies of all analyses as described below.
- Make three different representations of each variable (e.g. histogram, box plot, dot plot, Pareto chart, frequency polygon, cumulative frequency, time series, pie graph, etc.).
- Calculate the mean, median, mode, range, standard deviation, and interquartile range for each variable, and discuss any possible outliers and skewness that occur.
- Make a scatterplot of the variables and calculate the correlation coefficient.
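If you would like to double-check the numbers your spreadsheet produces, the calculations listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not a required part of the project. The function names `describe` and `pearson_r`, the 1.5 × IQR rule for flagging possible outliers, and the tiny sample lists are all assumptions made for the example. Quartile conventions differ between tools, so the interquartile range may not match your spreadsheet exactly.

```python
import math
import statistics


def describe(values):
    """Summarize one numeric variable (hypothetical helper for this sketch)."""
    # Quartiles using the "inclusive" convention; other tools may differ.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed rule of thumb: flag values more than 1.5 * IQR beyond the quartiles.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # list, since ties are possible
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "iqr": iqr,
        "possible_outliers": [v for v in values if v < low_fence or v > high_fence],
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of values")
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up illustration data only; a real project needs at least 30 pairs.
    hours_of_sleep = [5, 6, 6, 7, 8, 9]
    caffeine_drinks = [4, 3, 3, 2, 1, 0]
    print(describe(hours_of_sleep))
    print(describe(caffeine_drinks))
    print(pearson_r(hours_of_sleep, caffeine_drinks))
```

Comparing the mean with the median gives a quick hint about skewness: a mean well above the median suggests a right-skewed variable, and a mean well below it suggests a left-skewed one. The graphs remain the better evidence for shape.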
Step 3: Write your report.
Use the following format to interpret your data analyses and write the report. Use the four headings and attach copies of your data and analyses. All aspects should be professional and submitted electronically.
- The problem
Describe the problem you set out to investigate.
- Method
Tell what your data represent, how many cases you collected, and where/how you obtained your sample of data.
- Analysis/Results
- Describe your first variable. Talk about the shape of the data, the center, the spread, and any interesting features (e.g. outliers) in the context of the variable. Be sure to refer to the different graphs you generated and embed them in your report.
- Repeat the above for the second variable.
- Describe the relationship between the variables as shown in your scatterplot and correlation analysis. Be sure to comment on any interesting features such as outliers or influential values.
- Conclusion/Critique
- What did you learn about the problem you investigated?
- What might have affected your results?
- Are there any biases associated with your data collection?
- What would you do differently if you did this project again?
- Raw Data
Include your raw data at the end of your report.