Question: MATH 250 - Elements of Statistics Class Data, Spring 2025 --- FIRST EXAMINATION of Student Data

ID#  Gender  Foot Length  Height  Age  Armspan  Number in Family  Hair Color
1    Female  19.0         154.5   8    133.0    5                 Brown
2    Male    29.0         189.0   51   198.0    4                 Brown
3    Male    30.5         191.0   23   185.0    4                 Brown
4    Male    49.0         192.0   64   194.0    3                 Grey
5    Female  24.0         162.5   28   161.5    10                Brown
6    Female  25.0         172.0   43   165.0    3                 Blonde
7    Female  24.0         169.5   33   163.5    10                Blonde
8    Female  21.5         157.5   23   155.0    3                 Blonde
9    Female  19.0         167.5   19   162.5    3                 Brown
10   Male    23.5         175.5   24   175.0    8                 Blonde
11   Male    27.0         172.5   19   170.0    3                 Brown
12   Female  20.0         157.0   20   161.0    3                 Brown
13   Male    24.0         172.5   42   172.0    3                 Brown
14   Male    28.0         182.0   28   183.0    6                 Brown
15   Female  22.0         160.0   21   140.0    7                 Brown
16   Male    30.0         188.0   33   185.5    8                 Blonde
17   Female  25.5         172.5   21   172.5    1                 Blonde
18   Female  24.0         170.0   19   172.0    4                 Black
19   Female  26.0         168.0   29   170.0    2                 Black
20   Male    23.0         170.0   26   176.0    3                 Brown
21   Female  26.0         168.0   29   170.0    2                 Black
22   Male    26.5         190.0   20   196.0    11                Black
23   Female  23.5         152.5   34   144.5    3                 Brown
24   Female  26.0         155.0   35   156.0    6                 Brown
25   Female  24.5         157.5   19   155.0    4                 Brown
26   Female  28.0         173.0   19   186.0    3                 Brown
27   Male    27.5         180.0   33   177.0    3                 Brown
28   Female  26.0         175.5   20   160.0    4                 Brown
29   Female  24.0         170.0   31   168.0    4                 Brown
30   Male    28.0         183.0   30   180.5    4                 Brown
31   Male    27.0         183.0   30   182.5    14                Blonde
32   Female  24.5         174.0   19   165.0    3                 Brown
33   Female  24.5         163.5   21   168.5    5                 Blonde
34   Female  22.0         170.0   29   170.0    4                 Blonde
35   Female  23.0         160.0   19   165.0    2                 Black
36   Female  25.0         155.0   28   153.0    5                 Red
37   Female  23.0         152.0   46   152.0    7                 Blonde
38   Female  24.5         166.0   44   165.0    11                Brown
39   Male    25.5         183.0   26   183.0    10                Brown
40   Male    25.5         183.5   21   179.5    4                 Black
41   Female  23.0         160.0   26   167.0    8                 Brown
42   Male    25.5         183.0   31   182.0    7                 Brown
43   Female  20.0         155.0   20   154.0    5                 Black
44   Male    27.5         179.0   29   150.0    3                 Brown
45   Female  26.5         144.5   24   146.0    1                 Brown
46   Female  23.0         160.0   37   159.0    3                 Brown
47   Male    27.5         187.0   38   185.0    2                 Brown
48   Female  24.0         168.0   33   164.0    9                 Brown
49   Male    27.0         188.0   36   185.0    2                 Brown
50   Female  27.0         169.0   31   161.5    4                 Brown
51   Female  23.0         160.0   19   157.0    4                 Brown
52   Female  24.0         159.0   32   161.5    4                 Brown
53   Female  25.0         165.0   18   153.5    4                 Brown
54   Female  23.5         165.5   30   166.5    10                Black
55   Male    27.0         190.0   21   189.0    5                 Brown
56   Female  23.0         167.0   35   154.0    2                 Brown
57   Male    20.5         168.5   44   171.0    6                 Brown
58   Female  24.0         167.5   52   162.5    5                 Brown
59   Female  26.0         175.5   20   164.5    5                 Blonde
60   Female  22.5         162.5   19   162.5    3                 Black
61   Male    25.5         183.0   25   184.0    5                 Brown
General Instructions: Please place your name above, then complete the following questions. NOTE: Read the entire document below to get a feel for the activity before continuing. Make sure to save this Excel file using the filename "yournameActivity5.xlsx". Once complete, submit your answers to this activity by attaching your Excel file to the Spreadsheet Activity 5 - Probability Distributions assignment in Blackboard. Use the area to the near right in this Excel worksheet when calculating all values/statistics/parameters. Methods/work used to calculate values must be shown in this spreadsheet tab in order to receive full credit. (Work for part 1 should be shown in the tab labelled "Original Data Set for Analysis".)
Overview:
This activity has three major purposes. First, it is designed to show the importance of examining the data prior to performing statistical calculations. Second, the activity should help you recognize the difference between a discrete random variable and a continuous random variable. Finally, the activity is designed to help you see how the descriptive statistical analyses differ for the two types of data. The data to be used in answering the questions below come from the data collected in the first spreadsheet activity. The sample data set collected from the students of this course originally had a size of n = 137. However, to make the set a bit more manageable for beginning statistics, a collection of 61 individuals' data was randomly selected. You may recognize your own data within this set if you were one of the randomly selected individuals. These data are supplied in the attached worksheet titled "Original Data Set for Analysis" (see the tab at the bottom of this document window).
1. The first step in analyzing data is to make sure that all data values were submitted correctly and seem to be reasonable/proper measurements; this is called cleaning the data. In the instructors' initial analysis of the student data, there were several mismeasurements or incorrectly given measurements; much of this was cleaned already. More formally, one would also look for possible outliers using a process (like the 1.5 × IQR rule) and decide whether or not to include these data in further analysis. In general, a valid and well-established argument should always be given for removal of any data from a data set; removal of any collected data should NOT be done arbitrarily or to skew the data toward some desired viewpoint. Analyze the data given in the ATTACHED worksheet (again, see this worksheet below as "Original Data Set for Analysis"). Using the 1.5 × IQR rule discussed in the first unit, establish that there is exactly one outlier within the foot length variable. For our course, examine only the foot length variable for outliers! Once you demonstrate that an outlier exists, give the individual's ID# and data as your answer below to this question #1. FINALLY, copy the data set, excluding the outlier, to the designated region at the right (several columns over to the right on this worksheet). Do not delete the outlier's data on the "Original Data Set for Analysis" tab. We want to keep a record of all the data, but this individual's (the outlier's) data will NOT be used in any other calculations performed in answering questions #2 and #3 below in this activity. NOTE: You should be left with 60 rows of data in the region to the right when finished with this problem, even though the ID# will start with 1 and end at 61.
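As a cross-check on the spreadsheet work for question #1, the 1.5 × IQR rule can be sketched in Python. This is only an illustration, not the course's required method: the names FOOT_LENGTH, iqr_fences, and find_outliers are made up for this sketch, and the quartiles use the "inclusive" interpolation convention, which should correspond to Excel's QUARTILE.INC. Other quartile conventions (such as the median-of-halves method in some texts) give slightly different fences, so the Unit 1 guide takes precedence.

```python
from statistics import quantiles

# Foot lengths (cm) for ID# 1..61, in ID order, transcribed from the class data table.
FOOT_LENGTH = [
    19.0, 29.0, 30.5, 49.0, 24.0, 25.0, 24.0, 21.5, 19.0, 23.5,
    27.0, 20.0, 24.0, 28.0, 22.0, 30.0, 25.5, 24.0, 26.0, 23.0,
    26.0, 26.5, 23.5, 26.0, 24.5, 28.0, 27.5, 26.0, 24.0, 28.0,
    27.0, 24.5, 24.5, 22.0, 23.0, 25.0, 23.0, 24.5, 25.5, 25.5,
    23.0, 25.5, 20.0, 27.5, 26.5, 23.0, 27.5, 24.0, 27.0, 27.0,
    23.0, 24.0, 25.0, 23.5, 27.0, 23.0, 20.5, 24.0, 26.0, 22.5,
    25.5,
]


def iqr_fences(values):
    """Return (lower fence, upper fence) for the 1.5 x IQR rule."""
    # Assumption: "inclusive" quartiles; a different convention shifts the fences a little.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values):
    """Return (ID#, value) pairs that fall outside the fences; IDs start at 1."""
    lower, upper = iqr_fences(values)
    return [(i, v) for i, v in enumerate(values, start=1) if v < lower or v > upper]


lower, upper = iqr_fences(FOOT_LENGTH)
outliers = find_outliers(FOOT_LENGTH)
print(f"fences: [{lower}, {upper}]")
print("outliers (ID#, foot length):", outliers)
```

Any value printed in the outlier list lies below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR; the spreadsheet should still show the quartiles, the IQR, and both fences as work.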
2. For this problem, we will first assume the data is population data (only these 60). Now focus only on the Number in Family variable in the data set you copied to the right, from which you removed the entire row identified in answering #1 (again, your data table should contain 60 individuals' data). Define the random variable X to be "Number in Family" and complete the following for these sixty data values:
a.) What makes X a discrete random variable and not a continuous one?
b.) In the area to the right, create a probability distribution table showing the possible values of X, the frequency of each value, and the associated relative frequency values P(X) as determined by the collected data.
c.) Determine the expected value (mean) of the random variable X using your probability distribution table created in part b. directly above. (Hint: the requirement is to use only the information in the table you produced in part b., not to use the raw data---review how to produce the mean from a probability distribution table via the text or the Excel Guides for Unit 2.)
MEAN :__________
d.) Determine the standard deviation of the random variable X, again using only the values within your probability distribution table. (You can check your answers by finding the population s.d. of the data on family size, BUT this problem needs to be answered through use of only the probability distribution table built in part b--again see the text and Excel Guides.)
St. Deviation :___________
e.) Frequently, any data outside of the Two-Sigma Rule interval is considered "unusual." Decide if any of the included values of the random variable X are unusual. Give a concluding statement below in regard to your decision. Show work for the Two-Sigma Rule to the right of the problem.
f.) Determine the probability that the random variable X is within one standard deviation of the mean. Show work for calculating the lower and upper bounds of the event in question and the probability to the right of the problem. Express your answer in a complete sentence.
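The calculations in parts b through f can be checked outside Excel with a short Python sketch. It assumes the row removed in question #1 is ID# 4 (the 49.0 cm foot length); the names FAMILY, OUTLIER_ID, and distribution_table are hypothetical and used only for this illustration. The mean and standard deviation are computed from the probability table alone, using mean = Σ x·P(x) and variance = Σ (x - mean)²·P(x).

```python
from collections import Counter
from math import sqrt

# "Number in Family" for ID# 1..61, in ID order, transcribed from the class data table.
FAMILY = [
    5, 4, 4, 3, 10, 3, 10, 3, 3, 8,
    3, 3, 3, 6, 7, 8, 1, 4, 2, 3,
    2, 11, 3, 6, 4, 3, 3, 4, 4, 4,
    14, 3, 5, 4, 2, 5, 7, 11, 10, 4,
    8, 7, 5, 3, 1, 3, 2, 9, 2, 4,
    4, 4, 4, 10, 5, 2, 6, 5, 5, 3,
    5,
]

OUTLIER_ID = 4  # assumption: the row removed in question #1
cleaned = [x for i, x in enumerate(FAMILY, start=1) if i != OUTLIER_ID]


def distribution_table(values):
    """Map each value x to (frequency, relative frequency P(x))."""
    n = len(values)
    counts = Counter(values)
    return {x: (f, f / n) for x, f in sorted(counts.items())}


table = distribution_table(cleaned)

# Parts c and d: use only the table, not the raw data.
mean = sum(x * p for x, (_f, p) in table.items())
variance = sum((x - mean) ** 2 * p for x, (_f, p) in table.items())
sd = sqrt(variance)

# Part e: values outside mean +/- 2 sd are flagged as "unusual".
low2, high2 = mean - 2 * sd, mean + 2 * sd
unusual = [x for x in table if x < low2 or x > high2]

# Part f: probability that X lies within one sd of the mean.
low1, high1 = mean - sd, mean + sd
p_within_one_sd = sum(p for x, (_f, p) in table.items() if low1 <= x <= high1)

for x, (f, p) in table.items():
    print(f"{x:>3} {f:>3} {p:.4f}")
print(f"mean={mean:.4f} sd={sd:.4f}")
print("unusual values:", unusual)
print(f"P(within one sd)={p_within_one_sd:.4f}")
```

Because the table is built from the full population of 60 values, the table-based standard deviation should agree with Excel's population standard deviation (STDEV.P) on the raw column, which is the check suggested in part d.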
3. Now consider your cleaned data in reference to the Footlength variable/factor of the student data. Notice that this variable is categorized as quantitative, continuous, and ratio level in type. (This portion of the activity is based on Workshop Statistics, Rossman, p. 66.)
a.) In the area to the right, copy the Footlength data values (again take the cleaned data set of 60 values) and then sort them in order from least to greatest. From this column of footlength values, create an appropriate frequency table with exactly six classes--remember, we did such frequency charts back in Unit 1. For the next part, b, you may choose to produce the histogram graph at the same time. Finally extend your frequency table to include a relative frequency (probability) column.
b.) Produce a histogram for your frequency table (if you did not do so as you constructed your table in part a). Describe the distribution. Does the distribution of the footlength data appear to be a roughly normal distribution? Explain your answer briefly.
c.) Compute the mean and standard deviation of the footlength data values. Do not use the frequency table, as was done in the discrete case above; instead, compute them from the raw values as in the first unit, this time treating the data as SAMPLE DATA drawn from all stats students.
Sample mean, x̄: _________
Sample std. deviation, s: __________
d.) Determine the proportion of the students in this sample whose footlength is at least 25.5 cm.
e.) Suppose that the footlengths in the population of all university students taking elementary statistics do in fact follow a perfect normal distribution (though our sample group did not) with population mean = 24.5 cm and population standard deviation = 2.2 cm. Under this assumption, determine the proportion of all students who have a footlength measure greater than 25.5 cm.
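Parts c through e can be cross-checked with the following Python sketch. It again assumes that ID# 4 is the row removed in question #1, and the variable names are illustrative only. The sample standard deviation uses the n - 1 denominator (as Excel's STDEV.S does), and the normal-model proportion comes from the standard library's NormalDist with the mean and standard deviation stated in part e.

```python
from statistics import NormalDist, mean, stdev

# Foot lengths (cm) for ID# 1..61, in ID order, transcribed from the class data table.
FOOT_LENGTH = [
    19.0, 29.0, 30.5, 49.0, 24.0, 25.0, 24.0, 21.5, 19.0, 23.5,
    27.0, 20.0, 24.0, 28.0, 22.0, 30.0, 25.5, 24.0, 26.0, 23.0,
    26.0, 26.5, 23.5, 26.0, 24.5, 28.0, 27.5, 26.0, 24.0, 28.0,
    27.0, 24.5, 24.5, 22.0, 23.0, 25.0, 23.0, 24.5, 25.5, 25.5,
    23.0, 25.5, 20.0, 27.5, 26.5, 23.0, 27.5, 24.0, 27.0, 27.0,
    23.0, 24.0, 25.0, 23.5, 27.0, 23.0, 20.5, 24.0, 26.0, 22.5,
    25.5,
]

OUTLIER_ID = 4  # assumption: the row removed in question #1
cleaned = [v for i, v in enumerate(FOOT_LENGTH, start=1) if i != OUTLIER_ID]

# Part c: sample statistics (stdev divides by n - 1).
x_bar = mean(cleaned)
s = stdev(cleaned)

# Part d: proportion of the sample with foot length of at least 25.5 cm.
prop_sample = sum(1 for v in cleaned if v >= 25.5) / len(cleaned)

# Part e: proportion above 25.5 cm under the stated normal model.
population = NormalDist(mu=24.5, sigma=2.2)
prop_normal = 1 - population.cdf(25.5)

print(f"x_bar={x_bar:.3f} s={s:.3f}")
print(f"sample proportion >= 25.5 cm: {prop_sample:.4f}")
print(f"normal-model proportion > 25.5 cm: {prop_normal:.4f}")
```

For a continuous normal model, "greater than 25.5" and "at least 25.5" give the same probability, whereas in the sample the five values equal to 25.5 cm do affect the count in part d.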
