Question:
Unit 3 Problem Set
NAME:
Elements of Statistics--FHSU Virtual College--Spring 2017

REMEMBER, these are assessed preparatory problems related to the content of Unit 3. The Unit 3 Exam will consist of similar types of problems, but not exactly the same. Thus, make sure you are thinking about the concepts and procedures you studied in this unit versus simply "copying" the process of an example problem. Also, take time to examine the complete objective list in the Unit 3 Review document. Listed out to the left of the spreadsheet are text chapter separators if you find yourself needing some direction to a related resource.

All answers should be calculated, as needed, within this Excel sheet, and final concluding answers given directly below or to the right of the problem. Please make sure your answers are easily found--for example, use a different color or type of font. No numerical answer resulting from a calculation will be accepted unless the process is performed in Excel and formulas/calculations used are evident when the cell is selected.

Also, note that the templates for hypothesis testing provided in the Excel Guides for this unit are also given in the next worksheet in this document--see folder tabs at the bottom of the sheet. You may use these templates by copying from the second worksheet, pasting the copy to the right of the associated problem, then changing values as needed.

Problems related to text's Chapter 7:

1. Assume you need to build a confidence interval for a population mean within some given situation. Naturally, you must determine whether you should use either the t-distribution or the z-distribution or possibly even neither based upon the information known/collected in the situation. Thus, based upon the information provided for each situation below, determine which (t-, z- or neither) distribution is appropriate. Then, if you can use either a t- or z-distribution, give the associated critical value (critical t- or z-score) from that distribution to reach the given confidence level.

a. 99% confidence, n=75, σ known, population data believed to be very skewed
   Appropriate distribution:
   Associated critical value:
d. 95% confidence, n=8, σ known, population data believed to be very skewed
   Appropriate distribution:
   Associated critical value:
c. 90% confidence, n=73, σ unknown, population data believed to be skewed
   Appropriate distribution:
   Associated critical value:
e. 99% confidence, n=10, σ unknown, population data believed to be normally distributed
   Appropriate distribution:
   Associated critical value:

2. The data below shows the birth weights (in kilograms) of thirty randomly chosen male babies born at Hays Medical in the past year. It is also known that the population standard deviation of birth weights for all male babies born is 0.0731 kg (based on data from the New York State Department of Health).

3.73 3.96 2.55 4.37 2.21 3.44 3.73 2.67 3.07 4.33
4.09 4.23 3.3  3.02 2.92 3.39 2.76 3.55 3.68 3.67
3.92 4.68 3.76 3.41 3.72 3.45 4.14 3.02 2.54 3.15

a. How do you know that you will need to construct the confidence interval using a z-distribution approach as opposed to a t-distribution?
We want to construct the mean value confidence interval for all Hays male babies' birth weights with a 99% confidence level.
b. Determine the best point estimate (average) for the mean birth weight.
c. Determine the critical z-value(s) associated with the 99% confidence level.
d. Determine the margin of error.
e. Determine the confidence interval.
f. In a sentence, interpret the contextual meaning of your result to part e above...that is, relate the values to this situation regarding the mean birth weights of all Hays male babies born.

3. Determine the two chi-squared (χ²) critical values for the following confidence levels and sample sizes.
a. 98% and n=25
b. 90% and n=60

4. We are also interested in estimating the population standard deviation for all male babies' birth weights (in kilograms). We will assume that birth weights are at least approximately normally distributed. Below are the birth weights of 30 randomly chosen male babies from Hays Medical.

3.73 3.96 2.55 4.37 2.21 3.44 3.73 2.67 3.07 4.33
4.09 4.23 3.3  3.02 2.92 3.39 2.76 3.55 3.68 3.67
3.92 4.68 3.76 3.41 3.72 3.45 4.14 3.02 2.54 3.15

Out to the right, construct a 99% confidence interval estimate of sigma (σ), the population standard deviation.

Problems related to text's Chapter 8:

5. (Multiple Choice) A hypothesis test is used to test a claim. On a right-tailed hypothesis test with a 1.39 critical value, the collected sample's test statistic is calculated to be 1.41. Which of the following is the correct decision statement for the test?
A. Fail to reject the null hypothesis
B. Reject the null hypothesis
C. Claim the alternative hypothesis is true
D. Claim the null hypothesis is false

6. (Multiple Choice) A hypothesis test is used to test a claim. A P-value of 0.001 is calculated on the hypothesis test with a significance level set at 0.01. Which of the following is the correct decision statement for the test?
A. Claim the null hypothesis is true
B. Claim the alternative hypothesis is false
C. Reject the null hypothesis
D. Fail to reject the null hypothesis

7. (Multiple Answers) Which of the following is not a requirement for using the t-distribution for a hypothesis test concerning μ?
A. Sample is a simple random sample
B. The population is normally distributed
C. The population standard deviation is known
D. The sample size is greater than 30

8. A report by the NCAA states that 57.5% of football injuries occur during practices. A head trainer claims that this is too high for his conference, so he randomly selects 36 injuries and finds that 18 occurred during practices.
a. Is the above information sufficient for you to completely agree with the head trainer that the percentage of football injuries that occur during practices is less than 57.5%? Why or why not?
b. In establishing a statistical hypothesis test of this situation, give the required null and alternative hypotheses for such a test, if it is desired to show that the percentage of football injuries that occur during practices is less than 57.5%.
   H0:
   H1:
c. Based on your answer in part b, should you use a right-tailed, a left-tailed, or a two-tailed test? Briefly explain how one determines which of the three possibilities is to be used.
d. Describe the possible Type I error for this situation--make sure to state the error in terms of the percent of football injuries that occur during practices.
e. Describe the possible Type II error for this situation--make sure to state the error in terms of the percent of football injuries that occur during practices.
f. Determine the appropriate critical value(s) for this situation given a 0.05 significance level.
g. Determine/calculate the value of the sample test statistic.
h. Determine the P-value.
i. Based upon your work above, is there statistically sufficient evidence in this sample to support that less than 57.5% of injuries occur at practice for this conference? Briefly explain your reasoning.

9. The mean score on a certain achievement test at the turn of the century was 73. However, national standards have been implemented which may lead to a change in the mean score. A random sample of 48 scores on this exam taken this year yielded the following data set. At a 10% significance level, test the claim that the mean of all current test scores is the same as in 2000.

85 77 74 88 89 66  0 70 73 76 86 74
73 82 72  0 82 82 80 76 87 76 77 67
72 49 73 75 82 73 81 30 58 75 72 89
76 18 72 74 60 88 20 99 50 35 78 66

a. Give the null and alternative hypotheses for this test in symbolic form.
   H0:
   H1:
b. Determine the value of the test statistic.
c. Determine the appropriate critical value(s).
d. Determine the P-value.
e. Is there sufficient evidence to warrant rejection of the claim that the mean achievement score is now 73, the same as in 2000? Explain your reasoning.

Problem related to text's Chapter 9:

10. Listed below are pretest and posttest scores from a study. Using a 5% significance level, is there statistically sufficient evidence to support the claim that the posttest scores were higher than the pretest scores? Perform an appropriate hypothesis test showing necessary statistical evidence to support your final given conclusion.

PreTest   24 11 14 25 17 28 22
PostTest  25 18 16 29 16 29 25

Problems related to text's Chapter 10:

11. Multiple Choice: For each of the following data sets, choose the most appropriate response from the choices below the table.

Data Set #1        Data Set #2
x     y            x    y
0.3   3790         -1   -4
0.4   3354         -2   -10
0.5   2986          2    5
0.6   2613          3    4
0.7   2277         -3   -19
0.8   1765          6   -10
0.9   1343          7   -20
1     1151         -1   -4
1.1   510           0    1

The same four choices apply to each data set:
A. A strong positive linear relation exists
B. A strong negative linear relation exists
C. A curvilinear relation exists
D. No linear relation exists

12. Give a real life example of two variables that are likely to be negatively correlated. Specifically explain why you believe they are negatively correlated.

13. To answer the following, use the given data set for lengths (in inches) and corresponding weights (in pounds) of randomly selected black bears captured in the backcountry of Colorado.

lengths (inches)   weights (pounds)
40                 65
64                 256
65                 216
49                 94
47                 86
59                 189
61                 202
49                 102

a. Construct a scatterplot for this data set in the region to the right (length as the independent variable, and weight as the dependent).
b. Based on the scatterplot, does it look like a linear regression model is appropriate for this data? Why or why not?
c. Add the line-of-best fit (trend line/linear regression line) to your scatterplot. Give the equation of the trend line below. Then give the slope value of the line and explain its meaning in this context.
d. Determine the value of the correlation coefficient. Explain what the value tells you about the data pairs.
e. Does the value of the correlation coefficient tell you there is or is not statistically significant evidence that correlation exists between the length and weight of black bears? Explain your position. (HINT: application of table A-6 is needed!)
f. Based on the above, what is the best predicted weight of a bear with a length of 45 inches?

Templates for Hypothesis Testing

As stated on the practice exam document, below you will find templates you may use in completing this exam. You will want to highlight the one you need, copy it to where you are working, and then input the proper values for the problem you are working on for quantities related to the labels in red.

Single Sample Proportion Test:

Two-tailed Proportion
  significance level (alpha) = 0.01
  p = 0.82, x = 56, n = 73
  q = 0.18, p-hat = 0.7671232877
  critical values are: -2.575829304 and 2.575829
  test statistic = -1.175933319
  P-value = 0.2396215232

One-tailed Proportion
  significance level (alpha) = 0.01
  p = 0.82, x = 56, n = 73
  q = 0.18, p-hat = 0.7671232877
  critical value is: -2.326347874 (Left Tailed) or 2.326348 (Rt. Tailed)
  test statistic = -1.175933319
  P-value = 0.1198107616

Single Sample Mean Test (or difference in matched paired two samples):

Two-tailed Mean (sigma known)
  significance level (alpha) = 0.05
  x-bar = 24.85, mu = 24, sigma = 2, n = 25
  critical values are: -1.959963985 and 1.959964
  test statistic = 2.125
  P-value = 0.0335866129

One-tailed Mean (sigma known)
  significance level (alpha) = 0.05
  x-bar = 24.85, mu = 24, sigma = 2, n = 25
  critical value is: -1.644853627 (Left Tailed) or 1.644854 (Rt. Tailed)
  test statistic = 2.125
  P-value = 0.0167933064

Two-tailed Mean (sigma unknown)
  significance level (alpha) = 0.05
  x-bar = 110, mu = 118, s = 12, n = 20
  critical values are: -2.093024054 and 2.093024
  test statistic = -2.98142397
  P-value = 0.0076706134

One-tailed Mean (sigma unknown)
  significance level (alpha) = 0.05
  x-bar = 110, mu = 118, s = 12, n = 20
  critical value is: -1.729132812 (Left Tailed) or 1.729133 (Rt. Tailed)
  test statistic = -2.98142397
  P-value = 0.0038353067

Single Sample Variance/Standard Deviation Test:

Two-tailed Standard Deviation
  significance level (alpha) = 0.05
  n = 25
  s = 0.029, σ = 0.023
  s² = 0.000841, σ² = 0.000529
  critical values are: 12.401150217 and 39.36408
  test statistic = 38.155009452
  P-value = 0.0668516983

One-tailed Standard Deviation
  significance level (alpha) = 0.05
  n = 25
  s = 0.029, σ = 0.023
  s² = 0.000841, σ² = 0.000529
  critical values are: 13.848425027 (Left Tailed) or 36.41503 (Rt. Tailed)
  test statistic = 38.155009452
  P-value = 0.0334258492

Inferences About 2 Proportions:

Two-tailed Proportion (w/ two Ind. Samples)
  Given info:   From Sample #1   From Sample #2
  x =           56               27
  n =           843              703
  alpha = 0.05
  p-bar = 0.053686934, q-bar = 0.946313066
  critical values are: -1.959963985 and 1.959964
  test statistic = 2.4341272112
  P-value = 0.0149277477

One-tailed Proportion (w/ two Ind. Samples)
  Given info:   From Sample #1   From Sample #2
  x =           50               16
  n =           290              123
  alpha = 0.05
  p-bar = 0.1598062954, q-bar = 0.8401937046
  critical value is: -1.644853627 or 1.644854
  test statistic = 1.0736524347
  P-value = 0.1414892436

Inferences about 2 Means: Independent Samples:

Two-tailed Mean (w/ two Ind. Samples)
  Given info:   From Sample #1   From Sample #2
  x-bar =       4.2              1.71
  n =           22               22
  s =           2.2              0.72
  alpha = 0.05
  critical values are: 2.0796138447 and -2.079614
  test statistic = 5.0453711835
  P-value = 5.38523E-005

One-tailed Mean (w/ two Ind. Samples)
  Given info:   From Sample #1   From Sample #2
  x-bar =       0.94             1.65
  n =           21               8
  s =           0.31             0.16
  alpha = 0.05
  critical value is: -1.894578605 (Left Tail) or 1.894579 (Right Tail)
  test statistic = -8.051464895
  P-value = 4.37452E-005

Unit 3 Problem Set
NAME:
Elements of Statistics--FHSU Virtual College--Spring 2017

Problems related to text's Chapter 7:

1. Assume you need to build a confidence interval for a population mean within some given situation. Naturally, you must determine whether you should use either the t-distribution or the z-distribution or possibly even neither based upon the information known/collected in the situation.
Thus, based upon the information provided for each situation below, determine which (t-, z- or neither) distribution is appropriate. Then, if you can use either a t- or z-distribution, give the associated critical value (critical t- or z-score) from that distribution to reach the given confidence level.

a. 99% confidence, n=75, σ known, population data believed to be very skewed
   Appropriate distribution: z
   Associated critical value: 2.5758293035
d. 95% confidence, n=8, σ known, population data believed to be very skewed
   Appropriate distribution: Neither
   Associated critical value:
c. 90% confidence, n=73, σ unknown, population data believed to be skewed
   Appropriate distribution: t
   Associated critical value: 1.6662936961
e. 99% confidence, n=10, σ unknown, population data believed to be normally distributed
   Appropriate distribution: t
   Associated critical value: 3.2498355416

2. The data below shows the birth weights (in kilograms) of thirty randomly chosen male babies born at Hays Medical in the past year. It is also known that the population standard deviation of birth weights for all male babies born is 0.0731 kg (based on data from the New York State Department of Health).

3.73 3.96 2.55 4.37 2.21 3.44 3.73 2.67 3.07 4.33
4.09 4.23 3.3  3.02 2.92 3.39 2.76 3.55 3.68 3.67
3.92 4.68 3.76 3.41 3.72 3.45 4.14 3.02 2.54 3.15

a. How do you know that you will need to construct the confidence interval using a z-distribution approach as opposed to a t-distribution?
Both distributions apply when the population is symmetric (approximately normal) or when the sample is large enough for the Central Limit Theorem to hold. In general, the z-distribution is used when the population standard deviation is known, and the t-distribution is used when it is unknown. Here the population standard deviation is given (0.0731 kg), so the z-distribution is used.

We want to construct the mean value confidence interval for all Hays male babies' birth weights with a 99% confidence level.

b. Determine the best point estimate (average) for the mean birth weight.
The best point estimate of the population mean is the sample mean, which is 3.4820 kg in this case.

c. Determine the critical z-value(s) associated with the 99% confidence level.
Critical z-values = ±2.5758

d. Determine the margin of error.
Margin of error = z × σ / √n = 2.5758 × 0.0731 / √30 = 0.0344

e. Determine the confidence interval.
Confidence interval = (3.4820 - 0.0344, 3.4820 + 0.0344) = (3.4476, 3.5164)

f. In a sentence, interpret the contextual meaning of your result to part e above...that is, relate the values to this situation regarding the mean birth weights of all Hays male babies born.
We are 99% confident that the true mean birth weight of all male babies born at Hays Medical is between 3.4476 kg and 3.5164 kg.

3. Determine the two chi-squared (χ²) critical values for the following confidence levels and sample sizes.
a. 98% and n=25
   Lower value = 10.856
   Upper value = 42.980
b. 90% and n=60
   Lower value = 42.339
   Upper value = 77.931

4. We are also interested in estimating the population standard deviation for all male babies' birth weights (in kilograms). We will assume that birth weights are at least approximately normally distributed. Below are the birth weights of 30 randomly chosen male babies from Hays Medical.

3.73 3.96 2.55 4.37 2.21 3.44 3.73 2.67 3.07 4.33
4.09 4.23 3.3  3.02 2.92 3.39 2.76 3.55 3.68 3.67
3.92 4.68 3.76 3.41 3.72 3.45 4.14 3.02 2.54 3.15

Out to the right, construct a 99% confidence interval estimate of sigma (σ), the population standard deviation.
The 99% confidence interval estimate of sigma (σ), the population standard deviation, is 0.4519 to 0.9025.

Problems related to text's Chapter 8:

5. (Multiple Choice) A hypothesis test is used to test a claim. On a right-tailed hypothesis test with a 1.39 critical value, the collected sample's test statistic is calculated to be 1.41. Which of the following is the correct decision statement for the test?
A. Fail to reject the null hypothesis
B. Reject the null hypothesis
C. Claim the alternative hypothesis is true
D. Claim the null hypothesis is false

6. (Multiple Choice) A hypothesis test is used to test a claim. A P-value of 0.001 is calculated on the hypothesis test with a significance level set at 0.01. Which of the following is the correct decision statement for the test?
A. Claim the null hypothesis is true
B. Claim the alternative hypothesis is false
C. Reject the null hypothesis
D. Fail to reject the null hypothesis

7. (Multiple Answers) Which of the following is not a requirement for using the t-distribution for a hypothesis test concerning μ?
A. Sample is a simple random sample
B. The population is normally distributed
C. The population standard deviation is known
D. The sample size is greater than 30

8. A report by the NCAA states that 57.5% of football injuries occur during practices. A head trainer claims that this is too high for his conference, so he randomly selects 36 injuries and finds that 18 occurred during practices.

a. Is the above information sufficient for you to completely agree with the head trainer that the percentage of football injuries that occur during practices is less than 57.5%? Why or why not?
The sample proportion is 18/36 = 0.50. Although this is smaller than the reported value of 0.575, the difference is not large and may be due to chance variation. Hence we cannot say with certainty that the percentage of football injuries that occur during practices is less than 57.5%.

b. In establishing a statistical hypothesis test of this situation, give the required null and alternative hypotheses for such a test, if it is desired to show that the percentage of football injuries that occur during practices is less than 57.5%.
   H0: p = 0.575
   H1: p < 0.575

c. Based on your answer in part b, should you use a right-tailed, a left-tailed, or a two-tailed test? Briefly explain how one determines which of the three possibilities is to be used.
The alternative hypothesis contains the less-than sign, so a left-tailed test should be used. The tail of the test is determined by the direction of the alternative hypothesis.

d. Describe the possible Type I error for this situation--make sure to state the error in terms of the percent of football injuries that occur during practices.
Here, a Type I error would be concluding that the percentage of football injuries that occur during practices is less than 57.5% when the true percentage is not less than 57.5%.

e. Describe the possible Type II error for this situation--make sure to state the error in terms of the percent of football injuries that occur during practices.
Here, a Type II error would be concluding that the percentage of football injuries that occur during practices is not less than 57.5% when the true percentage is less than 57.5%.

f. Determine the appropriate critical value(s) for this situation given a 0.05 significance level.
Because this is a left-tailed test for one proportion, the critical value at the 0.05 significance level is -z(0.05) = -1.645.

g. Determine/calculate the value of the sample test statistic.
z = (0.50 - 0.575) / √(0.575 × 0.425 / 36) = -0.9102991274

h. Determine the P-value.
0.1813323893

i. Based upon your work above, is there statistically sufficient evidence in this sample to support that less than 57.5% of injuries occur at practice for this conference? Briefly explain your reasoning.
The test statistic is not less than the critical value (equivalently, the P-value is larger than the significance level), so we do not reject the null hypothesis. There is not statistically sufficient evidence in this sample to support that less than 57.5% of injuries occur at practice for this conference.

9. The mean score on a certain achievement test at the turn of the century was 73. However, national standards have been implemented which may lead to a change in the mean score. A random sample of 48 scores on this exam taken this year yielded the following data set. At a 10% significance level, test the claim that the mean of all current test scores is the same as in 2000.

85 77 74 88 89 66  0 70 73 76 86 74
73 82 72  0 82 82 80 76 87 76 77 67
72 49 73 75 82 73 81 30 58 75 72 89
76 18 72 74 60 88 20 99 50 35 78 66

a. Give the null and alternative hypotheses for this test in symbolic form.
   H0: μ = 73
   H1: μ ≠ 73

b. Determine the value of the test statistic.
-1.4796578285

c. Determine the appropriate critical value(s).
The test is two-tailed with unknown population standard deviation, so a t-test should be used. The critical values are ±t(0.05, 48-1) = ±1.678.

d. Determine the P-value.
0.1456366972

e. Is there sufficient evidence to warrant rejection of the claim that the mean achievement score is now 73, the same as in 2000? Explain your reasoning.
Because the P-value (0.1456) is not smaller than the significance level (0.10), we do not reject the null hypothesis. There is not sufficient evidence to warrant rejection of the claim that the mean achievement score is now 73, the same as in 2000.

Problem related to text's Chapter 9:

10. Listed below are pretest and posttest scores from a study. Using a 5% significance level, is there statistically sufficient evidence to support the claim that the posttest scores were higher than the pretest scores? Perform an appropriate hypothesis test showing necessary statistical evidence to support your final given conclusion.

PreTest   24 11 14 25 17 28 22
PostTest  25 18 16 29 16 29 25

t-Test: Paired Two Sample for Means
               PreTest    PostTest
Mean           20.1429    22.5714
Variance       39.1429    33.6190
Observations   7          7

The P-value for this hypothesis test is 0.0233, which is smaller than the significance level of 0.05, so the null hypothesis is rejected.
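The paired-test figures for Problem 10 can be cross-checked outside Excel. What follows is a minimal Python sketch using only the standard library, not the Excel procedure the assignment requires. The variable names are illustrative. Differences are taken as PreTest minus PostTest, an assumption chosen because it matches the negative t Stat in the Excel output. The one-tailed critical value 1.9432 is copied from that output rather than computed.

```python
import math
import statistics

# Scores from Problem 10 (the same subjects measured twice).
pre = [24, 11, 14, 25, 17, 28, 22]
post = [25, 18, 16, 29, 16, 29, 25]

# Paired differences, PreTest minus PostTest (assumed convention; it
# matches the sign of the t Stat reported by Excel).
diffs = [a - b for a, b in zip(pre, post)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
t_stat = mean_diff / (sd_diff / math.sqrt(n))

# One-tailed critical value for df = 6 at alpha = 0.05, copied from the
# Excel output rather than computed here.
t_critical = 1.9432

# Left-tailed decision, because the claim "post > pre" means the
# mean of (pre - post) is negative.
reject_null = t_stat < -t_critical

print(f"mean difference = {mean_diff:.4f}")
print(f"t statistic     = {t_stat:.4f}")
print(f"reject H0       = {reject_null}")
```

Comparing the t statistic with the critical value leads to the same decision as comparing the one-tailed P-value with 0.05.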
The data is providing enough evidence to conclude that the posttest scores were the higher than the pretest scores. Note: Test is significant and the average for Posttest is more than average for Pretest so the above conclusion can be made. Pearson Correlation Hypothesized Mean Differe df t Stat P(T<=t) one-tail t Critical one-tail P(T<=t) two-tail t Critical two-tail 0.9117 0 6 -2.4975 0.0233 1.9432 0.0467 2.4469 Problems related to text's Chapter 10: 11. Multiple Choice: For each of the following data sets, choose the most appropriate response from the choices below the table. Data Set #1 Data Set #2 x y x y 0.3 3790 -1 -4 0.4 3354 -2 -10 0.5 2986 2 5 0.6 2613 3 4 0.7 2277 -3 -19 0.8 1765 6 -10 0.9 1343 7 -20 1 1151 -1 -4 1.1 510 0 1 A. A strong positive linear relation exists A. A strong positive linear relation exists B. A strong negative linear relation exists B. A strong negative linear relation exists C. A curvilinear relation exists C. A curvilinear relation exists D. No linear relation exists D. No linear relation exists 12. Give a real life example of two variables that are likely to be negatively correlated. Specifically explain why you believe they are negatively correlated. There are plenty of examples for negative correlation in real life. For example the weight of a car and the MPG. As the weight of a car increases the MPG decreases because a heavy car needs more power to run. Thus these two variables are negatively correlated. 13. To answer the following, use the given data set for lengths (in inches) and corresponding weights (in pounds) of randomly selected black bears captured in the backcountry of Colorado lengths (inches) weights (pounds) 40 65 64 256 65 216 49 94 47 86 59 189 61 202 49 102 a. Construct a scatterplot for this data set in the region to the right (length as the independent variable, and weight as the dependent.) Given on right. b. Based on the scatterplot, does it look like a linear regression model is appropriate for this data? 
Why or why not? Scatter Plot 300 250 Yes a linear model is appropriate. The points are falling close to a positively sloped straight line indicating that the linear relationship is present between these two variables and thus a linear regression model is appropriate. f(x) = 7.6344359627x - 262.9181509754 R = 0.9396501908 200 weights (pounds) 150 c. Add the line-of-best fit (trend line/linear regression line) to your scatterplot. Give the equation of the trend line below. Then give the slope value of the line and explain its meaning to this context. 100 50 The equation of the line is, y = 7.6344x - 262.92. The slope value is 7.6344 indicating that if the length increases by 1 inch then the weight is expected to increase by 7.6344 pounds on an average 0 35 40 Correlation coefficient value is 0.9694 therefore the relationship between these two variables is strong and positive. e. Does the value of the correlation coefficient tell you there is or is not statistically significant evidence that correlation exists between the length and weight of black bears? Explain your position. (HINT: application of table A-6 is needed!) The correlation coefficient is more than the critical value so the value of the correlation coefficient tells me there is statistically significant evidence that correlation exists between the length and weight of black bears f. Based on the above, what is the best predicted weight of a bear with a length of 45 inches? Predicted value = 7.6344*45 - 262.92 = 80.628 pounds 45 50 55 lengths (inches) d. Determine the value of the correlation coefficient. Explain what the value tells you about the data pairs? 0.9694 60 65 70 Templates for Hypothesis Testing As stated on the practice exam document, below you will find templates you may use in completing this the one you need, copy it to where you are working, and then input the proper values for the problem yo related to the labels in red. 
Single Sample Proportion Test: Two-tailed Proportion significance level (alpha) p= x= n= 0.01 0.82 56 73 q= 0.18 p-hat = 0.7671232877 critical values are: -2.575829304 and test statistic = -1.175933319 P-value = 0.2396215232 2.575829 Single Sample Mean Test (or difference in matched paired two samp Two-tailed Mean (sigma known) significance level (alpha) x-bar = mu, = sigma, = n= 0.05 24.85 24 2 25 critical values are: -1.959963985 and test statistic = 2.125 P-value = 0.0335866129 1.959964 Two-tailed Mean (sigma unknown) significance level (alpha) 0.05 x-bar = 110 mu, = 118 s= 12 n= 20 critical values are: -2.093024054 and test statistic = -2.98142397 P-value = 0.0076706134 2.093024 Single Sample Variance/Standard Deviation Test Two-tailed Standard Deviation significance level (alpha) n= 0.05 25 s= = 0.029 0.023 s2 = 0.000841 = 0.000529 2 critical values are: 12.401150217 and test statistic = 38.155009452 P-value = 0.0668516983 39.36408 Inferences About 2 Proportions Two-tailed Proportion (w/ two Ind. Samples) Given info: From Sample #1 From Sample #2 x= 56 27 n= 843 703 alpha, = 0.05 p-bar = 0.053686934 q-bar = 0.946313066 critical values are: -1.959963985 and 1.959964 test statistic = 2.4341272112 P-value = 0.0149277477 Inferences about 2 Means: Independent Samples Two-tailed Mean (w/ two Ind. Samples) Given info: From Sample #1 From Sample #2 x-bar = 4.2 1.71 n= 22 22 s= 2.2 0.72 alpha, = 0.05 critical values are: 2.0796138447 and -2.079614 test statistic = 5.0453711835 P-value = 5.38523E-005 find templates you may use in completing this exam. You will want to highlight hen input the proper values for the problem you are working on for quantities One-tailed Proportion significance level (alpha) p= x= n= 0.01 0.82 56 73 q= 0.18 p-hat = 0.7671232877 Left Tailed Rt. 
Tailed critical value is: -2.326347874 or 2.326348 test statistic = -1.175933319 P-value = 0.1198107616 in matched paired two samples) One-tailed Mean (sigma known) significance level (alpha) 0.05 x-bar = 24.85 mu, = 24 sigma, = 2 n= 25 Left Tailed Rt. Tailed critical value is: -1.644853627 or 1.644854 test statistic = 2.125 P-value = 0.0167933064 One-tailed Mean (sigma unknown) significance level (alpha) 0.05 x-bar = 110 mu, = 118 s= 12 n= 20 Left Tailed Rt. Tailed critical value is: -1.729132812 or 1.729133 test statistic = -2.98142397 P-value = 0.0038353067 tion Test One-tailed Standard Deviation significance level (alpha) n= 0.05 25 s= = 0.029 0.023 s2 = 0.000841 = 0.000529 Left Tailed Rt. Tailed critical values are: 13.848425027 or 36.41503 test statistic = 38.155009452 P-value = 0.0334258492 2 One-tailed Proportion (w/ two Ind. Samples) Given info: From Sample #1 From Sample #2 x= 50 16 n= 290 123 alpha, = 0.05 p-bar = 0.1598062954 q-bar = 0.8401937046 critical value is: -1.644853627 or 1.644854 test statistic = 1.0736524347 P-value = 0.1414892436 Samples One-tailed Mean (w/ two Ind. Samples) Given info: From Sample #1 From Sample #2 x-bar = 0.94 1.65 n= 21 8 s= 0.31 0.16 alpha, = 0.05 Right Tail Left Tail critical value is: -1.894578605 or 1.894579 test statistic = -8.051464895 P-value = 4.37452E-005 Unit 3 Problem Set NAME: Elements of Statistics--FHSU Virtual College--Spring 2017 REMEMBER, these are assessed preparatory problems related to the content of Unit 3. The Unit 3 Exam will consist of similar types of problems, but not exactly the same. Thus, make sure you are thinking about the concepts and procedures you studied in this unit versus simply \"copying\" the process of an example problem. Also, take time to examine the complete objective list in the Unit 3 Review document. Listed out to the left of the spreadsheet are text chapter separators if you find yourself needing some direction to a related resource. 
Problems related to text's Chapter 7:

1. Assume you need to build a confidence interval for a population mean within some given situation. Naturally, you must determine whether you should use either the t-distribution or the z-distribution or possibly even neither based upon the information known/collected in the situation. Thus, based upon the information provided for each situation below, determine which (t-, z- or neither) distribution is appropriate. Then if you can use either a t- or z- distribution, give the associated critical value (critical t- or z- score) from that distribution to reach the given confidence level.

a. 99% confidence, n=75, sigma known, population data believed to be very skewed
Appropriate distribution: Z
Associated critical value: 2.5758293035

d. 95% confidence, n=8, sigma known, population data believed to be very skewed
Appropriate distribution: Neither
Associated critical value: not applicable

c. 90% confidence, n=73, sigma unknown, population data believed to be skewed
Appropriate distribution: t
Associated critical value: 1.6662936961

e. 99% confidence, n=10, sigma unknown, population data believed to be normally distributed
Appropriate distribution: t
Associated critical value: 3.2498355416

2. The data below show the birth weights (in kilograms) of thirty randomly chosen male babies born at Hays Medical in the past year. It is also known that the population standard deviation of birth weights for all male babies born is 0.0731 kg (based on data from the New York State Department of Health).

3.73 3.96 2.55 4.37 2.21 3.44 3.73 2.67 3.07 4.33
4.09 4.23 3.3 3.02 2.92 3.39 2.76 3.55 3.68 3.67
3.92 4.68 3.76 3.41 3.72 3.45 4.14 3.02 2.54 3.15

We want to construct a confidence interval for the mean birth weight of all Hays male babies with a 99% confidence level.

a. How do you know that you will need to construct the confidence interval using a z-distribution approach as opposed to a t-distribution?
Answer: Both distributions apply when the population is symmetric (approximately normal) or when the sample is large enough for the Central Limit Theorem to apply. In general, the z-distribution is used when the population standard deviation is known, and the t-distribution is used when it is unknown. Here the population standard deviation is given (0.0731 kg) and n = 30, so the z-distribution is used.

b. Determine the best point estimate (average) for the mean birth weight.
Answer: The best point estimate of the population mean is the sample mean, which is 3.4820 in this case.

c. Determine the critical z-value(s) associated with the 99% confidence level.
Answer: Critical z value = 2.5758

d. Determine the margin of error.
Answer: Margin of error = 2.5758*0.0731/square root of 30 = 0.0344

e. Determine the confidence interval.
Answer: Confidence interval = (3.4820-0.0344, 3.4820+0.0344) = (3.4476, 3.5164)

f. In a sentence, interpret the contextual meaning of your result to part e above...that is relate the values to this situation regarding the mean birth weights of all Hays male babies born.
Answer: We are 99% confident that the interval from 3.4476 kg to 3.5164 kg contains the true mean birth weight of all male babies born at Hays Medical.

3. Determine the two chi-squared (χ²) critical values for the following confidence levels and sample sizes.
a. 98% and n=25: Lower value = 10.856, Upper value = 42.980
b. 90% and n=60: Lower value = 42.339, Upper value = 77.931

4. We are also interested in estimating the population standard deviation for all male babies' birth weights (in kilograms). We will assume that birth weights are at least approximately normally distributed. Below are the birth weights of 30 randomly chosen male babies from Hays Medical.

3.73 3.96 2.55 4.37 2.21 3.44 3.73 2.67 3.07 4.33
4.09 4.23 3.3 3.02 2.92 3.39 2.76 3.55 3.68 3.67
3.92 4.68 3.76 3.41 3.72 3.45 4.14 3.02 2.54 3.15

Out to the right, construct a 99% confidence interval estimate of sigma (σ), the population standard deviation.
Answer: The 99% confidence interval estimate of sigma (σ), the population standard deviation, is 0.4519 to 0.9025.

Problems related to text's Chapter 8:

5. (Multiple Choice) A hypothesis test is used to test a claim. On a right-tailed hypothesis test with a 1.39 critical value, the collected sample's test statistic is calculated to be 1.41. Which of the following is the correct decision statement for the test?
A. Fail to reject the null hypothesis
B. Reject the null hypothesis
C. Claim the alternative hypothesis is true
D. Claim the null hypothesis is false

6. (Multiple Choice) A hypothesis test is used to test a claim. A P-value of 0.001 is calculated on the hypothesis test with a significance level set at 0.01. Which of the following is the correct decision statement for the test?
A. Claim the null hypothesis is true
B. Claim the alternative hypothesis is false
C. Reject the null hypothesis
D. Fail to reject the null hypothesis

7. (Multiple Answers) Which of the following is not a requirement for using the t-distribution for a hypothesis test concerning mu?
A. Sample is a simple random sample
B. The population is normally distributed
C. The population standard deviation is known
D. The sample size is greater than 30

8. A report by the NCAA states that 57.5% of football injuries occur during practices.
A head trainer claims that this is too high for his conference, so he randomly selects 36 injuries and finds that 18 occurred during practices.

a. Is the above information sufficient for you to completely agree with the head trainer that the percentage of football injuries that occur during practices is less than 57.5%? Why or why not?
Answer: No. The sample proportion is 18/36 = 0.50. Although this is smaller than the reported value of 0.575, the difference is not large and may be due to chance variation, so we cannot say with certainty that the percentage of football injuries that occur during practices is less than 57.5%.

b. In establishing a statistical hypothesis test of this situation, give the required null and alternative hypotheses for such a test, if it is desired to show that the percentage of football injuries that occur during practices is less than 57.5%.
Answer: H0: p = 0.575, H1: p < 0.575

One-tailed Proportion (template values for this problem)
significance level (alpha) = 0.05; p = 0.575; x = 18; n = 36; q = 0.425; p-hat = 0.5
critical value is: -1.644853627 (Left Tailed) or 1.64485363 (Rt. Tailed)
test statistic = -0.9102991274
P-value = 0.1813323893

c. Based on your answer in part b, should you use a right-tailed, a left-tailed, or a two-tailed test? Briefly explain how one determines which of the three possibilities is to be used.
Answer: The alternative hypothesis contains the less-than sign, so a left-tailed test should be used. The tail of the test is determined by the direction of the alternative hypothesis.

d. Describe the possible Type I error for this situation--make sure to state the error in terms of the percent of football injuries that occur during practices.
Answer: A Type I error would be concluding that the percentage of football injuries that occur during practices is less than 57.5% when the true percentage is not less than 57.5%.

e. Describe the possible Type II error for this situation--make sure to state the error in terms of the percent of football injuries that occur during practices.
Answer: A Type II error would be failing to conclude that the percentage of football injuries that occur during practices is less than 57.5% when the true percentage is in fact less than 57.5%.

f. Determine the appropriate critical value(s) for this situation given a 0.05 significance level.
Answer: Because this is a left-tailed test for one proportion, the critical value at the 0.05 significance level is -1.645.

g. Determine/calculate the value of the sample test statistic.
Answer: -0.9102991274

h. Determine the P-value.
Answer: 0.1813323893

i. Based upon your work above, is there statistically sufficient evidence in this sample to support that less than 57.5% of injuries occur at practice for this conference? Briefly explain your reasoning.
Answer: The test statistic (-0.9103) is not less than the critical value (-1.645); equivalently, the P-value (0.1813) is larger than the significance level (0.05). We therefore fail to reject the null hypothesis. There is not statistically sufficient evidence in this sample to support that less than 57.5% of injuries occur at practice for this conference.

9. The mean score on a certain achievement test at the turn of the century was 73. However, national standards have been implemented which may lead to a change in the mean score. A random sample of 48 scores on this exam taken this year yielded the following data set. At a 10% significance level, test the claim that the mean of all current test scores is the same as in 2000.

85 77 74 88 89 66 0 70 73 76 86 74
73 82 72 0 82 82 80 76 87 76 77 67
72 49 73 75 82 73 81 30 58 75 72 89
76 18 72 74 60 88 20 99 50 35 78 66

a. Give the null and alternative hypotheses for this test in symbolic form.
Answer: H0: mu = 73, H1: mu ≠ 73

Two-tailed Mean (sigma unknown) (template values for this problem)
significance level (alpha) = 0.1; x-bar = 68.2708333333; mu = 73; s = 22.1433814936; n = 48
critical values are: -1.6779267216 and 1.67792672
test statistic = -1.4796578285
P-value = 0.1456366972

b. Determine the value of the test statistic.
Answer: -1.4796578285

c. Determine the appropriate critical value(s).
Answer: The test is two-tailed and the population standard deviation is unknown, so a t-test should be used. With 48 - 1 = 47 degrees of freedom and 0.05 in each tail, the critical values are ±1.678.

d. Determine the P-value.
Answer: 0.1456366972

e. Is there sufficient evidence to warrant rejection of the claim that the mean achievement score is now 73, the same as in 2000? Explain your reasoning.
Answer: The P-value (0.1456) is not smaller than the significance level (0.10), so we do not reject the null hypothesis. There is not sufficient evidence to warrant rejection of the claim that the mean achievement score is now 73, the same as in 2000.

Problem related to text's Chapter 9:

10. Listed below are pretest and posttest scores from a study. Using a 5% significance level, is there statistically sufficient evidence to support the claim that the posttest scores were higher than the pretest scores? Perform an appropriate hypothesis test showing necessary statistical evidence to support your final given conclusion.

PreTest: 24 11 14 25 17 28 22
PostTest: 25 18 16 29 16 29 25
Difference: 1 7 2 4 -1 1 3

One-tailed Mean (sigma unknown) (template values for the differences, PostTest minus PreTest)
significance level (alpha) = 0.05; x-bar = 2.4285714286; mu = 0; s = 2.5727509827; n = 7
critical value is: -1.9431802805 (Left Tailed) or 1.94318028 (Rt. Tailed)
test statistic = 2.4974807451
P-value = 0.0233435494

Answer: This is a right-tailed test on the mean of the paired differences. The P-value for this hypothesis test is 0.0233, which is smaller than the significance level of 0.05, so the null hypothesis is rejected. The data provide enough evidence to conclude that the posttest scores were higher than the pretest scores. Note: the test is significant and the average for PostTest is greater than the average for PreTest, so the above conclusion can be made.

Problems related to text's Chapter 10:

11. Multiple Choice: For each of the following data sets, choose the most appropriate response from the choices below the table.

Data Set #1      Data Set #2
x     y          x     y
0.3   3790       -1    -4
0.4   3354       -2    -10
0.5   2986       2     5
0.6   2613       3     4
0.7   2277       -3    -19
0.8   1765       6     -10
0.9   1343       7     -20
1     1151       -1    -4
1.1   510        0     1

Choices for Data Set #1:
A. A strong positive linear relation exists
B. A strong negative linear relation exists
C. A curvilinear relation exists
D. No linear relation exists

Choices for Data Set #2:
A. A strong positive linear relation exists
B. A strong negative linear relation exists
C. A curvilinear relation exists
D. No linear relation exists

12. Give a real life example of two variables that are likely to be negatively correlated. Specifically explain why you believe they are negatively correlated.
Answer: There are plenty of examples of negative correlation in real life, for example the weight of a car and its MPG. As the weight of a car increases, its MPG decreases because a heavier car needs more power to run. Thus these two variables are negatively correlated.

13. To answer the following, use the given data set for lengths (in inches) and corresponding weights (in pounds) of randomly selected black bears captured in the backcountry of Colorado.

lengths (inches)   weights (pounds)
40                 65
64                 256
65                 216
49                 94
47                 86
59                 189
61                 202
49                 102

a. Construct a scatterplot for this data set in the region to the right (length as the independent variable, and weight as the dependent.)
Answer: Given on right.
[Scatter Plot: weights (pounds) versus lengths (inches), with trend line f(x) = 7.6344359627x - 262.9181509754, R² = 0.9396501908]

b. Based on the scatterplot, does it look like a linear regression model is appropriate for this data? Why or why not?
Answer: Yes, a linear model is appropriate. The points fall close to a positively sloped straight line, indicating that a linear relationship is present between these two variables, and thus a linear regression model is appropriate.

c. Add the line-of-best fit (trend line/linear regression line) to your scatterplot. Give the equation of the trend line below. Then give the slope value of the line and explain its meaning to this context.
Answer: The equation of the line is y = 7.6344x - 262.92. The slope value is 7.6344, indicating that if the length increases by 1 inch, then the weight is expected to increase by 7.6344 pounds on average.

d. Determine the value of the correlation coefficient. Explain what the value tells you about the data pairs.
Answer: 0.9694. The correlation coefficient is close to 1, so the relationship between these two variables is strong and positive.

e. Does the value of the correlation coefficient tell you there is or is not statistically significant evidence that correlation exists between the length and weight of black bears? Explain your position. (HINT: application of table A-6 is needed!)
Answer: The correlation coefficient (0.9694) is greater than the critical value from Table A-6 (0.707 for n = 8 at the 0.05 significance level), so there is statistically significant evidence that correlation exists between the length and weight of black bears.

f. Based on the above, what is the best predicted weight of a bear with a length of 45 inches?
Answer: Predicted value = 7.6344*45 - 262.92 = 80.628 pounds
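The trend line, correlation coefficient and 45-inch prediction for the black bear data in problem 13 can be reproduced from the least-squares formulas. The sketch below is an independent cross-check rather than the Excel chart's own method; the helper name `least_squares` is an illustrative assumption.

```python
from math import sqrt

# Black bear data from problem 13.
lengths = [40, 64, 65, 49, 47, 59, 61, 49]        # inches
weights = [65, 256, 216, 94, 86, 189, 202, 102]   # pounds


def least_squares(xs, ys):
    """Return (slope, intercept, r) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


slope, intercept, r = least_squares(lengths, weights)
predicted_45 = slope * 45 + intercept  # predicted weight at 45 inches

print(f"y = {slope:.4f}x + ({intercept:.2f})")
print(f"r = {r:.4f}, r squared = {r * r:.4f}")
print(f"predicted weight at 45 inches = {predicted_45:.2f} pounds")
```

With unrounded coefficients the prediction comes out near 80.63 pounds. The 80.628 in the worked answer comes from using the coefficients rounded to 7.6344 and 262.92. The chart label reports r squared (about 0.9397), whose square root is the correlation coefficient 0.9694.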
Problem 10, Excel output (t-Test: Paired Two Sample for Means):

                               PreTest    PostTest
Mean                           20.1429    22.5714
Variance                       39.1429    33.6190
Observations                   7          7
Pearson Correlation            0.9117
Hypothesized Mean Difference   0
df                             6
t Stat                         -2.4975
P(T<=t) one-tail               0.0233
t Critical one-tail            1.9432
P(T<=t) two-tail               0.0467
t Critical two-tail            2.4469
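For the paired pretest/posttest comparison in problem 10, the test statistic can be recomputed from the score pairs. This is a minimal sketch under two assumptions: the pairing of scores is reconstructed from the flattened sheet (it reproduces the reported means and differences), and the one-tailed critical value 1.9432 is taken from the worksheet instead of being computed, since Python's standard library has no t-distribution. The name `paired_t_statistic` is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Score pairs as reconstructed from the sheet (assumed pairing).
pre = [24, 11, 14, 25, 17, 28, 22]
post = [25, 18, 16, 29, 16, 29, 25]

T_CRITICAL_ONE_TAIL = 1.9432  # from the worksheet: alpha = 0.05, df = 6


def paired_t_statistic(before, after):
    """Return (mean difference, sd of differences, t statistic, df) for after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    t_stat = d_bar / (s_d / sqrt(n))
    return d_bar, s_d, t_stat, n - 1


d_bar, s_d, t_stat, df = paired_t_statistic(pre, post)
reject_h0 = t_stat > T_CRITICAL_ONE_TAIL  # right-tailed decision rule

print(f"mean difference = {d_bar:.4f}, s = {s_d:.4f}, t = {t_stat:.4f}, df = {df}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Excel's paired-test tool subtracts in the opposite order (PreTest minus PostTest), which is why its t Stat carries a negative sign while the magnitude is the same.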