Question:
7.22 Using the z table in Appendix B, calculate the following percentages for a z score of -0.08:
a. Above this z score
b. Below this z score
c. At least as extreme as this z score

7.32 If the cutoffs for a z test are -2.58 and 2.58, determine whether you would reject or fail to reject the null hypothesis in each of the following cases:
a. z = -0.94
b. z = 2.12
c. A z score for which 49.6% of the data fall between z and the mean

7.36 Assume that the following set of data represents the responses of 10 participants to three similar statements. The participants rated their agreement with each statement on a scale from 1 to 7.
a. There is a piece of dirty data in this data set. Identify it and explain why it is dirty.
b. Assume that you have decided to throw out the piece of dirty data you identified in part (a) and replace it with the mean for that variable. What is the new data point?
c. Assume that you have decided to throw out the piece of dirty data you identified in part (a) and replace it with the mean of that participant's responses. What is the new data point?

7.40 Height and the z statistic: Imagine a class of thirty-three 15-year-old girls with an average height of 62.6 inches. Remember, μ = 63.8 inches and σ = 2.66 inches.
a. Calculate the z statistic.
b. How does this sample of girls compare to the distribution of sample means?
c. What is the percentile rank for this sample?

7.58 You have just conducted a study testing how well two independent variables, daily sugar intake (as assessed by a 25-item eating habits scale) and physical activity (as assessed by a 20-item daily physical activity scale), predicted the dependent variable of blood sugar levels. There were only 17 participants to start with, and 3 of them dropped out before having their blood sugar levels assessed. In addition, 2 participants left one item blank on the physical activity scale, and 4 other participants left most of the data on the eating habits scale blank.
At their debriefing interview, they said they just couldn't estimate food intake with any accuracy.
a. What will you do with the data of the 3 participants who dropped out just before having their blood sugar levels assessed?
b. What are your options with regard to the data from the 2 participants who left one item blank on the physical activity scale?
c. What are your options with regard to the data from the 4 participants who did not respond to most of the items on the eating habits scale?
d. Do you recommend using these data at all? If so, how?

8.22 In 2008, the Gallup poll asked people whether or not they were suspicious of steroid use among Olympic athletes. Thirty-five percent of respondents indicated suspicion when they saw an athlete break a track-and-field record, with a 4% margin of error. Calculate an interval estimate.

8.46 Cheating with hypothesis testing: Unsavory researchers know that one can cheat with hypothesis testing. That is, they know that a researcher can stack the deck in her or his favor, making it easier to reject the null hypothesis.
a. If you wanted to make it easier to reject the null hypothesis, what are three specific things you could do?
b. Would it change the actual difference between the samples? Why is this a potential problem with hypothesis testing?

8.48 Confidence intervals, effect sizes, and tennis serves: Let's assume the average speed of a serve in men's tennis is around 135 mph, with a standard deviation of 6.5 mph. Because these statistics are calculated over many years and many players, we will treat them as population parameters. We develop a new training method that will increase arm strength, the force of the tennis swing, and the speed of the serve, we hope. We recruit 9 professional tennis players to use our method. After 6 months, we test the speed of their serves and compute an average of 138 mph.
a. Using a 95% confidence interval, test the hypothesis that our method makes a difference.
b.
Compute the effect size and describe its strength.
c. Calculate statistical power using an alpha of 0.05, or 5%, and a one-tailed test.
d. Calculate statistical power using an alpha of 0.10, or 10%, and a one-tailed test.
e. Explain how power is affected by alpha in the calculations in parts (c) and (d).

7.22 Using the z table in Appendix B, calculate the following percentages for a z score of -0.08:
a. Above this z score: P(z > -0.08) = 0.5319, or 53.19%
b. Below this z score: P(z < -0.08) = 0.4681, or 46.81%
c. At least as extreme as this z score: P(z ≥ 0.08) + P(z ≤ -0.08) = 2 × 0.4681 = 0.9362, or 93.62%

7.32 If the cutoffs for a z test are -2.58 and 2.58, determine whether you would reject or fail to reject the null hypothesis in each of the following cases:
a. z = -0.94: Since -2.58 < -0.94 < 2.58, we fail to reject the null hypothesis.
b. z = 2.12: Since -2.58 < 2.12 < 2.58, we fail to reject the null hypothesis.
c. A z score for which 49.6% of the data fall between z and the mean: In the z table, a proportion of 0.4960 between the mean and z corresponds to z = 2.65 (or -2.65). Since 2.65 > 2.58, we reject the null hypothesis.

7.36 Assume that the following set of data represents the responses of 10 participants to three similar statements. The participants rated their agreement with each statement on a scale from 1 to 7.
a. There is a piece of dirty data in this data set. Identify it and explain why it is dirty.
b. Assume that you have decided to throw out the piece of dirty data you identified in part (a) and replace it with the mean for that variable. What is the new data point?
c. Assume that you have decided to throw out the piece of dirty data you identified in part (a) and replace it with the mean of that participant's responses. What is the new data point?
a) The dirty data are any values greater than 7: because the scale runs from 1 to 7, a response above 7 is impossible.
b) The dirty data points are the out-of-range values: 8, 9, or 10.
c) The new data point is the mean of the new data, 4.3, which rounds to 4 on the whole-number scale.

7.40 Height and the z statistic: Imagine a class of thirty-three 15-year-old girls with an average height of 62.6 inches. Remember, μ = 63.8 inches and σ = 2.66 inches.
a. Calculate the z statistic.
b. How does this sample of girls compare to the distribution of sample means?
c. What is the percentile rank for this sample?
a) z = (sample mean - population mean) / (population SD / sqrt(sample size))
z = (62.6 - 63.8) / (2.66 / sqrt(33)) = -1.2 / 0.463 = -2.59
b) In the z table, z = -2.59 corresponds to 0.4952 between the mean and z, so 0.5000 - 0.4952 = 0.0048 lies below it. This sample mean is in the bottom 0.48% of the distribution of sample means.
c) The percentile rank is 0.48 (the 0.48th percentile).

7.58 You have just conducted a study testing how well two independent variables, daily sugar intake (as assessed by a 25-item eating habits scale) and physical activity (as assessed by a 20-item daily physical activity scale), predicted the dependent variable of blood sugar levels. There were only 17 participants to start with, and 3 of them dropped out before having their blood sugar levels assessed. In addition, 2 participants left one item blank on the physical activity scale, and 4 other participants left most of the data on the eating habits scale blank. At their debriefing interview, they said they just couldn't estimate food intake with any accuracy.
a. What will you do with the data of the 3 participants who dropped out just before having their blood sugar levels assessed?
b. What are your options with regard to the data from the 2 participants who left one item blank on the physical activity scale?
c. What are your options with regard to the data from the 4 participants who did not respond to most of the items on the eating habits scale?
d.
Do you recommend using these data at all? If so, how?
a. The 3 participants left before their blood sugar levels were tested, so we have no values for them on the dependent variable. One option is to fill the 3 missing values with the average of the remaining 14 participants. With only 17 observations, however, this gains nothing: imputed values are not independent observations, so they add no error degrees of freedom, and the analysis rests on the same 14 measured participants whether we impute or omit. It is simpler to omit these 3 participants.
b. These 2 participants left only one of the 20 items blank. Responses to a given item tend to be similar across participants, so we can replace each blank with the average of the other participants' responses to that item and keep these 2 participants in the analysis without a meaningful loss of error degrees of freedom.
c. The 4 participants did not answer most of the eating habits questions, so it is impractical to fill in all of those values with averages from the other participants. Doing so would make much of the data depend on other participants' responses and reduce the effective error degrees of freedom. We should drop these observations and proceed with the remaining ones.
d. Running this regression is not advisable because it will lead to very poor results. The more the data depend on imputed values, the fewer the effective error degrees of freedom and the less reliable the tests on the beta coefficients. If we must proceed, we can replace missing values with averages wherever they occur, but the results will be highly unreliable.

8.22 In 2008, the Gallup poll asked people whether or not they were suspicious of steroid use among Olympic athletes.
Thirty-five percent of respondents indicated suspicion when they saw an athlete break a track-and-field record, with a 4% margin of error. Calculate an interval estimate.
The interval estimate is the point estimate plus and minus the margin of error: 35% - 4% = 31% and 35% + 4% = 39%, so the interval runs from 31% to 39%.

8.46 Cheating with hypothesis testing: Unsavory researchers know that one can cheat with hypothesis testing. That is, they know that a researcher can stack the deck in her or his favor, making it easier to reject the null hypothesis.
a. If you wanted to make it easier to reject the null hypothesis, what are three specific things you could do?
b. Would it change the actual difference between the samples? Why is this a potential problem with hypothesis testing?
a) To make it easier to reject the null hypothesis, you could: (1) set a higher alpha level (for example, 0.10 instead of 0.05); (2) set the hypothesized mean unrealistically high or low, so that almost any sample differs from it; (3) choose a sampling method that favors the expected result instead of proper random sampling.
b) No. None of these choices changes the actual difference between the samples; they only make that difference more likely to be declared statistically significant. That is the problem: the conclusion can be manufactured by the researcher's choices, so proper random sampling and conventional criteria are needed to reach a sound conclusion.

8.48 Confidence intervals, effect sizes, and tennis serves: Let's assume the average speed of a serve in men's tennis is around 135 mph, with a standard deviation of 6.5 mph. Because these statistics are calculated over many years and many players, we will treat them as population parameters. We develop a new training method that will increase arm strength, the force of the tennis swing, and the speed of the serve, we hope. We recruit 9 professional tennis players to use our method. After 6 months, we test the speed of their serves and compute an average of 138 mph.
a. Using a 95% confidence interval, test the hypothesis that our method makes a difference.
b. Compute the effect size and describe its strength.
c. Calculate statistical power using an alpha of 0.05, or 5%, and a one-tailed test.
d. Calculate statistical power using an alpha of 0.10, or 10%, and a one-tailed test.
e.
Explain how power is affected by alpha in the calculations in parts (c) and (d).

Given:
Population mean = 135 mph
Population standard deviation = 6.5 mph
Sample mean = 138 mph
Sample size n = 9 professional tennis players
Standard error = 6.5/sqrt(9) = 6.5/3 = 2.1667 mph

Solution (a)
Because the population standard deviation is treated as known, the interval uses the z distribution rather than the t distribution. For a 95% confidence interval, alpha = 0.05, alpha/2 = 0.025, and the critical z value is 1.96.
95% confidence interval = sample mean ± z × standard error
= 138 ± 1.96 × (6.5/sqrt(9))
= 138 ± 1.96 × 2.1667
= 138 ± 4.247
= 133.75 to 142.25
Lower limit = 133.75 mph
Upper limit = 142.25 mph
We are 95% confident that the true population mean for players using the method lies between 133.75 and 142.25 mph. Because this interval includes the population mean of 135 mph, we fail to reject the null hypothesis: the data do not show that the method makes a difference.

Solution (b)
The raw effect is the difference between the sample mean and the value from the null hypothesis: 138 - 135 = 3 mph. Expressed in standard deviation units, Cohen's d = (138 - 135)/6.5 = 0.46, which is close to a medium effect (0.5) by Cohen's conventions.

Solution (c)
The power of the test is the probability of rejecting the null hypothesis (that is, of avoiding a Type II error), assuming that the true population mean equals the alternative value, 138 mph.
Null hypothesis H0: μ = 135
Alternative hypothesis Ha: μ > 135
Level of significance = 0.05, one-tailed
We make the following assumptions: the sampling distribution of the mean is normally distributed (with n = 9 the central limit theorem alone does not guarantee this, so we assume serve speeds are themselves approximately normal), its mean under the alternative is 138 mph, and its standard error is 6.5/sqrt(9) = 2.1667 mph.
Step 1: Find the cutoff sample mean under the null hypothesis. The critical z value is 1.645, so the cutoff is 135 + 1.645 × 2.1667 = 138.564 mph.
Step 2: Convert the cutoff to a z score in the distribution centered on 138: z = (138.564 - 138)/2.1667 = 0.26.
Step 3: Use the standard normal table to conclude that P(Z > 0.26) = 1 - 0.6026 = 0.3974.
The power of the test is about 0.40: if the true mean were 138 mph, we would correctly reject the null hypothesis about 40% of the time. The probability of a Type II error is 1 - 0.3974 = 0.6026.

Solution (d)
Level of significance = 0.10, one-tailed
Step 1: The critical z value is 1.28, so the cutoff is 135 + 1.28 × 2.1667 = 137.773 mph.
Step 2: z = (137.773 - 138)/2.1667 = -0.10.
Step 3: P(Z > -0.10) = 0.5398.
The power of the test is about 0.54, and the probability of a Type II error is 1 - 0.5398 = 0.4602.

Solution (e)
Power rises from about 40% to about 54% when alpha is raised from 0.05 to 0.10. A larger alpha moves the cutoff closer to the null hypothesis mean, so more of the distribution centered on 138 mph falls beyond it. The price is a higher probability of a Type I error.

Conclusion: Fail to reject the null hypothesis; the evidence does not show that the mean serve speed differs from 135 mph.
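
The arithmetic in problem 8.48 can be checked numerically. The following is a minimal sketch, not the textbook's method: it assumes the z distribution applies because the population standard deviation (6.5 mph) is treated as known, and it uses Python's standard-library statistics.NormalDist instead of a printed z table, so results may differ from table-based answers in the third decimal place. The function names (z_confidence_interval, cohens_d, one_tailed_power) are illustrative names introduced here, not part of the original exercise.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for a mean when sigma is known."""
    se = sigma / sqrt(n)  # standard error of the mean
    z_crit = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    return sample_mean - z_crit * se, sample_mean + z_crit * se


def cohens_d(sample_mean, mu, sigma):
    """Effect size: mean difference in population standard deviation units."""
    return (sample_mean - mu) / sigma


def one_tailed_power(mu_null, mu_alt, sigma, n, alpha):
    """Power of an upper-tailed z test if the true mean is mu_alt."""
    se = sigma / sqrt(n)
    # Cutoff sample mean beyond which the null hypothesis is rejected.
    cutoff = mu_null + STD_NORMAL.inv_cdf(1 - alpha) * se
    # Probability that a sample mean drawn around mu_alt exceeds the cutoff.
    return 1 - NormalDist(mu_alt, se).cdf(cutoff)


if __name__ == "__main__":
    # Values taken from problem 8.48.
    mu, sigma, n, m = 135, 6.5, 9, 138
    low, high = z_confidence_interval(m, sigma, n)
    print(f"95% CI: {low:.2f} to {high:.2f} mph")
    print(f"Cohen's d: {cohens_d(m, mu, sigma):.2f}")
    for alpha in (0.05, 0.10):
        power = one_tailed_power(mu, m, sigma, n, alpha)
        print(f"alpha = {alpha:.2f}: power = {power:.3f}")
```

Run as a script, this prints the interval, the effect size, and the power at each alpha, which makes it easy to see that the interval contains 135 mph and that power is higher at alpha = 0.10 than at alpha = 0.05.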