
Use event relationships to fill in the blanks in the table below.

Access the applet in How a Line Works.

a. Use the slider to change the y-intercept of the line, but do not change the slope. Describe the changes that you see in the line.

b. Use the slider to change the slope of the line, but do not change the y-intercept. Describe the changes that you see in the line.

Access the applet called Exploring Correlation.

a. Move the slider in the first applet so that r ≈ .75. Now switch the sign using the button at the bottom of the applet. Describe the change in the pattern of the points.

b. Move the slider in the first applet so that r ≈ 0. Describe the pattern of points on the scatterplot.

c. Refer to part b. In the second applet labeled Correlation and the Quadrants, with r ≈ 0, count the number of points falling in each of the four quadrants of the scatterplot. Is the distribution of points in the quadrants relatively uniform, or do more points fall into certain quadrants than others?

d. Use the second applet labeled Correlation and the Quadrants and change the correlation coefficient to r ≈ -0.9. Is the distribution of points in the quadrants relatively uniform, or do more points fall into certain quadrants than others? What happens if r ≈ 0.9?

e. Use the third applet labeled Correlation and the Regression Line. Move the slider to see the relationship between the correlation coefficient r, the slope of the regression line and the direction of the relationship between x and y. Describe the relationship.

The table below shows the prices of 8 single handset cordless phones along with their overall score (on a scale of 0–100) in a consumer rating survey presented by Consumer Reports.

a. Calculate the correlation coefficient r between price and overall score. How would you describe the relationship between price and overall score?

b. Use the applet called Correlation and the Scatterplot to plot the eight data points. What is the correlation coefficient shown on the applet? Compare with the value you calculated in part a.

c. Describe the pattern that you see in the scatterplot. What unexpected relationship do you see in the data?

A USA Today Snapshot reports that among people 35 to 65 years old, nearly two thirds say they are not concerned about being forced into retirement. Suppose that we randomly select n = 15 individuals in this age category and approximate the value of p as p = .7. Let x be the number that say they are not concerned with forced retirement.

a. What is the probability distribution for x?

b. What is P(x ≤ 8)?

c. Find the probability that x exceeds 8.

d. What is the largest value of c for which P(x ≤ c) ≤ .10?
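Parts b through d can be checked numerically. The following is a minimal sketch, assuming x is binomial with n = 15 and p = .7 as the exercise states; the helper names `binom_pmf` and `binom_cdf` are illustrative and do not come from the text.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(x = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(x <= k), summed directly from the pmf."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 15, 0.7

p_at_most_8 = binom_cdf(8, n, p)      # part b: P(x <= 8)
p_more_than_8 = 1 - p_at_most_8       # part c: P(x > 8) is the complement

# part d: largest c whose cumulative probability does not exceed .10
largest_c = max(c for c in range(n + 1) if binom_cdf(c, n, p) <= 0.10)

print(round(p_at_most_8, 3), round(p_more_than_8, 3), largest_c)
```

Summing the pmf directly mirrors what a cumulative binomial table does, so the printed values should agree with table entries up to rounding.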

Although teen magazines Teen People, Hachette Filipacche, and Elle Girl folded in 2006, 70% of people in a phone-in poll said teens are still a viable market for print, but they do not want titles that talk to them like they are teens. They read more sophisticated magazines. A sample of n = 400 people is randomly selected.

a. What is the average number in the sample who said that teenagers are still a viable market for print?

b. What is the standard deviation of this number?

c. Within what range would you expect to find the number in the sample who said that there is a viable market for teenage print?

d. If only 225 in a sample of 400 people said that teenagers are still a viable market for print, would you consider this unusual? Explain. What conclusions might you draw from this sample information?
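The quantities in parts a through d follow from the binomial mean and standard deviation. This sketch assumes the number in the sample is binomial with n = 400 and p = .7, and it uses a two-standard-deviation interval for part c, which is one common convention rather than the only possible reading of the question.

```python
from math import sqrt

n, p = 400, 0.7

mean = n * p                      # part a: mu = np
sd = sqrt(n * p * (1 - p))        # part b: sigma = sqrt(npq)

# part c: interval mu +/- 2 sigma (assumed convention for "expected range")
low, high = mean - 2 * sd, mean + 2 * sd

# part d: how many standard deviations 225 lies from the mean
z_225 = (225 - mean) / sd

print(mean, round(sd, 3), (round(low, 1), round(high, 1)), round(z_225, 2))
```

A z-score far outside the range of about -3 to 3 suggests the observed count would be very unusual if p = .7 were correct.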

A manufacturer of videotapes ships them in lots of 1200 tapes per lot. Before shipment, 20 tapes are randomly selected from each lot and tested. If none is defective, the lot is shipped. If one or more are defective, every tape in the lot is tested.

a. What is the probability distribution for x, the number of defective tapes in the sample of 20?

b. What distribution can be used to approximate probabilities for the random variable x in part a?

c. What is the probability that a lot will be shipped if it contains 10 defectives? 20 defectives? 30 defectives?
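A lot is shipped only when the sample of 20 contains no defectives, so part c asks for P(x = 0). The sketch below computes it exactly under a hypergeometric model and also under a binomial approximation with p equal to the lot's defective fraction; the function names are illustrative, and the choice of the binomial as the approximating distribution is an assumption based on the sample being small relative to the lot.

```python
from math import comb

LOT_SIZE = 1200      # tapes per lot (from the exercise)
SAMPLE_SIZE = 20     # tapes tested per lot (from the exercise)

def prob_shipped_exact(defectives):
    """Hypergeometric P(x = 0): all sampled tapes come from the non-defective ones."""
    return comb(LOT_SIZE - defectives, SAMPLE_SIZE) / comb(LOT_SIZE, SAMPLE_SIZE)

def prob_shipped_binomial(defectives):
    """Binomial approximation to P(x = 0) with p = defectives / lot size."""
    p = defectives / LOT_SIZE
    return (1 - p) ** SAMPLE_SIZE

for d in (10, 20, 30):
    print(d, round(prob_shipped_exact(d), 4), round(prob_shipped_binomial(d), 4))
```

The two columns should be close because only 20 of 1200 tapes are sampled, which is why an approximation is reasonable here.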

Many employers provide workers with sick/personal days as well as vacation days. Among workers who have taken a sick day when they were not sick, 49% say that they needed a break! Suppose that a random sample of n = 12 workers who took a sick day is selected. Rounding 49% to p = .5, find the probabilities of the following events.

a. What is the probability that more than six workers say that they took a sick day because they needed a break?

b. What is the probability that fewer than five of the workers needed a break?

c. What is the probability that exactly 10 of the workers took a sick day because they needed a break?

Use the applet to find the following:

a. P(x < 6) for n = 22, p = .65

b. P(x = 8) for n = 12, p = .4

c. P(x > 14) for n = 20, p = .5

d. P(2 < x < 6) for n = 15, p = .3

e. P(x ≥ 6) for n = 50, p = .7

Repeat Exercise 6.1. Use Table 3 and fill in the probabilities below.

A normal random variable x has mean μ = 10 and standard deviation σ = 2. Find the probabilities of these x-values:

a. x > 13.5

b. x < 8.2

c. 9.4 < x < 10.6
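These three probabilities can be checked without a table. The sketch uses `statistics.NormalDist` from the Python standard library, with the mean and standard deviation given in the exercise; the variable names are illustrative.

```python
from statistics import NormalDist

x = NormalDist(mu=10, sigma=2)

p_a = 1 - x.cdf(13.5)             # part a: P(x > 13.5)
p_b = x.cdf(8.2)                  # part b: P(x < 8.2)
p_c = x.cdf(10.6) - x.cdf(9.4)    # part c: P(9.4 < x < 10.6)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Each result corresponds to standardizing with z = (x - μ)/σ and reading the area from a standard normal table.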

One method of arriving at economic forecasts is to use a consensus approach. A forecast is obtained from each of a large number of analysts, and the average of these individual forecasts is the consensus forecast. Suppose the individual 2008 January prime interest rate forecasts of economic analysts are approximately normally distributed with the mean equal to 8.5% and a standard deviation equal to 0.2%. If a single analyst is randomly selected from among this group, what is the probability that the analyst’s forecast of the prime rate will take on these values?

a. Exceed 8.75%

b. Be less than 8.375%

Consider a binomial random variable with n = 25 and p = .6. Fill in the blanks below to find some probabilities using the normal approximation.

a. Can we use the normal approximation? Calculate np = ________ and nq = ________

b. Are np and nq both greater than 5? Yes ________ No ________

c. If the answer to part b is yes, calculate μ = np = ________ and σ = √npq = ________

d. To find the probability of more than 9 successes, what values of x should be included? x = ________

e. To include the entire block of probability for the first value of x = ________, start at ________.

f. Calculate z = (x ± .5 - np)/√npq = ________

g. Calculate P(x > 9) ≈ P(z > ________) = 1 - ________ = ________.
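The steps above can be traced in a few lines of code. This is a sketch under the exercise's stated values (n = 25, p = .6); the exact binomial sum is added only as a cross-check and is not part of the fill-in steps.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.6
q = 1 - p

np_, nq = n * p, n * q            # parts a and b: both should exceed 5
mu, sigma = np_, sqrt(n * p * q)  # part c

# parts d and e: "more than 9" means x = 10, 11, ..., 25,
# so the probability block for x = 10 starts at 9.5
z = (9.5 - mu) / sigma            # part f

approx = 1 - NormalDist().cdf(z)  # part g: P(x > 9) is approximately P(z > z-value)

# cross-check: exact binomial probability of 10 or more successes
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(10, n + 1))

print(round(z, 2), round(approx, 4), round(exact, 4))
```

The continuity correction of .5 is what lets the continuous normal curve cover the whole bar for x = 10.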

Students very often ask their professors whether they will be “curving the grades.” The traditional interpretation of “curving grades” required that the grades have a normal distribution, and that the grades will be assigned in these proportions:

a. If the average “C” grade is centered at the average grade for all students, and if we assume that the grades are normally distributed, how many standard deviations on either side of the mean will constitute the “C” grades?

b. How many deviations on either side of the mean will be the cutoff points for the “B” and “D” grades?

A political analyst wishes to select a sample of n = 20 people from a population of 2000. Use the random number table to identify the people to be included in the sample.

Refer to Exercise 7.17. To find the probability that the sample mean is between 105 and 110, write down the event of interest _______. When x̅ = 105 and x̅ = 110,

Find the probability:

P(_______ < x̅ < _______) = P(_______ < z < _______)

= _______ - _______ = _______

Refer to Exercise 7.23. Plot the standard error of the mean (SE) versus the sample size n and connect the points with a smooth curve. What is the effect of increasing the sample size on the standard error?

The total amount of vegetation held by the earth’s forests is important to both ecologists and politicians because green plants absorb carbon dioxide. An underestimate of the earth’s vegetative mass, or biomass, means that much of the carbon dioxide emitted by human activities (primarily fossil-burning fuels) will not be absorbed, and a climate-altering buildup of carbon dioxide will occur. Studies indicate that the biomass for tropical woodlands, thought to be about 35 kilograms per square meter (kg/m^{2}), may in fact be too high and that tropical biomass values vary regionally, from about 5 to 55 kg/m^{2}. Suppose you measure the tropical biomass in 400 randomly selected square-meter plots.

a. Approximate σ, the standard deviation of the biomass measurements.

b. What is the probability that your sample average is within two units of the true average tropical biomass?

c. If your sample average is x̅ = 31.75, what would you conclude about the overestimation that concerns the scientists?
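One way to work through parts a to c is sketched below. It assumes the range approximation σ ≈ range/4 for part a (an approximation, not a value given in the exercise) and treats the sample mean of 400 plots as approximately normal.

```python
from math import sqrt
from statistics import NormalDist

n = 400
sigma = (55 - 5) / 4              # part a: range approximation, sigma is about range / 4 (assumed)
se = sigma / sqrt(n)              # standard error of the sample mean

std_normal = NormalDist()
# part b: P(sample mean within 2 units of the true mean)
p_within_2 = std_normal.cdf(2 / se) - std_normal.cdf(-2 / se)

# part c: z-score of the observed mean 31.75 if the true mean were 35
z_observed = (31.75 - 35) / se

print(sigma, se, round(p_within_2, 4), round(z_observed, 2))
```

A z-score several standard errors below zero would be hard to reconcile with a true mean of 35, which bears on the overestimation question in part c.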

Repeat the instructions in Exercise 7.85 when four dice are tossed.

Two balanced dice are thrown, and the average number on the two upper faces is recorded.

a. Use the values μ = 3.5 and σ = 1.71 from Exercise 7.84. What are the theoretical mean and standard deviation of the sampling distribution for x̅?

b. Use the Central Limit Theorem applet to toss a single die at least 2000 times. (Your simulation can be done quickly by using the button.) What are the mean and standard deviation of these 2000 observations? What is the shape of the histogram?

c. Compare the results of part b to the actual probability distribution shown in Figure 7.4 and the actual mean and standard deviation in part a.
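If the applet is not available, a short simulation gives a similar comparison. The sketch below is an assumption-laden stand-in for the applet: it simulates the average of two fair dice (rather than a single die), uses an arbitrary fixed seed, and runs 20,000 tosses instead of 2000 so the estimates are more stable.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(1)   # arbitrary seed so the run is repeatable

MU, SIGMA = 3.5, 1.71            # single-die values quoted in the exercise
N_DICE = 2

# part a: theoretical mean and standard deviation of the average of two dice
theoretical_mean = MU
theoretical_sd = SIGMA / sqrt(N_DICE)

# simulate many tosses of two dice and record the average of the upper faces
averages = [
    (random.randint(1, 6) + random.randint(1, 6)) / N_DICE
    for _ in range(20000)
]
simulated_mean = mean(averages)
simulated_sd = pstdev(averages)

print(theoretical_mean, round(theoretical_sd, 3))
print(round(simulated_mean, 3), round(simulated_sd, 3))
```

Changing `N_DICE` and the list comprehension to four dice would give the analogous comparison for the four-dice version of the exercise.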

What are two characteristics of the best point estimator for a population parameter?

Refer to Exercise 8.3. What effect does a larger population variance have on the margin of error?

a. n = 30, σ^{2} = .2

b. n = 30, σ^{2} = .9

c. n = 30, σ^{2} = 1.5

Refer to Exercise 8.5. What effect does an increased sample size have on the margin of error?

a. n = 50, s^{2} = 4

b. n = 500, s^{2} = 4

c. n = 5000, s^{2} = 4
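The effect of variance and sample size on the margin of error can be seen by evaluating both lists of cases. This sketch assumes the margin of error is defined as 1.96 standard errors, which is the usual 95% convention; the function name `margin_of_error` is illustrative.

```python
from math import sqrt

Z_95 = 1.96   # assumed multiplier for a 95% margin of error

def margin_of_error(n, variance):
    """1.96 * sqrt(variance / n), the margin of error for a sample mean."""
    return Z_95 * sqrt(variance / n)

# fixed n = 30, increasing variance
fixed_n = {var: margin_of_error(30, var) for var in (0.2, 0.9, 1.5)}

# fixed variance s^2 = 4, increasing sample size
fixed_var = {n: margin_of_error(n, 4) for n in (50, 500, 5000)}

for var, moe in fixed_n.items():
    print("n = 30, variance =", var, "->", round(moe, 3))
for n, moe in fixed_var.items():
    print("n =", n, ", variance = 4 ->", round(moe, 3))
```

Because n sits under a square root, a tenfold increase in sample size shrinks the margin of error by a factor of about 3.16, not 10.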

Refer to Exercise 8.7. What effect does increasing the sample size have on the margin of error?

At a time in U.S. history when there appears to be genuine concern about the number of illegal aliens living in the United States, there also appears to be concern over the number of legal immigrants allowed to move to the United States. In a recent poll that included questions about both legal and illegal immigrants to the United States, 51% of the n = 900 registered voters interviewed indicated that the U.S. should decrease the number of legal immigrants entering the United States.

a. What is a point estimate for the proportion of U.S. registered voters who feel that the United States should decrease the number of legal immigrants entering the United States? Calculate the margin of error.

b. The poll reports a margin of error of ± 3%. How was the reported margin of error calculated so that it can be applied to all of the questions in the survey?
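Both parts can be illustrated with the margin-of-error formula for a proportion. The sketch assumes a 95% multiplier of 1.96, and for part b it assumes the pollster used the conservative value p = .5, which maximizes p(1 - p); the source does not state how the reported figure was actually obtained.

```python
from math import sqrt

n = 900
p_hat = 0.51   # point estimate from the poll

# part a: margin of error using the observed proportion
moe_observed = 1.96 * sqrt(p_hat * (1 - p_hat) / n)

# part b: conservative margin of error using p = .5 (assumed method),
# which is valid for every question because p(1 - p) is largest at .5
moe_conservative = 1.96 * sqrt(0.5 * 0.5 / n)

print(round(moe_observed, 4), round(moe_conservative, 4))
```

Both values round to roughly 3 percentage points, which is consistent with the reported ± 3%.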

One of the major costs involved in planning a summer vacation is the cost of lodging. Even within a particular chain of hotels, costs can vary substantially depending on the type of room and the amenities offered. Suppose that we randomly select 50 billing statements from each of the computer databases of the Marriott, Radisson, and Wyndham hotel chains, and record the nightly room rates.

a. Describe the sampled population(s).

b. Find a point estimate for the average room rate for the Marriott hotel chain. Calculate the margin of error.

c. Find a point estimate for the average room rate for the Radisson hotel chain. Calculate the margin of error.

d. Find a point estimate for the average room rate for the Wyndham hotel chain. Calculate the margin of error.

e. Display the results of parts b, c, and d graphically, using the form shown in Figure 8.5. Use this display to compare the average room rates for the three hotel chains.

The Mars twin rovers, Spirit and Opportunity, which roamed the surface of Mars several years ago, found evidence that there was once water on Mars, raising the possibility that there was once life on the planet. Do you think that the United States should pursue a program to send humans to Mars? An opinion poll conducted by the Associated Press indicated that 49% of the 1034 adults surveyed think that we should pursue such a program.

a. Estimate the true proportion of Americans who think that the United States should pursue a program to send humans to Mars. Calculate the margin of error.

b. The question posed in part a was only one of many questions concerning our space program that were asked in the opinion poll. If the Associated Press wanted to report one sampling error that would be valid for the entire poll, what value should they report?

Find and interpret a 95% confidence interval for a population mean μ for these values:

a. n = 36, x̅ = 13.1, s^{2} = 3.42

b. n = 64, x̅ = 2.73, s^{2} = .1047
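A large-sample interval of the form x̅ ± z·s/√n can be computed as follows. This is a sketch that assumes the normal (z) critical value is appropriate because both samples have n ≥ 30; the helper `mean_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(n, xbar, variance, confidence=0.95):
    """Large-sample confidence interval for a mean: xbar +/- z * sqrt(variance / n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    half_width = z * sqrt(variance / n)
    return xbar - half_width, xbar + half_width

ci_a = mean_ci(36, 13.1, 3.42)      # part a
ci_b = mean_ci(64, 2.73, 0.1047)    # part b

print(tuple(round(v, 3) for v in ci_a))
print(tuple(round(v, 3) for v in ci_b))
```

Passing `confidence=0.90` to the same function gives the 90% multiplier of about 1.645 used in other exercises of this kind.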

Due to a variation in laboratory techniques, impurities in materials, and other unknown factors, the results of an experiment in a chemistry laboratory will not always yield the same numerical answer. In an electrolysis experiment, a class measured the amount of copper precipitated from a saturated solution of copper sulfate over a 30-minute period. The n = 30 students calculated a sample mean and standard deviation equal to .145 and .0051 mole, respectively. Find a 90% confidence interval for the mean amount of copper precipitated from the solution over a 30-minute period.

Do you own an iPod Nano or a Sony Walkman Bean? These and other brands of MP3 players are becoming more and more popular among younger Americans. An iPod survey reported that 54% of 12- to 17-year-olds, 30% of 18- to 34-year-olds, and 13% of 35- to 54-year-olds own MP3 players. Suppose that these three estimates are based on random samples of size 400, 350, and 362, respectively.

a. Construct a 95% confidence interval estimate for the proportion of 12- to 17-year-olds who own an MP3 player.

b. Construct a 95% confidence interval estimate for the proportion of 18- to 34-year-olds who own an MP3 player.

A drug was developed for reducing cholesterol levels in heart patients. The cholesterol levels before and after drug treatment were obtained for a random sample of 25 heart patients with the following results:

a. Use the sign test to determine whether or not this drug reduces the cholesterol levels of heart patients. Use α = .01.

b. Use the Wilcoxon signed-rank test to test the hypothesis in part a at the 1% level of significance. Are your conclusions the same as those in part a?

In a comparison of the prices of items at five supermarkets, six items were randomly selected and the price of each was recorded for each of the five supermarkets. The objective of the study was to see whether the data indicated differences in the levels of prices among the five supermarkets. The prices are listed in the table.

a. Does the distribution of the prices differ from one supermarket to another? Test using the Friedman Fr-test with α = .05.

b. Find the approximate p-value for the test and interpret it.

The table lists the life (in months) of service before failure of a color television circuit board for 8 television sets manufactured by firm A and 10 sets manufactured by firm B. Use the Wilcoxon rank sum test to analyze the data, and test to see whether the life of service before failure of the circuit boards differs for the circuit boards produced by the two manufacturers.

A study of the purchase decisions of three stock portfolio managers, A, B, and C, was conducted to compare the numbers of stock purchases that resulted in profits over a time period less than or equal to 1 year. One hundred randomly selected purchases were examined for each of the managers. Do the data provide evidence of differences among the rates of successful purchases for the three managers? Use the third Chi-Square Test of Independence applet.

A group of 306 people were interviewed to determine their opinion concerning a particular current U.S. foreign policy issue. At the same time, their political affiliation was recorded. Do the data in the table present sufficient evidence to indicate a dependence between party affiliation and the opinion expressed for the sampled population? Use the third Chi-Square Test of Independence applet.

In Exercise 14.13, the color distribution of M&M’S milk chocolate candies was given. Use the third Goodness-of-Fit applet to verify the results of Exercise 14.13. Do the data substantiate the percentages reported by Mars, Incorporated? Describe the nature of the differences, if there are any.

Three hundred people were surveyed, and were asked to select their preferred brand of laptop computer, given that the prices were equivalent. The results are shown in the table.

Use the first Goodness-of-Fit applet to determine if consumers have a preference for one of the three brands. If a significant difference exists, describe the difference in practical terms. Use α = .01.

Use the Chi-Square Probabilities applet to calculate the p-value for the following chi-square tests:

a. X^{2} = .81, df = 3

b. X^{2} = 25.40, df = 13

Use the Chi-Square Probabilities applet to find the rejection region for a chi-square test of specified probabilities for a goodness-of-fit test involving k categories for the following cases:

a. k = 14, α = .005

b. k = 3, α = .05

Use the Chi-Square Probabilities applet to find the value of χ^{2} with the following area α to its right:

a. α = .05, df = 15

b. α = .01, df = 11

When you choose a greeting card, do you always look for a humorous card, or does it depend on the occasion? A comparison sponsored by two of the nation’s leading manufacturers of greeting cards indicated a slight difference in the proportions of humorous designs made for three different occasions: Father’s Day, Mother’s Day, and Valentine’s Day. To test the accuracy of their comparison, random samples of 500 greeting cards purchased at a local card store in the week prior to each holiday were entered into a computer database, and the results in the table were obtained. Do the data indicate that the proportions of humorous greeting cards vary for these three holidays?

Each model year seems to introduce new colors and different hues for a wide array of vehicles, from luxury cars, to full-size or intermediate models, to compacts and sports cars, to light trucks. However, white and silver/gray continue to make the top five or six colors across all of these categories of vehicles. The top five colors and their percentage of the market share for compact/sports cars are shown in the following table.

To verify the figures, a random sample consisting of 250 compact/sports cars was taken and the color of the vehicles recorded. The sample provided the following counts for the categories given above: 60, 51, 43, 35, and 30, respectively.

a. Is any category missing in the classification? How many vehicles belong to that category?

b. Is there sufficient evidence to indicate that our percentages of compact/sports cars differ from those given? Find the approximate p-value for the test.

How would you rate yourself as a driver? According to a survey conducted by the Field Institute, most Californians think they are good drivers but have little respect for others’ driving ability. The data show the distribution of opinions according to gender for two different questions, the first rating themselves as drivers and the second rating others as drivers. Although not stated in the source, we assume that there were 100 men and 100 women in the surveyed group.

a. Is there sufficient evidence to indicate that there is a difference in the self-ratings between male and female drivers? Find the approximate p-value for the test.

b. Is there sufficient evidence to indicate that there is a difference in the ratings of other drivers between male and female drivers? Find the approximate p-value for the test.

c. Have any of the assumptions necessary for the analysis used in parts a and b been violated? What effect might this have on the validity of your conclusions?

Parents who are concerned about public school environments and curricula are turning to homeschooling in order to control the content and atmosphere of the learning environments of their children. Although employment as a public school teacher requires a bachelor’s degree in education or a subject area, the educational background of homeschool teachers is quite varied. The educational backgrounds of a sample of n = 500 parents involved in homeschooling their children in 2003 are provided in the first table that follows, along with the corresponding percentages for parents who homeschooled in 1999. The education levels for U.S. citizens in general are given in the second table.

Education Level                  % U.S. Population, 2003

High school or less              47.5

Some college                     25.3

Bachelor’s degree or higher      27.2

a. Is there a significant change in the educational backgrounds of parents who home-schooled their children in 2003 compared with 1999? Use α = .01.

b. If there is a significant change in the educational backgrounds of these parents, how would you describe that change?

c. Using the second table, can we determine if home-school teachers have the same educational backgrounds as the U.S. population in general? If not, which groups are underrepresented and which are over-represented?

Is your holiday turkey safe? A “new federal survey found that 13% of turkeys are contaminated with the salmonella bacteria responsible for 1.3 million illnesses and about 500 deaths in a year in the US.” Use the table that follows to determine if there is a significant difference in the contamination rate at three processing plants. One hundred turkeys were randomly selected from each of the processing lines at these three plants.

Is there a significant difference in the rate of salmonella contamination among these three processing plants? If there is a significant difference, describe the nature of these differences. Use α = .01.

A snapshot in USA Today indicates that there is a gap in church attendance between 20-year-olds and older Americans. Suppose that we randomly select 100 Americans in each of five age groups and record the numbers who say they attend church in a typical week.

a. Do the data indicate that the proportion of adults who attend church regularly differs depending on age? Test using α = .05.

b. If there are significant differences in part a, describe the nature of these differences by calculating the proportion of churchgoers in each age category. Where do the significant differences appear to lie?

How long do you wait to have your prescriptions filled? According to USA Today, “about 3 in 10 Americans wait more than 20 minutes to have a prescription filled.” Suppose a comparison of waiting times for pharmacies in HMOs and pharmacies in drugstores produced the following results.

a. Is there sufficient evidence to indicate that there is a difference in waiting times for pharmacies in HMOs and pharmacies in drugstores? Use α = .01.

b. If we consider only if the waiting time is more than 20 minutes, is there a significant difference in waiting times between pharmacies in HMOs and pharmacies in drugstores at the 1% level of significance?

In 2006, a new law passed in Massachusetts would require all residents to have health insurance. Low-income residents would get state subsidies to help pay insurance premiums, but everyone would pay something for health services. The plan would penalize people without any insurance and charge fees to employers who don’t provide coverage. An ABC News/Washington Post poll involving n = 1027 adults nationwide asked the question, “Would you support or oppose this plan in your state?” The data that follows is based on the results of this study.

a. Are there significant differences in the proportions of those surveyed who support, oppose, and are unsure about this plan among Democrats, Independents, and Republicans? Use α = .05.

b. If significant differences exist, describe the nature of the differences by finding the proportions of those who support, oppose, and are unsure for each of the given affiliations.

The percentages of the various colors are different for the “peanut” variety of M&M’S candies, as reported on the Mars, Incorporated website:

A 14-ounce bag of peanut M&M’S is randomly selected and contains 70 brown, 87 yellow, 64 red, 115 blue, 106 orange, and 85 green candies. Do the data substantiate the percentages reported by Mars, Incorporated? Use the appropriate test and describe the nature of the differences, if there are any.

1. Perform a test of homogeneity for each question and verify the reported p-value of the test.

2. Questions 3, 4, and 7 are concerned with the atmosphere of the library; questions 5 and 6 are concerned with the library staff; and questions 11 and 13 are concerned with the library design. How would you summarize the results of your analyses regarding these seven questions concerning the image of the library?

3. With the information given, is it possible to do any further testing concerning the proportion of favorable versus unfavorable responses for two or more questions simultaneously?

Carole Day and Del Lowenthal studied the responses of young adults in their evaluation of library services. Of the n = 200 young adults involved in the study, n_{1} = 152 were students and n_{2} = 48 were nonstudents. The table presents the percents and numbers of favorable responses for each group to seven questions in which the atmosphere, staff, and design of the library were examined.

The entry in the last column labeled P(χ^{2}) is the p-value for testing the hypothesis of no difference in the proportion of students and nonstudents who answer each question favorably. Hence, each question gives rise to a 2 × 2 contingency table.

In an investigation to determine the relationship between the degree of metal corrosion and the length of time the metal is exposed to the action of soil acids, the percentage of corrosion and exposure time were measured weekly.

The data were fitted using the quadratic model, E(y) = β_{0} + β_{1}x + β_{2}x^{2}, with the following results.

a. What percentage of the total variation is explained by the quadratic regression of y on x?

b. Is the regression on x and x^{2} significant at the α = .05 level of significance?

c. Is the linear regression coefficient significant when x^{2} is in the model?

d. Is the quadratic regression coefficient significant when x is in the model?

e. The data were fitted to a linear model without the quadratic term with the results that follow. What can you say about the contribution of the quadratic term when it is included in the model?

f. The plot of the residuals from the linear regression model in part e shows a specific pattern. What is the term in the model that seems to be missing?

Does the cost of a plane flight depend on the airline as well as the distance traveled? In Exercise 12.21, you explored the first part of this problem. The data shown in this table compare the average cost and distance traveled for two different airlines, measured for 11 heavily traveled air routes in the United States.

Use a computer package to analyze the data with a multiple regression analysis. Comment on the fit of the model, the significant variables, any interactions that exist, and any regression assumptions that may have been violated. Summarize your results in a report, including printouts and graphs if possible.

The Academic Performance Index (API), described in Exercise 12.11, is a measure of school achievement based on the results of the Stanford 9 Achievement Test. The API scores for eight elementary schools in Riverside County, California, are shown below, along with several other independent variables.

The variables are defined as

x_{1} = 1 if the school was given a financial award for meeting growth goals, 0 if not.

x_{2} = % of students who qualify for free or reduced price meals

x_{3} = % of students who are English Language Learners

x_{4} = % of teachers on emergency credentials

x_{5} = API score in 2000

The MINITAB printout for a first-order regression model is given below.

a. What is the model that has been fit to this data? What is the least-squares prediction equation?

b. How well does the model fit? Use any relevant statistics from the printout to answer this question.

c. Which, if any, of the independent variables are useful in predicting the API, given the other independent variables already in the model? Explain.

d. Use the values of R^{2} and R^{2}(adj) in the following printout to choose the best model for prediction. Would you be confident in using the chosen model for predicting the API score for next year based on a model containing similar variables? Explain.

The video-sharing site YouTube attracted 19.6 million visitors in June 2006, an almost 300% increase from January of that same year. Despite YouTube’s phenomenal growth, some analysts have questioned whether the site can transition from a free service to one that can make money. The growth trend for YouTube from August 2005 to June 2006 is given in the following table.

Linear and quadratic fitted plots for these data follow.

a. Based upon the summary statistics in the line plots, which of the two models better fits the data?

b. Write the equation for the quadratic model.

c. Use the following printout to determine if the quadratic term contributes significant information to the prediction of y, in the presence of the linear term.

An experiment was designed to compare several different types of air pollution monitors. Each monitor was set up and then exposed to different concentrations of ozone, ranging between 15 and 230 parts per million (ppm), for periods of 8–72 hours. Filters on the monitor were then analyzed, and the response of the monitor was measured. The results for one type of monitor showed a linear pattern (see Exercise 12.14). The results for another type of monitor are listed in the table.

a. Plot the data. What model would you expect to provide the best fit to the data? Write the equation of that model.

b. Use a computer software package to fit the model from part a.

c. Find the least-squares regression line relating the monitor’s response to the ozone concentration.

d. Does the model contribute significant information for the prediction of the monitor’s response based on ozone exposure? Use the appropriate p-value to make your decision.

e. Find R^{2} on the printout. What does this value tell you about the effectiveness of the multiple regression analysis?

You have a hot grill and an empty hamburger bun, but you have sworn off greasy hamburgers. Would a meatless hamburger do? The data in the table record a flavor and texture score (between 0 and 100) for 12 brands of meatless hamburgers along with the price, number of calories, amount of fat, and amount of sodium per burger. Some of these brands try to mimic the taste of meat, while others do not. The MINITAB printout shows the regression of the taste score y on the four predictor variables: price, calories, fat, and sodium.

MINITAB output for Exercise 13.12

a. Comment on the fit of the model using the statistical test for the overall fit and the coefficient of determination, R^{2}.

b. If you wanted to refit the model by eliminating one of the independent variables, which one would you eliminate? Why?

Is your overall satisfaction with your new pair of walking shoes correlated with the cost of the shoes? Satisfaction scores and prices were recorded for nine different styles and brands of men’s walking shoes, with the following results:

a. Calculate the correlation coefficient r between price and overall score. How would you describe the relationship between price and overall score?

b. Use the applet called Correlation and the Scatterplot to plot the nine data points. What is the correlation coefficient shown on the applet? Compare with the value you calculated in part a.

c. Describe the pattern that you see in the scatterplot. Are there any outliers? If so, how would you explain them?

The makers of the Lexus automobile have steadily increased their sales since their U.S. launch in 1989. However, the rate of increase changed in 1996 when Lexus introduced a line of trucks. The sales of Lexus from 1996 to 2005 are shown in the table:

a. Plot the data using a scatterplot. How would you describe the relationship between year and sales of Lexus?

b. Find the least-squares regression line relating the sales of Lexus to the year being measured.

c. Is there sufficient evidence to indicate that sales are linearly related to year? Use α = .05.

d. Predict the sales of Lexus for the year 2006 using a 95% prediction interval.

e. If they are available, examine the diagnostic plots to check the validity of the regression assumptions.

f. If you were to predict the sales of Lexus in the year 2015, what problems might arise with your prediction?

If the experimenter stays within the experimental region, when will the error in predicting a particular value of y be maximum?

In addition to increasingly large bounds on error, why should an experimenter refrain from predicting y for values of x outside the experimental region?

How many weeks can a movie run and still make a reasonable profit? The data that follow show the number of weeks in release (x) and the gross to date (y) for the top 10 movies during a recent week.

a. Plot the points in a scatterplot. Does it appear that the relationship between x and y is linear? How would you describe the direction and strength of the relationship?

b. Calculate the value of r^{2}. What percentage of the overall variation is explained by using the linear model rather than ȳ to predict the response variable y?

c. What is the regression equation? Do the data provide evidence to indicate that x and y are linearly related? Test using a 5% significance level.

d. Given the results of parts b and c, is it appropriate to use the regression line for estimation and prediction? Explain your answer.

Refer to Exercise 12.11 and data set EX1211 regarding the relationship between the Academic Performance Index (API), a measure of school achievement based on the results of the Stanford 9 Achievement test, and the percentage of students who are considered English Language Learners (ELL). The following table shows the API for eight elementary schools in Riverside County, California, along with the percentage of students at that school who are considered English Language Learners.

a. Use an appropriate program to analyze the relationship between API and ELL.

b. Explain all pertinent details of your analysis.

The number of passes completed and the total number of passing yards for Tom Brady, quarterback for the New England Patriots, were recorded for the 16 regular games in the 2006 football season. Week 6 was a bye and no data was reported.

a. What is the least-squares line relating the total passing yards to the number of pass completions for Tom Brady?

b. What proportion of the total variation is explained by the regression of total passing yards (y) on the number of pass completions (x)?

c. If they are available, examine the diagnostic plots to check the validity of the regression assumptions.

In Exercise 3.19, Consumer Reports gave the prices for the top 10 LCD high definition TVs (HDTVs) in the 30- to 40-inch category. The table below shows the 10 costs again, along with the screen size in inches.

Does the price of an HDTV depend on the size of the screen? Suppose we assume that the relationship between x and y is linear, and perform a linear regression, resulting in a value of r^{2} = .787.

a. What does the value of r^{2} tell you about the strength of the relationship between price and screen size?

b. The residual plot for this data, generated by MINITAB, is shown below. Does this plot reveal any outliers in the data set? If so, which point is the outlier?

c. Plot the values of x and y using a scatterplot. Does this plot confirm your suspicions in part b? Which HDTV does the outlier represent? Is this a faulty measurement that should be removed from the data set? Explain.

Refer to the data in Exercise 12.7. The normal probability plot and the residuals versus fitted values plots generated by MINITAB are shown here. Does it appear that any regression assumptions have been violated? Explain.

MINITAB output for Exercise 12.31

What diagnostic plot can you use to determine whether the assumption of equal variance has been violated? What should the plot look like when the variances are equal for all values of x?

How is the cost of a plane flight related to the length of the trip? The table shows the average round-trip coach airfare paid by customers of American Airlines on each of 18 heavily traveled U.S. air routes.

a. If you want to estimate the cost of a flight based on the distance traveled, which variable is the response variable and which is the independent predictor variable?

b. Assume that there is a linear relationship between cost and distance. Calculate the least-squares regression line describing cost as a linear function of distance.

c. Plot the data points and the regression line. Does it appear that the line fits the data?

d. Use the appropriate statistical tests and measures to explain the usefulness of the regression model for predicting cost.

An experiment was designed to compare several different types of air pollution monitors. The monitor was set up and then exposed to different concentrations of ozone, ranging between 15 and 230 parts per million (ppm), for periods of 8–72 hours. Filters on the monitor were then analyzed, and the amount (in micrograms) of sodium nitrate (NO_{3}) recorded by the monitor was measured. The results for one type of monitor are given in the table.

a. Find the least-squares regression line relating the monitor’s response to the ozone concentration.

b. Do the data provide sufficient evidence to indicate that there is a linear relationship between the ozone concentration and the amount of sodium nitrate detected?

c. Calculate r^{2}. What does this value tell you about the effectiveness of the linear regression analysis?

Give the equation and graph for a line with y-intercept equal to - 3 and slope equal to 1.

Give the equation and graph for a line with y-intercept equal to 3 and slope equal to -1.
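The two lines described above can be checked numerically. This is a minimal sketch, assuming the usual form y = a + bx with a as the y-intercept and b as the slope; the helper name `line` is only illustrative.

```python
def line(intercept, slope):
    """Return the function y = intercept + slope * x."""
    return lambda x: intercept + slope * x


first = line(-3, 1)   # y = -3 + x
second = line(3, -1)  # y = 3 - x

# Tabulate a few points that could be used to sketch each graph.
for x in (0, 1, 2, 3):
    print(x, first(x), second(x))
```

Each line crosses the y-axis at its intercept (x = 0), and both cross the x-axis at x = 3, which gives two convenient points for drawing each graph.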

Each year, the American Association of University Professors reports on salaries of academic professors at universities and colleges in the United States. The following data (in thousands of dollars), adapted from this report, are based on samples of n = 10 in each of three professorial ranks, for both male and female professors.

a. Identify the design used in this survey.

b. Use the appropriate analysis of variance for these data.

c. Do the data indicate that the salaries at the different ranks vary by gender?

d. If there is no interaction, determine whether there are differences in salaries by rank, and whether there are differences by gender. Discuss your results.

e. Plot the average salaries using an interaction plot. If the main effect of ranks is significant, use Tukey’s method of pairwise comparisons to determine if there are significant differences among the ranks. Use α = .01.

Refer to Exercise 11.72. The diagnostic plots for this experiment are shown below. Does it appear that any of the analysis of variance assumptions have been violated? Explain.

How satisfied are you with your current mobile-phone service provider? Surveys done by Consumer Reports indicate that there is a high level of dissatisfaction among consumers, resulting in high customer turnover rates.¹⁰ The following table shows the overall satisfaction scores, based on a maximum score of 100, for four wireless providers in four different cities.

a. What type of experimental design was used in this article? If the design used is a randomized block design, what are the blocks and what are the treatments?

b. Conduct an analysis of variance for the data.

c. Are there significant differences in the average satisfaction scores for the four wireless providers considered here?

d. Are there significant differences in the average satisfaction scores for the four cities?

In a study of starting salaries of assistant professors, five male assistant professors and five female assistant professors at each of three types of institutions granting doctoral degrees were polled and their initial starting salaries were recorded under the condition of anonymity. The results of the survey in $1000 are given in the following table.

a. What type of design was used in collecting these data?

b. Use an analysis of variance to test if there are significant differences in gender, in type of institution, and to test for a significant interaction of gender × type of institution.

c. Find a 95% confidence interval estimate for the difference in starting salaries for male assistant professors and female assistant professors. Interpret this interval in terms of a gender difference in starting salaries.

d. Use Tukey’s procedure to investigate differences in assistant professor salaries for the three types of institutions. Use α = .01.

e. Summarize the results of your analysis.

In contrast to aptitude tests, which are predictive measures of what one can accomplish with training, achievement tests tell what an individual can do at the time of the test. Mathematics achievement test scores for 400 students were found to have a mean and a variance equal to 600 and 4900, respectively. If the distribution of test scores was mound-shaped, approximately how many of the scores would fall into the interval 530 to 670? Approximately how many scores would be expected to fall into the interval 460 to 740?
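The arithmetic behind this exercise can be sketched as follows. It assumes the Empirical Rule for mound-shaped distributions (roughly 68% of measurements within one standard deviation of the mean and roughly 95% within two); the resulting counts are approximations, not exact values.

```python
import math

n = 400          # number of students
mean = 600       # mean test score
variance = 4900  # variance of test scores
sd = math.sqrt(variance)  # standard deviation = 70

# Express each interval endpoint as a number of standard deviations from the mean.
k_first = (670 - mean) / sd   # 530 to 670 is mean +/- 1 sd
k_second = (740 - mean) / sd  # 460 to 740 is mean +/- 2 sd

# Approximate proportions from the Empirical Rule (an assumption of this sketch).
empirical_rule = {1.0: 0.68, 2.0: 0.95}

count_first = round(n * empirical_rule[k_first])
count_second = round(n * empirical_rule[k_second])
print(count_first, count_second)
```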

How much sleep do you get on a typical school night? A group of 10 college students were asked to report the number of hours that they slept on the previous night with the following results:

7, 6, 7.25, 7, 8.5, 5, 8, 7, 6.75, 6

a. Find the mean and the standard deviation of the number of hours of sleep for these 10 students.

b. Calculate the z-score for the largest value (x = 8.5). Is this an unusually sleepy college student?

c. What is the most frequently reported measurement? What is the name for this measure of center?

d. Construct a box plot for the data. Does the box plot confirm your results in part b?
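The summary statistics asked for in parts a to c can be computed with Python's standard `statistics` module. This is a sketch that assumes the sample standard deviation (divisor n − 1) is intended, which is the usual convention for a sample of 10 students.

```python
import statistics

hours = [7, 6, 7.25, 7, 8.5, 5, 8, 7, 6.75, 6]

mean = statistics.mean(hours)   # sample mean
s = statistics.stdev(hours)     # sample standard deviation (divisor n - 1)
z_max = (max(hours) - mean) / s # z-score of the largest value, x = 8.5
mode = statistics.mode(hours)   # most frequently reported measurement

print(f"mean = {mean:.2f}, s = {s:.3f}, z = {z_max:.2f}, mode = {mode}")
```

A z-score smaller than 2 in absolute value is not usually regarded as unusual, which is one way to approach the question in part b.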

In the seasons that followed his 2001 record-breaking season, Barry Bonds hit 46, 45, 45, 5, and 26 homers, respectively (www.espn.com). Two boxplots, one of Bonds’s homers through 2001, and a second including the years 2002–2006, follow.

The statistics used to construct these boxplots are given in the table.

a. Calculate the upper fences for both of these boxplots.

b. Can you explain why the record number of homers is an outlier in the 2001 boxplot, but not in the 2006 boxplot?

Here are a few facts reported as Snapshots in USA Today.

- The median hourly pay for salespeople in the building supply industry is $10.41.
- Sixty-nine percent of U.S. workers ages 16 and older work at least 40 hours per week.
- Seventy-five percent of all Associate Professors of Mathematics in the U.S. earn $91,823 or less.

Identify the variable x being measured, and any percentiles you can determine from this information.

Refer to Data Set #1 in the How Extreme Values Affect the Mean and Median applet. This applet loads with a dotplot for the following n = 5 observations: 2, 5, 6, 9, 11.

a. What are the mean and median for this data set?

b. Use your mouse to change the value x = 11 (the moveable green dot) to x = 13. What are the mean and median for the new data set?

c. Use your mouse to move the green dot to x = 33. When the largest value is extremely large compared to the other observations, which is larger, the mean or the median?

d. What effect does an extremely large value have on the mean? What effect does it have on the median?
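Without the applet, the effect of moving the largest observation can be reproduced in a few lines. This sketch uses the five observations given for Data Set #1 and replaces the largest value with 11, 13, and 33 in turn; the variable names are illustrative.

```python
import statistics

fixed = [2, 5, 6, 9]  # the four observations that do not move
results = {}

for largest in (11, 13, 33):
    data = fixed + [largest]
    results[largest] = (statistics.mean(data), statistics.median(data))
    print(largest, results[largest])
```

The mean follows the moving value upward, while the median stays at the middle observation as long as the moved value remains the largest.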

Refer to Data Set #2 in the How Extreme Values Affect the Mean and Median applet. This applet loads with a dotplot for the following n = 5 observations: 2, 5, 10, 11, 12.

a. Use your mouse to move the value x = 12 to the left until it is smaller than the value x = 11.

b. As the value of x gets smaller, what happens to the sample mean?

c. As the value of x gets smaller, at what point does the value of the median finally change?

d. As you move the green dot, what are the largest and smallest possible values for the median?

Refer to Data Set #3 in the How Extreme Values Affect the Mean and Median applet. This applet loads with a dotplot for the following n = 5 observations: 27, 28, 32, 34, 37.

a. What are the mean and median for this data set?

b. Use your mouse to change the value x = 27 (the moveable green dot) to x = 25. What are the mean and median for the new data set?

c. Use your mouse to move the green dot to x = 5. When the smallest value is extremely small compared to the other observations, which is larger, the mean or the median?

d. At what value of x does the mean equal the median?

e. What are the smallest and largest possible values for the median?

f. What effect does an extremely small value have on the mean? What effect does it have on the median?

The price of living in the United States has increased dramatically in the past decade, as demonstrated by the consumer price indexes (CPIs) for housing and transportation. These CPIs are listed in the table for the years 1996 through the first five months of 2007.

a. Create side-by-side comparative bar charts to describe the CPIs over time.

b. Draw two line charts on the same set of axes to describe the CPIs over time.

c. What conclusions can you draw using the two graphs in parts a and b? Which is the most effective?

Charitable organizations count on support from both private donations and other sources. Here are the sources of income in a recent year for several well-known charitable organizations in the United States.

a. Construct a stacked bar chart to display the sources of income given in the table.

b. Construct two comparative pie charts to display the sources of income given in the table.

c. Write a short paragraph summarizing the information that can be gained by looking at these graphs. Which of the two types of comparative graphs is more effective?

Below you will find a simple set of bivariate data. Fill in the blanks to find the correlation coefficient.

Use the information from Exercise 3.9 and find the regression line.
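The data table for this exercise is not reproduced here, so the sketch below uses a small hypothetical data set purely to illustrate the calculation. It assumes the standard formulas r = S_xy / sqrt(S_xx S_yy), slope b = S_xy / S_xx, and intercept a = ȳ − b x̄; the function name and the data values are not from the exercise.

```python
import math


def correlation_and_line(x, y):
    """Return (r, a, b) for the least-squares line y = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = s_xy / math.sqrt(s_xx * s_yy)  # correlation coefficient
    b = s_xy / s_xx                    # slope
    a = y_bar - b * x_bar              # y-intercept
    return r, a, b


# Hypothetical bivariate data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r, a, b = correlation_and_line(x, y)
print(f"r = {r:.4f}, line: y = {a:.2f} + {b:.2f}x")
```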

The makers of the Lexus automobile have steadily increased their sales since their U.S. launch in 1989. However, the rate of increase changed in 1996 when Lexus introduced a line of trucks. The sales of Lexus from 1996 to 2005 are shown in the table.

a. Plot the data using a scatterplot. How would you describe the relationship between year and sales of Lexus?

b. Find the least-squares regression line relating the sales of Lexus to the year being measured.

c. If you were to predict the sales of Lexus in the year 2015, what problems might arise with your prediction?

In Exercise 2.12, Consumer Reports gave the prices for the top 10 LCD high definition TVs (HDTVs) in the 30- to 40-inch category. Does the price of an LCD TV depend on the size of the screen? The table below shows the 10 costs again, along with the screen size.

a. Which of the two variables (price and size) is the independent variable, and which is the dependent variable?

b. Construct a scatterplot for the data. Does the relationship appear to be linear?

Refer to Exercise 3.19.

Suppose we assume that the relationship between x and y is linear.

a. Find the correlation coefficient, r. What does this value tell you about the strength and direction of the relationship between size and price?

b. What is the equation of the regression line used to predict the price of the TV based on the size of the screen?

c. The Sony Corporation is introducing a new 37" LCD TV. What would you predict its price to be?

d. Would it be reasonable to try to predict the price of a 45" LCD TV? Explain.

Who are the men and women who serve in our armed forces? Are they male or female, officers or enlisted? What is their ethnic origin and their average age? An article in Time magazine provided some insight into the demographics of the U.S. armed forces.⁹ Two of the bar charts are shown below.

a. What variables have been measured in this study? Are the variables qualitative or quantitative?

b. Describe the population of interest. Do these data represent a population or a sample drawn from the population?

c. What type of graphical presentation has been used? What other type could have been used?

d. How would you describe the similarities and differences in the age distributions of enlisted persons and officers?

e. How would you describe the similarities and differences in the age distributions of personnel in the U.S. Army and the Marine Corps?

The number of passes completed and the total number of passing yards was recorded for Brett Favre for each of the 16 regular season games in the fall of 2006.

a. Draw a scatterplot to describe the relationship between number of completions and total passing yards for Brett Favre.

b. Describe the plot in part a. Do you see any outliers? Do the rest of the points seem to form a pattern?

c. Calculate the correlation coefficient, r, between the number of completions and total passing yards.

d. What is the regression line for predicting total number of passing yards y based on the total number of completions x?

e. If Brett Favre had 20 pass completions in his next game, what would you predict his total number of passing yards to be?

A survey was conducted prior to the 2004 presidential election to explore the relationship between a person’s religious fervor and their choice of a political candidate. Voters were asked how often they attended church and which of the two major presidential candidates (George W. Bush or his Democratic opponent) they would favor in the 2004 election. The results are shown below.

a. What variables have been measured in this survey? Are they qualitative or quantitative?

b. Draw side-by-side comparative bar charts to describe the percentages favoring the two candidates, categorized by church attendance.

c. Draw two line charts on the same set of axes to describe the same percentages for the two candidates.

d. What conclusions can you draw using the two graphs in parts b and c? Which is more effective?

The number of passengers x (in millions) and the revenue y (in billions of dollars) for the top nine U.S. airlines in a recent year are given in the following table.

a. Construct a scatterplot for the data.

b. Describe the form, direction, and strength of the pattern in the scatterplot.

You have two groups of distinctly different items, 10 in the first group and 8 in the second. If you select one item from each group, how many different pairs can you form?

You have three groups of distinctly different items, four in the first group, seven in the second, and three in the third. If you select one item from each group, how many different triplets can you form?

Your family vacation involves a cross-country air flight, a rental car, and a hotel stay in Boston. If you can choose from four major air carriers, five car rental agencies, and three major hotel chains, how many options are available for your vacation accommodations?

A French restaurant in Riverside, California, offers a special summer menu in which, for a fixed dinner cost, you can choose from one of two salads, one of two entrees, and one of two desserts. How many different dinners are available?
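The four counting exercises above all rest on the same multiplication rule: when one item is chosen from each group, the number of possible combinations is the product of the group sizes. A minimal sketch, with an illustrative helper name:

```python
import math


def count_choices(*group_sizes):
    """Number of ways to pick one item from each group (multiplication rule)."""
    return math.prod(group_sizes)


pairs = count_choices(10, 8)        # two groups of 10 and 8 items
triplets = count_choices(4, 7, 3)   # three groups of 4, 7, and 3 items
vacations = count_choices(4, 5, 3)  # air carriers, car rental agencies, hotel chains
dinners = count_choices(2, 2, 2)    # salads, entrees, desserts

print(pairs, triplets, vacations, dinners)
```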

Probability played a role in the rigging of the April 24, 1980, Pennsylvania state lottery. To determine each digit of the three-digit winning number, each of the numbers 0, 1, 2, . . . , 9 is written on a Ping-Pong ball, the 10 balls are blown into a compartment, and the number selected for the digit is the one on the ball that floats to the top of the machine. To alter the odds, the conspirators injected a liquid into all balls used in the game except those numbered 4 and 6, making it almost certain that the lighter balls would be selected and determine the digits in the winning number. They then proceeded to buy lottery tickets bearing the potential winning numbers. How many potential winning numbers were there (666 was the eventual winner)?
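One way to see the count is to enumerate the outcomes directly. This sketch assumes that, after the rigging, every digit of the winning number is either 4 or 6, as the description suggests.

```python
from itertools import product

light_balls = "46"  # the only balls not injected with liquid

# Every three-digit number whose digits all come from the light balls.
potential_winners = ["".join(digits) for digits in product(light_balls, repeat=3)]

print(len(potential_winners), potential_winners)
```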

Refer to Exercise 4.117.

Hours after the rigging of the Pennsylvania state lottery was announced on September 19, 1980, Connecticut state lottery officials were stunned to learn that their winning number for the day was 666.

a. All evidence indicates that the Connecticut selection of 666 was pure chance. What is the probability that a 666 would be drawn in Connecticut, given that a 666 had been selected in the April 24, 1980, Pennsylvania lottery?

b. What is the probability of drawing a 666 in the April 24, 1980, Pennsylvania lottery (remember, this drawing was rigged) and a 666 on the September 19, 1980, Connecticut lottery?
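The reasoning for both parts can be sketched with exact fractions. Two assumptions are made here that go slightly beyond the exercise text: the two drawings are independent, and in the rigged Pennsylvania drawing the eight numbers made up of 4s and 6s are treated as equally likely with no other outcome possible.

```python
from fractions import Fraction

# Fair three-digit drawing in Connecticut: 1000 equally likely numbers.
p_connecticut = Fraction(1, 1000)

# Rigged Pennsylvania drawing: assumed to be 8 equally likely numbers.
p_pennsylvania = Fraction(1, 8)

# Part a: under independence, conditioning on Pennsylvania changes nothing.
p_ct_given_pa = p_connecticut

# Part b: probability that both drawings produce 666.
p_both = p_pennsylvania * p_connecticut

print(p_ct_given_pa, p_both)
```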

The previous question gave the results of the Gallup poll where 48% of a sample of 1,014 adult Americans reported drinking at least one glass of soda pop on a typical day. Use the One Proportion applet, repeatedly testing possible values for π, to determine plausible values and construct a 99% confidence interval for π.

What are the observational/experimental units?

Use the Two Proportions applet to carry out a test of significance.

Let’s use our 3S strategy to help us investigate how much evidence the sample data provide to support our conjecture that Vitamin C prevents colds.
