Each year, more than 2 million people in the United States become infected with bacteria that are resistant to antibiotics. In particular, the Centers for Disease Control and Prevention have launched studies of drug-resistant gonorrhea (CDC.gov). Of 142 cases tested in Alabama, 9 were found to be drug-resistant. Of 268 cases tested in Texas, 5 were found to be drug-resistant. Do these data suggest a statistically significant difference between the proportions of drug-resistant cases in the two states? Use a .02 level of significance. What is the p-value, and what is your conclusion?
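One way to set up this comparison is a pooled two-proportion z test. The sketch below assumes a two-tailed test (H0: p1 = p2) and the usual large-sample normal approximation; the variable names are illustrative and not part of the exercise.

```python
import math

# Sample counts taken from the exercise text.
x1, n1 = 9, 142    # Alabama: drug-resistant cases, cases tested
x2, n2 = 5, 268    # Texas: drug-resistant cases, cases tested

p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)   # pooled estimate of p under H0: p1 = p2
std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / std_err

# Two-tailed p-value from the standard normal: 2 * P(Z > |z|) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

alpha = 0.02
print(f"p1 = {p1:.4f}, p2 = {p2:.4f}, z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

Because the p-value lands close to the .02 threshold, the conclusion is sensitive to rounding in hand calculations, so it is worth carrying several decimal places.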

Amazon.com is testing the use of drones to deliver packages for same-day delivery. In order to quote narrow time windows, the variability in delivery times must be sufficiently small. Consider a sample of 24 drone deliveries with a sample variance of s^{2} = .81.

a. Construct a 90% confidence interval estimate of the population variance for the drone delivery time.

b. Construct a 90% confidence interval estimate of the population standard deviation.
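Parts (a) and (b) both rest on the chi-square interval (n − 1)s²/χ²(α/2) ≤ σ² ≤ (n − 1)s²/χ²(1 − α/2). The sketch below assumes a normal population and, to stay within the Python standard library, uses two hypothetical helpers (`chi2_sf` and `chi2_upper_quantile`) in place of a chi-square table; a statistics package would normally supply these values.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(X > x) for a chi-square variable with df degrees of freedom (x > 0)."""
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z).
    term = total = 1.0 / a
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return 1.0 - total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_upper_quantile(area, df):
    """Value x with P(X > x) = area, found by bisection (chi2_sf decreases in x)."""
    lo, hi = 1e-9, 4.0 * df + 50.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

n, s2, conf = 24, 0.81, 0.90   # values from the exercise
df = n - 1
alpha = 1 - conf

var_lo = df * s2 / chi2_upper_quantile(alpha / 2, df)       # divide by the larger chi-square value
var_hi = df * s2 / chi2_upper_quantile(1 - alpha / 2, df)   # divide by the smaller chi-square value
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"90% CI for the variance: ({var_lo:.3f}, {var_hi:.3f})")
print(f"90% CI for the standard deviation: ({sd_lo:.3f}, {sd_hi:.3f})")
```

The interval for the standard deviation is simply the square root of each endpoint of the variance interval.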

In 2018, Mike Krzyzewski and John Calipari topped the list of highest-paid college basketball coaches (Sports Illustrated website, https://www.si.com/college-basketball/2018/03/01/highest-paid-college-basketball-coaches-salaries-mike-krzyewski-john-calipari). The sample below shows the head basketball coach’s salary for a sample of 10 schools playing NCAA Division I basketball. Salary data are in millions of dollars.

a. Use the sample mean for the 10 schools to estimate the population mean annual salary for head basketball coaches at colleges and universities playing NCAA Division I basketball.

b. Use the data to estimate the population standard deviation for the annual salary for head basketball coaches.

c. What is the 95% confidence interval for the population variance?

d. What is the 95% confidence interval for the population standard deviation?

In 2017, Americans spent a record-high $9.1 billion on Halloween-related purchases (The Balance website, https://www.thebalance.com/halloween-spending-statistics-facts-and-trends-3305716). Sample data showing the amount, in dollars, 16 adults spent on a Halloween costume are as follows.

a. What is the estimate of the population mean amount adults spend on a Halloween costume?

b. What is the sample standard deviation?

c. Provide a 95% confidence interval estimate of the population standard deviation for the amount adults spend on a Halloween costume.

The competitive advantage of small American factories such as Tolerance Contract Manufacturing lies in their ability to produce parts with highly narrow requirements, or tolerances, that are typical in the aerospace industry. Consider a product with specifications that call for maximum variance in the lengths of the parts of .0004. Suppose the sample variance for 30 parts turns out to be s^{2} = .0005. Use α = .05 to test whether the population variance specification is being violated.

In 2016, the Graduate Management Admission Council reported that the variance in GMAT scores was 14,660. At a recent summit, a group of economics professors met to discuss the GMAT performance of undergraduate students majoring in economics. Some expected the variability in GMAT scores achieved by undergraduate economics students to be greater than the variability in GMAT scores of the general population of GMAT takers. However, others took the opposite view. The file EconGMAT contains GMAT scores for 51 randomly selected undergraduate students majoring in economics.

a. Compute the mean, variance, and standard deviation of the GMAT scores for the 51 observations.

b. Develop hypotheses to test whether the sample data indicate that the variance in GMAT scores for undergraduate students majoring in economics differs from the general population of GMAT takers.

c. Use α = .05 to conduct the hypothesis test formulated in part (b). What is your conclusion?

Barron’s has collected data on the top 1000 financial advisers. Merrill Lynch and Morgan Stanley have many of their advisers on this list. A sample of 16 of the Merrill Lynch advisers and 10 of the Morgan Stanley advisers showed that the advisers managed many very large accounts with a large variance in the total amount of funds managed. The standard deviation of the amount managed by the Merrill Lynch advisers was s_{1} = $587 million. The standard deviation of the amount managed by the Morgan Stanley advisers was s_{2} = $489 million. Conduct a hypothesis test at α = .10 to determine if there is a significant difference in the population variances for the amounts managed by the two companies. What is your conclusion about the variability in the amount of funds managed by advisers from the two firms?

OrderUp is a service that delivers food that its customers order online from participating restaurants. OrderUp claims consistent delivery times for its deliveries. A sample of 22 meal deliveries shows a sample variance of 1.5. Test to determine whether H_{0}: σ^{2} ≤ 1 can be rejected. Use α = .10.

A sample of 9 days over the past six months showed that Philip Sherman, DDS, treated the following numbers of patients at his dental clinic: 22, 25, 20, 18, 15, 22, 24, 19, and 26. If the number of patients seen per day is normally distributed, would an analysis of these sample data reject the hypothesis that the variance in the number of patients seen per day is equal to 10? Use a .10 level of significance. What is your conclusion?
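Since the nine daily counts are given in full, the test statistic χ² = (n − 1)s²/σ₀² can be computed directly. The sketch below assumes a two-tailed test of H0: σ² = 10; the two critical values are standard chi-square table entries for 8 degrees of freedom with .05 in each tail, hard-coded here rather than computed.

```python
import statistics

patients = [22, 25, 20, 18, 15, 22, 24, 19, 26]   # data from the exercise
sigma0_sq = 10                                     # hypothesized variance under H0

n = len(patients)
s2 = statistics.variance(patients)                 # sample variance (n - 1 divisor)
chi2 = (n - 1) * s2 / sigma0_sq

# Chi-square table values for df = 8: lower .05 tail and upper .05 tail (alpha = .10, two-tailed).
lower_crit, upper_crit = 2.733, 15.507
reject = chi2 < lower_crit or chi2 > upper_crit

print(f"s^2 = {s2:.3f}, chi-square = {chi2:.3f}, df = {n - 1}")
print("Reject H0" if reject else "Do not reject H0")
```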

In an effort to make better use of its resources, the New York City Food Bank engaged in lean process improvement. This employee-driven kaizen effort resulted in a new method for packing meals for distribution to needy families. One goal of the process improvement effort was to reduce the variability in the meal-packing time. The following table summarizes information from a sample of data using the current method and the new method. Did the kaizen event successfully reduce the population variation? Use α = .10 and formulate the appropriate hypothesis test.

The Carnegie Classification of Institutes of Higher Education categorizes colleges and universities on the basis of their research and degree-granting activities. Universities that grant doctoral degrees are placed into one of three classifications: moderate research activity, higher research activity, or highest research activity. The Carnegie classifications for public and not-for-profit private doctoral degree-granting universities are summarized in the following table.

Test the hypothesis that the population proportions of public universities are equal in each Carnegie classification category. Use a .05 level of significance. What is the p-value and what is your conclusion?

Social media is becoming more and more popular around the world. Statista.com provides estimates of the number of social media users in various countries in 2017 as well as the projections for 2022. Assume that the results for surveys in the United Kingdom, China, Russia, and the United States are as follows.

a. Conduct a hypothesis test to determine whether the proportion of adults using social media is equal for all four countries. What is the p-value? Using a .05 level of significance, what is your conclusion?

b. What are the sample proportions for each of the four countries? Which country has the largest proportion of adults using social media?

c. Using a .05 level of significance, conduct multiple pairwise comparison tests among the four countries. What is your conclusion?

In 2015, Addison Group (a provider of professional staffing services) and Kelton (a global insights firm) surveyed the work preferences and attitudes of 1,006 working adults spread over three generations: baby boomers, Generation X, and millennials (Society for Human Resource Management website, https://www.shrm.org/resourcesandtools/hr-topics/talent-acquisition/pages/millennials-raises-promotions-generations.aspx). In one question, individuals were asked if they would leave their current job to make more money at another job. The file Millenials contains the sample data, which are also summarized in the following table.

Conduct a test of independence to determine whether interest in leaving a current job for more money is independent of employee generation. What is the p-value? Using a .05 level of significance, what is your conclusion?

The Wall Street Journal Annual Corporate Perceptions Study surveyed readers and asked how they rated the quality of management and the reputation of the company for more than 250 worldwide corporations. Both the quality of management and the reputation of the company were rated on a categorical scale of excellent, good, and fair. Assume the sample data for 200 respondents below applies to this study.

a. Use a .05 level of significance and test for independence of the quality of management and the reputation of the company. What is the p-value and what is your conclusion?

b. If there is a dependence or association between the two ratings, discuss and use probabilities to justify your answer.

The race for the 2013 Academy Award for Actress in a Leading Role was extremely tight, featuring several worthy performances. The nominees were Jessica Chastain for Zero Dark Thirty, Jennifer Lawrence for Silver Linings Playbook, Emmanuelle Riva for Amour, Quvenzhané Wallis for Beasts of the Southern Wild, and Naomi Watts for The Impossible. In a survey, movie fans who had seen each of the movies for which these five actresses had been nominated were asked to select the actress who was most deserving of the 2013 Academy Award for Actress in a Leading Role. The responses follow.

a. How large was the sample in this survey?

b. Jennifer Lawrence received the 2013 Academy Award for Actress in a Leading Role for her performance in Silver Linings Playbook. Did the respondents favor Ms. Lawrence?

c. At α = .05, conduct a hypothesis test to determine whether people’s attitude toward the actress who was most deserving of the 2013 Academy Award for Actress in a Leading Role is independent of respondent age. What is your conclusion?

On a television program, two movie critics provide their reviews of recent movies and discuss them. It is suspected that these hosts deliberately disagree in order to make the program more interesting for viewers. Each movie review is categorized as Pro (“thumbs up”), Con (“thumbs down”), or Mixed. The results of 160 movie ratings by the two hosts are shown here.

Use a test of independence with a .01 level of significance to analyze the data. What is your conclusion?

The manager at a Whole Foods Market is responsible for managing store inventory. The mathematical models that she uses to determine how much inventory to stock rely on product demand being normally distributed. In particular, the weekly demand of sriracha chili kale chips at a Whole Foods Market store is believed to be normally distributed. Use a goodness of fit test and the following data to test this assumption. Use α = .10.

In a 2018 study, Phoenix Marketing International identified Bridgeport, Connecticut; San Jose, California; Washington, D.C.; and Lexington Park, Maryland as the four U.S. cities with the highest percentage of millionaires (Kiplinger website, https://www.kiplinger.com/slideshow/investing/T064-S001-where-millionaires-live-in-america-2018/index.html). The following data show the number of millionaires for samples of individuals from each of the four cities.

a. What is the estimate of the percentage of millionaires in each of these cities?

b. Using a .05 level of significance, test for the equality of the population proportion of millionaires for these four cities. What is the p-value and what is your conclusion?

Arconic Inc. is a producer of aluminum components for the avionics and automotive industries. At its Davenport Works plant, an engineer has conducted a quality-control test in which aluminum coils produced in all three shifts were inspected. The study was designed to determine if the population proportion of good parts was the same for all three shifts. Sample data follow.

a. Using a .05 level of significance, conduct a hypothesis test to determine if the population proportion of good parts is the same for all three shifts. What is the p-value and what is your conclusion?

b. If the conclusion is that the population proportions are not all equal, use a multiple comparison procedure to determine how the shifts differ in terms of quality. What shift or shifts need to improve the quality of parts produced?

As listed by The Art Newspaper Visitor Figures Survey (https://www.theartnewspaper.com/visitor-figures-2017), the five most-visited art museums in the world are the Louvre Museum, the National Museum in China, the Metropolitan Museum of Art, the Vatican Museums, and the British Museum. Which of these five museums would visitors most frequently rate as spectacular? Samples of recent visitors of each of these museums were taken, and the results of these samples follow.

a. Use the sample data to calculate the point estimate of the population proportion of visitors who rated each of these museums as spectacular.

b. Conduct a hypothesis test to determine if the population proportion of visitors who rated the museum as spectacular is equal for these five museums. Using a .05 level of significance, what is the p-value and what is your conclusion?

The Barna Group conducted a survey about church attendance. The survey respondents were asked about their church attendance and asked to indicate their age. Use the sample data to determine whether church attendance is independent of age. Using a .05 level of significance, what is the p-value and what is your conclusion? What conclusion can you draw about church attendance as individuals grow older?

Based on 2017 sales, the six top-selling compact cars are the Honda Civic, Toyota Corolla, Nissan Sentra, Hyundai Elantra, Chevrolet Cruze, and Ford Focus (New York Daily News, http://www.nydailynews.com/autos/street-smarts/best-selling-small-cars-2016-list-article-1.2945432). The 2017 market shares are: Honda Civic 20%, Toyota Corolla 17%, Nissan Sentra 12%, Hyundai Elantra 10%, Chevrolet Cruze 10%, and Ford Focus 8%, with other small car models making up the remaining 23%. A sample of 400 compact car sales in Chicago showed the following number of vehicles sold.

| Model | Vehicles sold |
|---|---|
| Honda Civic | 98 |
| Toyota Corolla | 72 |
| Nissan Sentra | 54 |
| Hyundai Elantra | 44 |
| Chevrolet Cruze | 42 |
| Ford Focus | 25 |
| Others | 65 |

Use a goodness of fit test to determine if the sample data indicate that the market shares for compact cars in Chicago are different than the market shares suggested by nationwide 2017 sales. Using a .05 level of significance, what is the p-value and what is your conclusion? If the Chicago market appears to differ significantly from the nationwide sales, which categories contribute most to this difference?
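The goodness of fit calculation for the Chicago sample can be sketched as follows. Expected counts are the nationwide shares times the sample size of 400, and the statistic is the sum of (observed − expected)²/expected with k − 1 = 6 degrees of freedom. The helper `chi2_sf_even` is a hypothetical name; it uses a closed form for the chi-square upper tail that holds only when the degrees of freedom are even.

```python
import math

# Observed Chicago sales and nationwide 2017 market shares, both from the exercise.
observed = {"Honda Civic": 98, "Toyota Corolla": 72, "Nissan Sentra": 54,
            "Hyundai Elantra": 44, "Chevrolet Cruze": 42, "Ford Focus": 25, "Others": 65}
shares = {"Honda Civic": 0.20, "Toyota Corolla": 0.17, "Nissan Sentra": 0.12,
          "Hyundai Elantra": 0.10, "Chevrolet Cruze": 0.10, "Ford Focus": 0.08, "Others": 0.23}

n = sum(observed.values())
expected = {model: share * n for model, share in shares.items()}

# Per-category contribution to the chi-square statistic.
contrib = {m: (observed[m] - expected[m]) ** 2 / expected[m] for m in observed}
chi2 = sum(contrib.values())
df = len(observed) - 1

def chi2_sf_even(x, df):
    """Upper-tail area P(X > x) for a chi-square variable; valid only for even df."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

p_value = chi2_sf_even(chi2, df)
largest = sorted(contrib, key=contrib.get, reverse=True)[:2]

print(f"chi-square = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Do not reject H0")
print("Largest contributions:", largest)
```

Sorting the per-category contributions addresses the last part of the question: the categories with the largest terms are the ones driving any difference from the nationwide shares.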

How long it takes paint to dry can have an impact on the production capacity of a business. In May 2018, Deal’s Auto Body & Paint in Prescott, Arizona, invested in a paint-drying robot to speed up its process (The Daily Courier website, https://www.dcourier.com/photos/2018/may/26/984960336/). An interesting question is, “Do all paint-drying robots have the same drying time?” To test this, suppose we sample five drying times for each of several different brands of paint-drying robots. The time in minutes until the paint was dry enough for a second coat to be applied was recorded. The following data were obtained.

At the α = .05 level of significance, test to see whether the mean drying time is the same for each brand of robot.

Are there differences in airfare depending on which travel agency website you utilize? The following data were collected on travel agency websites on July 9, 2018. The following table contains the prices in U.S. dollars for a one-way ticket between the cities listed on the left for each of the three travel agency websites. Here the pairs of cities are the blocks and the treatments are the different websites. Use α = .05 to test for any significant differences in the mean price of a one-way airline ticket for the three travel agency websites.

In 2018, consumer goods giant Procter and Gamble (P&G) had more than 20 brands with more than $1 billion in annual sales (P&G website, https://us.pg.com/). How does a company like P&G create so many successful consumer products? P&G effectively invests in research and development to understand what consumers want. One method used to determine consumer preferences is called conjoint analysis. Conjoint analysis allows a company to ascertain the utility that a respondent in the conjoint study places on a design of a given product. The higher the utility, the more valuable a respondent finds the design. Suppose we have conducted a conjoint study and have the following estimated utilities (higher is preferred) for each of three different designs for a new whitening toothpaste.

At the .05 level of significance, test for any significant differences.

Based on a 2018 study, the average elapsed time between when a user navigates to a website on a mobile device until its main content is available was 14.6 seconds. This is more than a 20% increase from 2017 (searchenginejournal.com, https://www.searchenginejournal.com/). Responsiveness is certainly an important feature of any website and is perhaps even more important on a mobile device. What other web design factors need to be considered for a mobile device to make it more user friendly? Among other things, navigation menu placement and amount of text entry required are important on a mobile device. The following data provide the time it took (in seconds) randomly selected students (two for each factor combination) to perform a prespecified task with the different combinations of navigation menu placement and amount of text entry required.

Use the ANOVA procedure for factorial designs to test for any significant effects resulting from navigation menu position and amount of text entry required. Use α = .05.

A Pew Research study conducted in 2017 found that approximately 75% of Americans believe that robots and computers might one day do many of the jobs currently done by people (Pew Research website, http://www.pewinternet.org/2017/10/04/americans-attitudes-toward-a-future-in-which-robots-and-computers-can-do-many-human-jobs/). Suppose we have the following data collected from nurses, tax auditors, and fast-food workers in which a higher score means the person feels his or her job is more likely to be automated.

a. Use α = .05 to test for differences in the belief that a person’s job is likely to be automated for the three professions.

b. Use Fisher’s LSD procedure to compare the belief that a person’s job will be automated for nurses and tax auditors.

A factorial experiment was designed to test for any significant differences in the time needed to translate other languages into English with two computerized language translators. Because the type of language translated was also considered a significant factor, translations were made with both systems for three different languages: Spanish, French, and German. Use the following data for translation time in hours.

Test for any significant differences due to language translator, type of language, and interaction. Use α = .05.

The American Association of Individual Investors (AAII) On-Line Discount Broker Survey polls members on their experiences with discount brokers. As part of the survey, members were asked to rate the quality of the speed of execution with their broker as well as provide an overall satisfaction rating for electronic trades. Possible responses (scores) were no opinion (0), unsatisfied (1), somewhat satisfied (2), satisfied (3), and very satisfied (4). For each broker, summary scores were computed by calculating a weighted average of the scores provided by each respondent. A portion of the survey results follows (AAII website).

a. Develop a scatter diagram for these data with the speed of execution as the independent variable.

b. What does the scatter diagram developed in part (a) indicate about the relationship between the two variables?

c. Develop the least squares estimated regression equation.

d. Provide an interpretation for the slope of the estimated regression equation.

e. Suppose Zecco.com developed new software to increase their speed of execution rating. If the new software is able to increase their speed of execution rating from the current value of 2.5 to the average speed of execution rating for the other 10 brokerage firms that were surveyed, what value would you predict for the overall satisfaction rating?

David’s Landscaping has collected data on home values (in thousands of $) and expenditures (in thousands of $) on landscaping with the hope of developing a predictive model to help marketing to potential new clients. Data for 14 households may be found in the file Landscape.

a. Develop a scatter diagram with home value as the independent variable.

b. What does the scatter plot developed in part (a) indicate about the relationship between the two variables?

c. Use the least squares method to develop the estimated regression equation.

d. For every additional $1000 in home value, estimate how much more will be spent on landscaping.

e. Use the equation estimated in part (c) to predict the landscaping expenditures for a home valued at $575,000.

The following data were used to investigate the relationship between the number of cars in service (1000s) and the annual revenue ($millions) for six smaller car rental companies (Auto Rental News website).

With x = cars in service (1000s) and y = annual revenue ($ millions), the estimated regression equation is ŷ = −17.005 + 12.966x. For these data SSE = 1043.03.

a. Compute the coefficient of determination r^{2}.

b. Did the estimated regression equation provide a good fit? Explain.

c. What is the value of the sample correlation coefficient? Does it reflect a strong or weak relationship between the number of cars in service and the annual revenue?

Do students with higher college grade point averages (GPAs) earn more than those graduates with lower GPAs (CivicScience)? Consider the college GPA and salary data (10 years after graduation) provided in the file GPASalary.

a. Develop a scatter diagram for these data with college GPA as the independent variable. What does the scatter diagram indicate about the relationship between the two variables?

b. Use these data to develop an estimated regression equation that can be used to predict annual salary 10 years after graduation given college GPA.

c. At the .05 level of significance, does there appear to be a significant statistical relationship between the two variables?

Companies in the U.S. car rental market vary greatly in terms of the size of the fleet, the number of locations, and annual revenue. The following data were used to investigate the relationship between the number of cars in service (1000s) and the annual revenue ($ millions) for six smaller car rental companies (Auto Rental News website).

With x = cars in service (1000s) and y = annual revenue ($ millions), the estimated regression equation is ŷ = −17.005 + 12.966x. For these data SSE = 1043.03 and SST = 10,568. Do these results indicate a significant relationship between the number of cars in service and the annual revenue?
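With SSE, SST, and n = 6 companies given, the F statistic for the regression follows from simple arithmetic. The sketch below assumes α = .05, which the exercise does not state, and uses the standard F table value of 7.71 for 1 and 4 degrees of freedom rather than computing a p-value.

```python
import math

sse, sst, n = 1043.03, 10568.0, 6   # values from the exercise; one independent variable

ssr = sst - sse                      # sum of squares due to regression
r2 = ssr / sst                       # coefficient of determination
r = math.sqrt(r2)                    # sample correlation; positive because the slope (12.966) is positive

msr = ssr / 1                        # regression degrees of freedom = 1
mse = sse / (n - 2)                  # error degrees of freedom = n - 2
f_stat = msr / mse

f_crit = 7.71                        # F table value at alpha = .05 (assumed) with df = (1, 4)
print(f"r^2 = {r2:.4f}, r = {r:.4f}, F = {f_stat:.2f}")
print("Significant relationship" if f_stat > f_crit else "No significant relationship")
```

The same quantities answer the related questions about the coefficient of determination and the sample correlation coefficient for these data.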

The data from exercise 1 follow.

a. Use equation (14.23) to estimate the standard deviation of ŷ* when x = 4.

b. Use expression (14.24) to develop a 95% confidence interval for the expected value of y when x = 4.

c. Use equation (14.26) to estimate the standard deviation of an individual value of y when x = 4.

d. Use expression (14.27) to develop a 95% prediction interval for y when x = 4.
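The equation numbers above refer to the textbook and are not reproduced here. Presumably they correspond to the standard simple linear regression results below, where s = √MSE, x* is the given value of x, and t_{α/2} has n − 2 degrees of freedom; this mapping to the numbered equations is an assumption.

```latex
% Estimated standard deviation of \hat{y}^* and confidence interval for E(y^*)
s_{\hat{y}^*} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}},
\qquad
\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*}

% Estimated standard deviation of an individual value of y and prediction interval
s_{\text{pred}} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}},
\qquad
\hat{y}^* \pm t_{\alpha/2}\, s_{\text{pred}}
```

The extra 1 under the square root is what makes a prediction interval for an individual value wider than the confidence interval for the mean at the same x*.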

Many small restaurants in Portland, Oregon, and other cities across the United States do not take reservations. Owners say that with smaller capacity, no-shows are costly, and they would rather have their staff focused on customer service rather than maintaining a reservation system (pressherald.com). However, it is important to be able to give reasonable estimates of waiting time when customers arrive and put their name on the waiting list. The file RestaurantLine contains 40 observations of number of people in line ahead of a customer (independent variable x) and actual waiting time (dependent variable y). The estimated regression equation is ŷ = 4.35 + 8.81x, and MSE = 94.42.

a. Develop a point estimate for a customer who arrives with three people on the wait-list.

b. Develop a 95% confidence interval for the mean waiting time for a customer who arrives with three customers already in line.

c. Develop a 95% prediction interval for Roger and Sherry Davy’s waiting time if there are three customers in line when they arrive.

d. Discuss the difference between parts (b) and (c).

Sherry is a production manager for a small manufacturing shop and is interested in developing a predictive model to estimate the time to produce an order of a given size—that is, the total time to produce a certain quantity of the product. She has collected data on the total time to produce 30 different orders of various quantities in the file Setup.

a. Develop a scatter diagram with quantity as the independent variable.

b. What does the scatter diagram developed in part (a) indicate about the relationship between the two variables?

c. Develop the estimated regression equation. Interpret the intercept and slope.

d. Test for a significant relationship. Use α = .05.

e. Did the estimated regression equation provide a good fit?

Spring is a peak time for selling houses. The file SpringHouses contains the selling price, number of bathrooms, square footage, and number of bedrooms of 26 homes sold in Ft. Thomas, Kentucky, in spring 2018 (realtor.com website).

a. Develop scatter plots of selling price versus number of bathrooms, selling price versus square footage, and selling price versus number of bedrooms. Comment on the relationship between selling price and these three variables.

b. Develop an estimated regression equation that can be used to predict the selling price given the three independent variables (number of baths, square footage, and number of bedrooms).

c. It is argued that we do not need both number of baths and number of bedrooms. Develop an estimated regression equation that can be used to predict selling price given square footage and the number of bedrooms.

d. Suppose your house has four bedrooms and is 2650 square feet. What is the predicted selling price using the model developed in part (c)?

Revisit exercise 9, where we develop an estimated regression equation that can be used to predict the selling price given the number of bathrooms, square footage, and number of bedrooms in the house.

a. Does the estimated regression equation provide a good fit to the data? Explain.

b. In part c of exercise 9 you developed an estimated regression equation that predicts selling price given the square footage and number of bedrooms. Compare the fit for this simpler model to that of the model that also includes number of bathrooms as an independent variable.

The Honda Accord was named the best midsized car for resale value for 2018 by the Kelley Blue Book (Kelley Blue Book website). The file AutoResale contains mileage, age, and selling price for a sample of 33 Honda Accords.

a. Develop an estimated regression equation that predicts the selling price of a used Honda Accord given the mileage and age of the car.

b. Is multicollinearity an issue for this model? Find the correlation between the independent variables to answer this question.

c. Use the F test to determine the overall significance of the relationship. What is your conclusion at the .05 level of significance?

d. Use the t test to determine the significance of each independent variable. What is your conclusion at the .05 level of significance?

Refer to Problem 25. Use the estimated regression equation from part (a) to answer the following questions.

a. Estimate the selling price of a four-year-old Honda Accord with mileage of 40,000 miles.

b. Develop a 95% confidence interval for the selling price of a car with the data in part (a).

c. Develop a 95% prediction interval for the selling price of a particular car having the data in part (a).

**Data From Problem 25**

The Honda Accord was named the best midsized car for resale value for 2018 by the Kelley Blue Book (Kelley Blue Book website). The file AutoResale contains mileage, age, and selling price for a sample of 33 Honda Accords.

The Wall Street Journal asked Concur Technologies, Inc., an expense-management company, to examine data from 8.3 million expense reports to provide insights regarding business travel expenses. Their analysis of the data showed that New York was the most expensive city. The following table shows the average daily hotel room rate (x) and the average amount spent on entertainment (y) for a random sample of 9 of the 25 most visited U.S. cities. These data lead to the estimated regression equation ŷ = 17.49 + 1.0334x. For these data, SSE = 1541.4.

Predict the amount spent on entertainment for a particular city that has a daily room rate of $89.
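The point prediction comes from substituting x = 89 into the estimated regression equation ŷ = 17.49 + 1.0334x. The sketch below also computes the standard error of the estimate from SSE and n = 9; a full prediction interval would additionally need the room-rate data from the table, which is not reproduced here, so it is left out.

```python
import math

b0, b1 = 17.49, 1.0334     # intercept and slope from the exercise
sse, n = 1541.4, 9         # sum of squared errors and number of cities
room_rate = 89             # daily room rate in dollars

predicted = b0 + b1 * room_rate        # point prediction of entertainment spending
std_error = math.sqrt(sse / (n - 2))   # s = sqrt(MSE), with n - 2 error degrees of freedom

print(f"Predicted entertainment spending: ${predicted:.2f}")
print(f"Standard error of the estimate: {std_error:.2f}")
```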

The Cincinnati Zoo and Botanical Gardens had a record attendance of 1.87 million visitors in 2017 (Cincinnati Business Courier website). Nonprofit organizations such as zoos and museums are becoming more sophisticated in their use of data to improve the customer experience. Being able to better estimate expected revenue is one use of analytics that allows nonprofits to better manage their operations. The file ZooSpend contains sample data on zoo attendance. The file contains the following data on 125 visits by families to the zoo: amount spent, size of the family, the distance the family lives from the zoo (the gate attendee asks for the zip code of each family entering the zoo), and whether or not the family has a zoo membership (1 = yes, 0 = no).

a. Develop an estimated regression equation that predicts the amount of money spent by a family given family size, whether or not it has a zoo membership, and the distance the family lives from the zoo.

b. Test the significance of the zoo membership independent variable at the .05 level.

c. Give an explanation for the sign of the estimate you tested in part (b).

d. Test the overall significance of the model at the .05 level.

e. Estimate the amount of money spent in a visit by a family of five that lives 125 miles from the zoo and does not have a zoo membership.

For the holiday season of 2017, nearly 59 percent of consumers planned to buy gift cards. According to the National Retail Federation, millennials like to purchase gift cards (Dayton Daily News website). Consider the sample data in the file GiftCards. The following data are given for a sample of 600 millennials: the amount they reported spending on gift cards over the last year, annual income, marital status (1 = yes, 0 = no), and whether they are male (1 = yes, 0 = no).

a. Develop an estimated regression equation that predicts annual spending on gift cards given annual income, marital status, and gender.

b. Test the overall significance at the .05 level.

c. Test the significance of each individual variable using a .05 level of significance.

Suppose that you have been hired by the commissioner of the LPGA to analyze the data for a presentation to be made at the annual LPGA Tour meeting. The commissioner has asked whether it would be possible to use these data to determine the performance measures that are the best predictors of a player’s average score. Use the methods presented in this and previous chapters to analyze the data. Prepare a report that summarizes your analysis, including key statistical results, conclusions, and recommendations.

The Ladies Professional Golf Association (LPGA) maintains data on performance for members of the LPGA Tour. Scoring average is generally considered the most important statistic in terms of a player’s success. To investigate the relationship between scoring average and variables such as driving distance, driving accuracy, greens in regulation, sand saves, and average putts per round, year-end performance data for 140 players on the LPGA Tour for 2012 are contained in the DATAfile named TourLPGA2012 (LPGA website). Each row of the data set corresponds to an LPGA Tour player. Descriptions for the variables in the data set follow.

Scoring Average The average number of strokes per completed round.

DrDist (Driving Distance) The average number of yards per measured drive. On the LPGA Tour driving distance is measured on two holes per round. Care is taken to select two holes which face in opposite directions to counteract the effect of wind. Drives are measured to the point at which they come to rest regardless of whether they are in the fairway or not.

DrAccu (Driving Accuracy) The percentage of time a tee shot comes to rest in the fairway (regardless of club). Driving accuracy is measured on every hole, excluding par 3s.

1. Develop a table that shows the number of wines that were classified as classic, outstanding, very good, good, mediocre, and not recommended and the average price. Does there appear to be any relationship between the price of the wine and the Wine Spectator rating? Are there any other aspects of your initial summary of the data that stand out?

2. Develop a scatter diagram with price on the horizontal axis and the Wine Spectator score on the vertical axis. Does the relationship between price and score appear to be linear?

3. Using linear regression, develop an estimated regression equation that can be used to predict the score given the price of the wine.

4. Using a second-order model, develop an estimated regression equation that can be used to predict the score given the price of the wine.

5. Compare the results from fitting a linear model and fitting a second-order model.

6. As an alternative to fitting a second-order model, fit a model using the natural logarithm of price as the independent variable. Compare the results with the second-order model.

7. Based upon your analysis, would you say that spending more for a bottle of wine will provide a better wine?

8. Suppose that you want to spend a maximum of $30 for a bottle of wine. In this case, will spending closer to your upper limit for price result in a better wine than a much lower price?

Wine Spectator magazine contains articles and reviews on every aspect of the wine industry, including ratings of wine from around the world. In a recent issue they reviewed and scored 475 wines from the Piedmont region of Italy using a 100-point scale. The following table shows how the Wine Spectator score each wine received is used to rate each wine as being classic, outstanding, very good, good, mediocre, or not recommended.

Score..........................Rating

95–100.......................Classic: a great wine

90–94..........................Outstanding: a wine of superior character and style

85–89..........................Very good: a wine with special qualities

80–84..........................Good: a solid, well-made wine

75–79..........................Mediocre: a drinkable wine that may have minor flaws

below 75....................Not Recommended

A key question for most consumers is whether paying more for a bottle of wine will result in a better wine. To investigate this question for wines from the Piedmont region we selected a random sample of 100 wines from the 475 wines that Wine Spectator reviewed. The data, contained in the file WineRatings, shows the price ($), the Wine Spectator score, and the rating for each wine.

Home Depot, a home-improvement retailer, sells several brands of washing machines. The following table contains a sample of 24 models of full-size washing machines sold by Home Depot in 2016, with each observation recording the washing machine capacity (in cubic feet) and the list price (in $) (Home Depot website).

a. Develop a scatter diagram for these data, treating cubic feet as the independent variable. Does a simple linear regression model appear to be appropriate?

b. Use a simple linear regression model to develop an estimated regression equation to predict the list price given the cubic feet. Construct a standardized residual plot. Based upon the standardized residual plot, does a simple linear regression model appear to be appropriate?

c. Using a second-order model, develop an estimated regression equation to predict the list price given the cubic feet.

As of September 4, 2016, the film Suicide Squad had an average rating of 3.7 out of 5 based on 117,323 viewer ratings (Rotten Tomatoes website). How are the viewer ratings of Suicide Squad related to the viewer age and the viewer ratings of The Secret Life of Pets? The file RottenTomatoes contains a sample of data containing viewer ages and their ratings of Suicide Squad and The Secret Life of Pets.

a. Develop a scatter diagram for these data with the users’ ages as the independent variable and their ratings of Suicide Squad as the dependent variable. Does a simple linear regression model appear to be appropriate?

b. Use the data provided to develop the regression equation for estimating the user ratings of Suicide Squad that is suggested by the scatter diagram in part (a).

c. Include the user rating of The Secret Life of Pets as an independent variable in the regression model developed in part (b). Interpret the regression coefficient for the user rating of The Secret Life of Pets.

d. Is the regression equation developed in part (b) or the regression equation developed in part (c) superior? Explain.

In 2016, the average monthly residential natural gas bill for Black Hills Energy customers in Cheyenne, Wyoming, was $67.95 (Wyoming Public Service Commission website). How is the monthly average gas bill for a Cheyenne home related to the square footage, number of rooms, and age of the home? The following data show the average monthly bill over the past year, square footage, number of rooms, and age for a sample of 20 Cheyenne homes.

a. Develop an estimated regression equation that can be used to predict a residence’s average monthly gas bill for last year given its age.

b. Develop an estimated regression equation that can be used to predict a residence’s average monthly gas bill for last year given its age, square footage, and number of rooms.

c. At the .05 level of significance, test whether the two independent variables added in part (b), the square footage and the number of rooms, contribute significantly to the estimated regression equation developed in part (a).
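The test in part (c) compares the one-variable model from part (a) with the three-variable model from part (b). A minimal sketch of the F statistic for the added variables, assuming the two SSE values have already been read from regression output. The function name and the SSE numbers are hypothetical, since the data table is not reproduced here.

```python
def partial_f(sse_reduced, sse_full, n, p_full, q):
    """F statistic for testing whether q added variables improve the model.

    sse_reduced: SSE of the model without the added variables
    sse_full:    SSE of the model with all p_full independent variables
    n:           number of observations
    """
    numerator = (sse_reduced - sse_full) / q
    denominator = sse_full / (n - p_full - 1)
    return numerator / denominator


# Hypothetical SSE values for illustration only; n = 20 homes,
# p_full = 3 (age, square footage, rooms), q = 2 added variables.
f_stat = partial_f(sse_reduced=6000.0, sse_full=4000.0, n=20, p_full=3, q=2)
print(round(f_stat, 2))
```

The resulting statistic would be compared with the F distribution with q = 2 numerator and n - p - 1 = 16 denominator degrees of freedom.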

Mbuy is a media consulting firm that provides advice to companies on how to allocate their advertising budgets. Mbuy designed a factorial experiment to test the effect of the size of a banner ad on a website and the ad design on the number (in thousands) of product inquiries received. Three advertising designs and two sizes of advertisements were considered. The following data were obtained. Test for any significant effects due to type of design, size of advertisement, or interaction. Use α = .05.

The U.S. Department of Energy’s Fuel Economy Guide provides fuel efficiency data for cars and trucks (www.fueleconomy.gov). The file FuelEconomy2019 provides a portion of the data for 387 vehicles from the 2019 model year. The column labeled Class identifies the category of the vehicle (Two Seaters, Minicompact Cars, etc.). The column labeled Combined MPG shows the fuel efficiency rating based on 55% city driving and 45% highway driving in terms of miles per gallon. Use α = .05 and test for any significant difference in the mean fuel efficiency rating for highway driving among the 10 different classes of cars.

The general fund revenue receipts for the state of Kentucky for 2003 (period 1) to 2017 (period 15) are in the file KYRevenue (ky.gov website).

a. Construct a time-series plot. What type of pattern exists in the data?

b. Develop a linear trend equation for this time series.

c. What is the forecast for period 16?
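Parts (b) and (c) rest on the least-squares trend line T_t = b0 + b1 t. A minimal sketch of that calculation follows, assuming periods are numbered 1 through n. The series shown is a placeholder, not the KYRevenue data, and the function name is illustrative.

```python
def linear_trend(values):
    """Least-squares trend line T_t = b0 + b1 * t for periods t = 1..n."""
    n = len(values)
    periods = range(1, n + 1)
    t_bar = sum(periods) / n
    y_bar = sum(values) / n
    b1 = (sum((t - t_bar) * (y - y_bar) for t, y in zip(periods, values))
          / sum((t - t_bar) ** 2 for t in periods))
    b0 = y_bar - b1 * t_bar
    return b0, b1


# Placeholder series for illustration only.
series = [10.0, 12.0, 14.0, 16.0, 18.0]
b0, b1 = linear_trend(series)
forecast_next = b0 + b1 * (len(series) + 1)  # forecast for period n + 1
print(b0, b1, forecast_next)
```

With the actual 15 periods loaded into `series`, the same last line would give the period 16 forecast.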

Skechers is a performance footwear company headquartered in Manhattan Beach, California. The sales for Skechers (in billions of dollars) for 2012 (period 1) to 2017 (period 6) are in the file SkechersSales (annualreports.com).

a. Construct a time-series plot. What type of pattern exists in the data?

b. Develop a linear trend equation for this time series.

c. What is the forecast for sales for period 7?

The following data show the average interest rate (%) for a 30-year fixed-rate mortgage over a ten-year period (FreddieMac website).

a. Construct a time series plot. Do you think a linear trend or a quadratic trend will provide a better fit for this time series? Why?

b. Develop the linear trend equation for this time series. Using this linear trend equation, forecast the average interest rate for period 11.

c. Develop the quadratic trend equation for this time series. Using this quadratic trend equation, forecast the average interest rate for period 11.

d. Compare your answers to parts (b) and (c). Which model would you recommend? Why?

The following data show the number of Netflix subscribers worldwide for the years 2012 (period 1) to 2017 (period 6) (datawrapper website). The data are in the file NetflixSubscribers.

a. Construct a time-series plot. What type of pattern exists in the data?

b. Develop a linear trend equation for this time series.

c. Develop a quadratic trend equation for this time series.

d. Compare the MSE for each model. Which model appears better according to MSE?

e. Use the models in parts (b) and (c) to forecast subscribers for 2018.

f. Which of the two forecasts in part (e) would you use? Explain.
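The comparison in parts (b) through (e) can be sketched as follows: fit a linear and a quadratic trend by least squares, then compare their mean squared errors. This is an illustration under stated assumptions, not the textbook's software output. The series is a placeholder (not the NetflixSubscribers data), the helper names are hypothetical, and MSE here divides by the number of observations, whereas regression software typically divides by the error degrees of freedom.

```python
def fit_polynomial(values, degree):
    """Least-squares coefficients [b0, b1, ...] for T_t = b0 + b1*t + b2*t^2 + ..."""
    n = len(values)
    periods = range(1, n + 1)
    size = degree + 1
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [[float(sum(t ** (i + j) for t in periods)) for j in range(size)]
           + [float(sum((t ** i) * y for t, y in zip(periods, values)))]
           for i in range(size)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(size):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


def predict(coeffs, t):
    """Trend value at period t."""
    return sum(b * t ** k for k, b in enumerate(coeffs))


def mse(values, coeffs):
    """Sum of squared errors divided by the number of observations."""
    errors = [y - predict(coeffs, t) for t, y in enumerate(values, start=1)]
    return sum(e ** 2 for e in errors) / len(errors)


# Placeholder series for illustration only.
series = [2.0, 5.0, 10.0, 17.0, 26.0, 37.0]
linear = fit_polynomial(series, 1)
quadratic = fit_polynomial(series, 2)
print(mse(series, linear), mse(series, quadratic))
print(predict(linear, 7), predict(quadratic, 7))  # period 7 corresponds to 2018
```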

The following data show Google revenue from 2008 (period 1) to 2017 (period 10) in billions of dollars (Alphabet, Inc. annual reports). These data are in the file GoogleRevenue.

a. Construct a time-series plot. What type of pattern exists?

b. Develop a quadratic trend equation.

Annual retail store revenue for Apple from 2007 to 2017 is shown below (investorapple website).

a. Construct a time series plot. What type of pattern exists in the data?

b. Using statistical software, develop a linear trend equation for this time series.

c. Use the trend equation developed in part (b) to forecast retail store revenue for 2018.

The following data show the price in dollars for a general admission ticket to the Magic Kingdom at Disney World from the year 2000 (period 1) to 2017 (period 18) (Travel + Leisure website). These data are in the file DisneyPrices.

a. Construct a time-series plot. What type of pattern exists?

b. Develop an appropriate trend equation. Explain your choice of trend equation.

Hudson Marine provides boat sales, service, and maintenance. Boat trailers are one of its top sales items. The following table reports the number of trailers sold for the last seven years.

a. Construct a time series plot. Does a linear trend appear to be present?

b. Using statistical software, develop a linear trend equation for this time series.

c. Use the linear trend equation developed in part (b) to develop a forecast for annual sales in year 8.

1. Based on these data, should RainOrShine.Com warn its audience of seasonal differences in the number of fatal lightning strikes in the United States? Use the α = .05 level of significance.

2. If the data suggest there are seasonal differences in fatal lightning strikes in the United States, during which season are fatal lightning strikes most common in the United States?

3. Are you concerned about RainOrShine.Com’s definitions of seasons? Explain why or why not.

RainOrShine.Com is an online provider of weather forecasts and information. The organization is putting together a weather-preparedness program to increase its audience’s understanding of severe weather. As part of this program, RainOrShine.Com would like to be able to warn its audience if there are seasonal differences in the number of fatal lightning strikes in the United States.

To test for possible seasonal differences, RainOrShine.Com has collected data from the National Weather Service, which maintains an online database that provides information on lightning strike fatalities by month. Because only monthly data are available, RainOrShine.Com has defined the four seasons as follows.

The data collected on the number of lightning strike fatalities for each season from 2008 through 2017 by RainOrShine.Com from the National Weather Service are provided in the following table.

In a 2018 survey of 10,508 CEOs by PayScale.com, the range of annual salaries reported was from $73,187 to $336,550. But do CEO salaries differ across the two most populous states in the United States? Consider the salaries for CEOs who work for companies headquartered in California and Texas as provided in the following table.

Use α = .05 and test to determine whether the distribution of CEO salaries is the same for California and Texas. What is your conclusion?

Forty-minute workouts of one of the following activities three days a week will lead to a loss of weight, assuming no change in calories consumed. The following sample data show the number of calories burned during 40-minute workouts for three different activities. Do these data indicate differences in the amount of calories burned for the three activities? Use a .05 level of significance. What is your conclusion?

The National Football League (NFL) holds its annual draft of the nation’s best college football players in April each year. Prior to the draft, various sporting news services project the players who will be drafted along with the order in which each will be selected in what are called mock drafts. Players who are considered to have superior potential as professional football players are selected earlier in the draft. The following table shows, for the 2015 NFL draft, projections by one mock draft service of the first-round positions at which players from the Atlantic Coast Conference, the Big Ten Conference, the PAC12 Conference, and the Southeastern Conference will be selected (DraftSite website, https://www.draftsite.com/nfl/draft-history/2015/).

Use the Kruskal-Wallis test to determine if there is any difference among NFL teams for players from these four conferences. Use α = .05. What is the p-value? What is your conclusion?
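The Kruskal-Wallis statistic pools all observations, ranks them, and compares rank sums across groups: H = [12 / (N(N + 1))] Σ R_i² / n_i − 3(N + 1). A minimal sketch follows, assuming no tie correction is applied (tied values simply share their average rank). The draft positions shown are placeholders, not the mock draft table, and the function name is illustrative.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)
    # Average rank for each distinct value, so tied values share a rank.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Placeholder first-round positions for four conferences, illustration only.
conferences = [[1, 6, 15], [3, 9, 20, 28], [2, 11, 17], [5, 8, 13, 24]]
print(kruskal_wallis_h(conferences))
```

With k groups, H is compared with a chi-square distribution with k − 1 degrees of freedom, so 3 degrees of freedom for four conferences.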

According to the National Association of Realtors website (https://www.nar.realtor/sites/default/files/documents/metro-home-prices-q3-2017-single-family-2017-11-02.pdf), the national median sales price for single-family homes was $254,000 in 2018. Assume that the following data were obtained from samples of recent sales of single-family homes in St. Louis and Denver.

a. Is the median sales price in St. Louis significantly lower than the national median of $254,000? Use a statistical test with α = .05 to support your conclusion.

b. Is the median sales price in Denver significantly higher than the national median of $254,000? Use a statistical test with α = .05 to support your conclusion.
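One test that fits both parts is the sign test: count the sales above and below $254,000 and refer the number of plus signs to a binomial distribution with p = .5. The sketch below uses the exact binomial probability rather than a normal approximation. The prices are placeholders in $ thousands, not the St. Louis or Denver samples, and the function name is hypothetical.

```python
from math import comb


def sign_test_p_value(values, hypothesized_median, tail):
    """Exact one-tailed sign test p-value; values equal to the median are dropped."""
    plus = sum(1 for v in values if v > hypothesized_median)
    minus = sum(1 for v in values if v < hypothesized_median)
    n = plus + minus
    if tail == "lower":      # Ha: population median < hypothesized median
        ks = range(0, plus + 1)
    elif tail == "upper":    # Ha: population median > hypothesized median
        ks = range(plus, n + 1)
    else:
        raise ValueError("tail must be 'lower' or 'upper'")
    # Under H0 the number of plus signs is Binomial(n, 0.5).
    return sum(comb(n, k) for k in ks) / 2 ** n


# Placeholder prices in $ thousands, for illustration only.
low_sample = [200, 210, 230, 240, 245, 250, 251, 252, 260, 270]
high_sample = [300, 310, 330, 340, 345, 350, 351, 352, 240, 250]
print(sign_test_p_value(low_sample, 254, "lower"))
print(sign_test_p_value(high_sample, 254, "upper"))
```

A p-value at or below .05 would lead to rejecting the hypothesis that the city's median equals the national median in favor of the one-sided alternative.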

Nielsen Research provides weekly ratings of nationally broadcast television programs. The mean weekly number of viewers for the 207 prime-time programs broadcast by five major television networks (ABC, CBS, FOX, NBC, and CW) for the 2017–2018 television season are provided in the file Viewership2017-18. Shown in the following table are the mean weekly number of viewers for 12 shows in the file. Do these data suggest that the overall ratings for the five networks differ significantly? Use the Kruskal-Wallis test with a .10 level of significance. What is the p-value, and what is your conclusion?

In 2017, the American Red Cross had to make decisions related to preparations for Hurricane Irma, which was threatening the United States, including the state of Florida, Puerto Rico, and the U.S. Virgin Islands. To prepare for such disasters, organizations such as the Red Cross must make decisions about when and where to preposition relief supplies such as water, food, and medical supplies. Suppose that the Red Cross can choose to stock supplies for a possible hurricane that hits Florida either in a central distribution center that is protected from possible hurricane disaster or in regional distribution centers that are closer to where damage is expected but run the risk of being destroyed by severe hurricanes. The following table displays the costs (in $ millions) of the different decision alternatives under three possible states of nature: no hurricane landfall, moderate hurricane landfall, and severe hurricane landfall. Note that because these values represent costs, they are all displayed as negative values.

The probabilities for the states of nature are P(s1) = .5, P(s2) = .35, P(s3) = .15. The Red Cross can also wait an additional 48 hours during which time an additional “hurricane hunter” flight will collect additional data on the hurricane. By waiting, the Red Cross gathers additional sample data on whether the hurricane will make a turn toward or away from Florida. The probabilities associated with these are:

a. Construct a decision tree for this problem.

b. What is the recommended decision if the Red Cross does not wait to make a decision? What is the expected value of this decision?

c. What is the optimal decision strategy if the Red Cross waits an additional 48 hours? What is the expected value of this decision?

d. What is the expected value of the sample data?

The following table reports prices and usage quantities for two items in 2016 and 2018.

a. Compute price relatives for each item in 2018 using 2016 as the base period.

b. Compute an unweighted aggregate price index for the two items in 2018 using 2016 as the base period.

c. Compute a weighted aggregate price index for the two items using the Laspeyres method.

d. Compute a weighted aggregate price index for the two items using the Paasche method.
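The four index calculations in parts (a) through (d) differ only in what is summed and which quantities serve as weights. A short sketch follows, with placeholder prices and quantities, since the exercise table is not reproduced here. The function and variable names are illustrative.

```python
def weighted_aggregate_index(p_base, p_current, quantities):
    """100 * sum(p_current * q) / sum(p_base * q) for fixed quantity weights q."""
    current_cost = sum(p * q for p, q in zip(p_current, quantities))
    base_cost = sum(p * q for p, q in zip(p_base, quantities))
    return 100 * current_cost / base_cost


# Placeholder data for two items, illustration only.
price_2016 = [2.0, 5.0]
price_2018 = [3.0, 6.0]
qty_2016 = [10, 4]
qty_2018 = [8, 5]

# (a) Price relatives: current price as a percentage of base price.
price_relatives = [100 * p1 / p0 for p0, p1 in zip(price_2016, price_2018)]
# (b) Unweighted aggregate index: ratio of summed prices.
unweighted = 100 * sum(price_2018) / sum(price_2016)
# (c) Laspeyres: weights are base-period quantities.
laspeyres = weighted_aggregate_index(price_2016, price_2018, qty_2016)
# (d) Paasche: weights are current-period quantities.
paasche = weighted_aggregate_index(price_2016, price_2018, qty_2018)
print(price_relatives, unweighted, laspeyres, paasche)
```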

An item with a price relative of 132 cost $10.75 in 2018. Its base year was 2001.

a. What was the percentage increase or decrease in cost of the item over the 17-year period?

b. What did the item cost in 2001?
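Both parts follow from the definition of a price relative: (price in period t / price in base period) × 100. A brief sketch of the arithmetic, using only the figures given in the exercise:

```python
price_relative = 132.0   # 2018 price as a percentage of the 2001 price
cost_2018 = 10.75

# (a) A relative of 132 means the price is 32% above the base-period price.
percent_change = price_relative - 100

# (b) Invert the definition to recover the base-period price.
cost_2001 = cost_2018 / (price_relative / 100)

print(percent_change, round(cost_2001, 2))
```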

Fastenal, the largest fastener distributor in North America, procures an identical drive anchor from three independent suppliers that differ in unit price and quantity supplied. The relevant data for 2016 and 2018 are given in the following table, where quantity and unit price are expressed in terms of packages of 10 anchors.

a. Compute the price relatives for each of the component suppliers separately. Compare the price increases by the suppliers over the two-year period.

b. Compute an unweighted aggregate price index for the component part in 2018.

c. Compute a 2018 weighted aggregate price index for the component part. What is the interpretation of this index for Fastenal?

Registered nurses in 2007 made an average hourly wage of $30.04. In 2017, their hourly wage had risen to $35.36. Given that the CPI for 2007 was 207.3 and the 2017 CPI was 245.1, answer the following.

a. Give the real wages for registered nurses for 2007 and 2017 by deflating the hourly wage rates.

b. What is the percentage change in the nominal hourly wage for registered nurses from 2007 to 2017?

c. For registered nurses, what was the percentage change in real wages from 2007 to 2017?

The average hourly wage rate for construction laborers in 2001 was $13.36. In 2017 construction laborers made $18.70 per hour. The CPI for 2001 was 177.1 and for 2017, 245.1. Calculate the percentage change in real hourly wages from 2001 to 2017.
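Both wage exercises above rest on the same deflation step: real wage = (nominal wage / CPI) × 100. A minimal sketch using the figures stated in the exercises follows. The function names are illustrative.

```python
def real_wage(nominal, cpi):
    """Deflate a nominal wage to base-period dollars."""
    return nominal / cpi * 100


def percent_change(old, new):
    return (new - old) / old * 100


# Registered nurses, 2007 and 2017.
nurse_real_2007 = real_wage(30.04, 207.3)
nurse_real_2017 = real_wage(35.36, 245.1)
nurse_nominal_change = percent_change(30.04, 35.36)
nurse_real_change = percent_change(nurse_real_2007, nurse_real_2017)

# Construction laborers, 2001 and 2017.
labor_real_change = percent_change(real_wage(13.36, 177.1),
                                   real_wage(18.70, 245.1))

print(round(nurse_real_2007, 2), round(nurse_real_2017, 2))
print(round(nurse_nominal_change, 1), round(nurse_real_change, 2))
print(round(labor_real_change, 2))
```

A nominal increase paired with a negative real change indicates that wages did not keep pace with the CPI over the period.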

In 2017, Google’s revenue broke $100 billion for the first time. The revenue for Google for the years 2010–2017 is shown in the following table (Statista website). Deflate the revenue in dollars based on the CPI (1982–1984 base period). Comment on the company’s revenue in deflated dollars.

An automobile dealer reports the Year 1 and Year 8 sales for three models in the following table. Compute quantity relatives and use them to develop a weighted aggregate quantity index for Year 8 using the two years of data.

Many factors influence the retail price of gasoline. The following table shows the average retail price for a gallon of regular grade gasoline for each year from 2014 through 2017 (U.S. Energy Information Administration website).

a. Use 2014 as the base year and develop a price index for the retail price of a gallon of regular grade gasoline over this four-year period.

b. Use 2016 as the base year and develop a price index for the retail price of a gallon of regular grade gasoline over this four-year period.

Boran Stockbrokers, Inc., selects four stocks for the purpose of developing its own index of stock market behavior. Prices per share for a Year 1 base period, January of Year 3, and March of Year 3 follow. Base-year quantities are set on the basis of historical volumes for the four stocks.

Use the Year 1 base period to compute the Boran index for January of Year 3 and March of Year 3. Comment on what the index tells you about what is happening in the stock market.

Suppose on average a male shaver in Year 1 bought one razor handle and used 17 razor blades in a year and that the price relatives for Year 1 to Year 11 are as shown in the following table. Develop a Male Shaver Expense Index based on weighted price relatives for Year 11.

The operations of seafood restaurant chains such as Red Lobster are sensitive to changes in the price of seafood. Quantity data for a regional seafood chain coupled with price data are in the following table (Statista website).

a. Compute a price relative for each type of seafood.

b. Compute a weighted aggregate price index for the regional seafood chain. Comment on the change in seafood expense over the 16-year period.

Actuaries are analysts who specialize in the mathematics of risk. Actuaries often work for insurance companies and are responsible for setting premiums for insurance policies. Below are the median salaries for actuaries and the yearly CPI for four years. Use the CPI to deflate the salary data to constant dollars. Comment on the salary when viewed in constant dollars.

The closing price of Walmart stock at the end of its fiscal year (end of January) for five years is given in the following table. The CPI for January of each year is also provided. Deflate the stock price series and comment on the financial performance of Walmart stock.

Williams Sonoma is a consumer retail company that sells kitchenware. Williams Sonoma has reported the quantity and product value information for three different glass tumblers in two different years in the table that follows. Compute a weighted aggregate quantity index for the data. Comment on what this quantity index means.

Based on data from the U.S. Census Bureau, a Pew Research study showed that the percentage of employed individuals ages 25–29 who are college educated is at an all-time high. The study showed that the percentage of employed individuals aged 25–29 with at least a bachelor’s degree in 2016 was 40%. In the year 2000, this percentage was 32%, in 1985 it was 25%, and in 1964 it was only 16% (Pew Research website).

a. What is the population being studied in each of the four years in which Pew has data?

b. What question was posed to each respondent?

c. Do responses to the question provide categorical or quantitative data?

A Gallup Poll utilizing a random sample of 1,503 adults ages 18 or older was conducted in April 2018. The survey indicated a majority of Americans (53%) say driverless cars will be common in the next 10 years. The question asked was:

Thinking about fully automated, “driverless cars,” cars that use technology to drive and do not need a human driver, based on what you have heard or read, how soon do you think driverless cars will be commonly used in the [United States]?

Figure 1.7 shows a summary of results of the survey in a histogram indicating the percentage of the total responses in different time intervals.

a. Are the responses to the survey question quantitative or categorical?

b. How many of the respondents said that they expect driverless cars to be common in the next 10 years?

c. How many respondents answered in the range 16–20 years?

Figure 1.8 provides a bar chart showing the annual advertising revenue for Facebook from 2010 to 2017 (Facebook Annual Reports).

a. What is the variable of interest?

b. Are the data categorical or quantitative?

c. Are the data time series or cross-sectional?

d. Comment on the trend in Facebook’s annual advertising revenue over time.

The U.S. Census Bureau tracks sales per month for various products and services through its Monthly Retail Trade Survey. Figure 1.9 shows monthly jewelry sales in millions of dollars for 2016.

a. Are the data quantitative or categorical?

b. Are the data cross-sectional or time series?

c. Which four months have the highest sales?

d. Why do you think the answers to part (c) might be the highest four months?

Pew Research Center is a nonpartisan polling organization that provides information about issues, attitudes, and trends shaping America. In a poll, Pew researchers found that 73% of teens aged 13–17 have a smartphone, 15% have a basic phone and 12% have no phone. The study also asked the respondents how they communicated with their closest friend. Of those with a smartphone, 58% responded texting, 17% social media and 10% phone calls. Of those with no smartphone, 25% responded texting, 29% social media and 21% phone calls (Pew Research Center website, October 2015).

a. One statistic (58%) concerned the use of texting to contact his/her closest friend, if the teen owns a smartphone. To what population is that applicable?

b. Another statistic (25%) concerned the use of texting by those who do not own a smartphone. To what population is that applicable?

c. Do you think the Pew researchers conducted a census or a sample survey to obtain their results? Why?

Consumer Reports evaluates products for consumers. The file CompactSUV contains the data shown in Table 1.8 for 15 compact sports utility vehicles (SUVs) from the 2018 model line (Consumer Reports website):

Make: manufacturer

Model: name of the model

Overall score: awarded based on a variety of measures, including those in this data set

Recommended: Consumer Reports recommends the vehicle or not

Owner satisfaction: satisfaction on a five-point scale based on the percentage of owners who would purchase the vehicle again (– –, –, 0, +, ++)

Overall miles per gallon: miles per gallon achieved in a 150-mile test trip

Acceleration (0–60 sec): time in seconds it takes vehicle to reach 60 miles per hour from a standstill with the engine idling

a. How many variables are in the data set?

b. Which of the variables are categorical, and which are quantitative?

c. What percentage of these 15 vehicles are recommended?

d. What is the average of the overall miles per gallon across all 15 vehicles?

e. For owner satisfaction, construct a bar chart similar to Figure 1.4.

f. Show the frequency distribution for acceleration using the following intervals: 7.0– 7.9, 8.0–8.9, 9.0–9.9, and 10.0–10.9. Construct a histogram similar to Figure 1.5.

Skechers U.S.A., Inc., is a performance footwear company headquartered in Manhattan Beach, California. The sales revenue for Skechers over a four-year period is as follows:

a. Are these cross-sectional or time-series data?

b. Construct a bar graph similar to Figure 1.2 B.

c. What can you say about how Skechers’ sales are changing over these four years?

**Figure 1.2 B**

The movie industry is a competitive business. More than 50 studios produce hundreds of new movies for theater release each year, and the financial success of each movie varies considerably. The opening weekend gross sales ($ millions), the total gross sales ($ millions), the number of theaters the movie was shown in, and the number of weeks the movie was in release are common variables used to measure the success of a movie released to theaters. Data collected for the top 100 theater movies released in 2016 are contained in the file Movies2016 (Box Office Mojo website). Table 2.20 shows the data for the first 10 movies in this file.

Managerial Report

Use the tabular and graphical methods of descriptive statistics to learn how these variables contribute to the success of a motion picture. Include the following in your report.

1. Tabular and graphical summaries for each of the four variables along with a discussion of what each summary tells us about the movies that are released to theaters.

2. A scatter diagram to explore the relationship between Total Gross Sales and Opening Weekend Gross Sales. Discuss.

3. A scatter diagram to explore the relationship between Total Gross Sales and Number of Theaters. Discuss.

4. A scatter diagram to explore the relationship between Total Gross Sales and Number of Weeks in Release. Discuss.

In alphabetical order, the six most common last names in the United States in 2018 are Brown, Garcia, Johnson, Jones, Smith, and Williams (United States Census Bureau website). Assume that a sample of 50 individuals with one of these last names provided the following data:

Summarize the data by constructing the following:

a. Relative and percent frequency distributions

b. A bar chart

c. A sorted bar chart

d. A pie chart

e. Based on these data, what are the three most common last names? Which type of chart makes this most apparent?
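The tabular summaries in part (a), and the ordering needed for the sorted bar chart in part (c), can be sketched as follows. The sample is a 10-name placeholder, not the 50 observations from the exercise.

```python
from collections import Counter

# Placeholder sample of 10 names, for illustration only.
sample = ["Smith", "Brown", "Smith", "Jones", "Garcia",
          "Smith", "Brown", "Williams", "Johnson", "Brown"]

counts = Counter(sample)
n = len(sample)
relative = {name: count / n for name, count in counts.items()}
percent = {name: 100 * freq for name, freq in relative.items()}

# Sorting by frequency gives the order of bars in a sorted bar chart.
for name, count in counts.most_common():
    print(f"{name:<10}{count:>3}{relative[name]:>7.2f}{percent[name]:>7.1f}%")
```

Listing categories from most to least frequent is what makes the top three names easy to read off a sorted bar chart.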

Nielsen Media Research tracks the top-rated television shows. The following data show the television network that produced each of the 25 top-rated shows in the history of television.

a. Construct a frequency distribution, percent frequency distribution, and bar chart for the data.

b. Which networks have done the best in terms of presenting top-rated television shows? Compare the performance of ABC, CBS, and NBC.

Many airlines use surveys to collect data on customer satisfaction related to flight experiences. After completing a flight, customers receive an email asking them to rate a variety of factors, including the reservation process, the check-in process, luggage policy, cleanliness of gate area, service by flight attendants, food/beverage selection, on-time arrival, and so on. Suppose that a five-point scale, with Excellent (E), Very Good (V), Good (G), Fair (F), and Poor (P), is used to record customer ratings. Assume that passengers on a Delta Airlines flight from Myrtle Beach, South Carolina, to Atlanta, Georgia, provided the following ratings for the question, “Please rate the airline based on your overall experience with this flight.” The sample ratings are shown below.

a. Use a percent frequency distribution and a bar chart to summarize these data. What do these summaries indicate about the overall customer satisfaction with the Delta flight?

b. The online survey questionnaire enabled respondents to explain any aspect of the flight that failed to meet expectations. Would this be helpful information to a manager looking for ways to improve the overall customer satisfaction on Delta flights? Explain.

Nearly 1.9 million bachelor’s degrees and over 758,000 master’s degrees are awarded annually by U.S. postsecondary institutions as of 2018 (National Center for Education Statistics website). The Department of Education tracks the field of study for these graduates in the following categories: Business (B), Computer Sciences and Engineering (CSE), Education (E), Humanities (H), Natural Sciences and Mathematics (NSM), Social and Behavioral Sciences (SBS), and Other (O). Consider the following samples of 100 graduates:

Bachelor’s Degree Field of Study

Master’s Degree Field of Study

a. Provide a percent frequency distribution of field of study for each degree.

b. Construct a bar chart for field of study for each degree.

c. What is the lowest percentage field of study for each degree?

d. What is the highest percentage field of study for each degree?

e. Which field of study has the largest increase in percentage from bachelor’s to master’s?

TripAdvisor is one of many online websites that provides ratings for hotels throughout the world. Ratings provided by 649 guests at the Lakeview Hotel can be found in the file HotelRatings. Possible responses were Excellent, Very Good, Average, Poor, and Terrible.

a. Construct a frequency distribution.

b. Construct a percent frequency distribution.

c. Construct a bar chart for the percent frequency distribution.

d. Comment on how guests rate their stay at the Lakeview Hotel.

e. Suppose that results for 1679 guests who stayed at the Timber Hotel provided the following frequency distribution.

Compare the ratings for the Timber Hotel with the results obtained for the Lakeview Hotel.

Consider the following frequency distribution.

Construct a cumulative frequency distribution and a cumulative relative frequency distribution.
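A minimal sketch of the computation this exercise asks for follows. The class labels and frequencies are hypothetical placeholders, since the exercise's table is not reproduced here.

```python
from itertools import accumulate

def cumulative_distributions(classes, frequencies):
    """Return rows of (class, cumulative frequency, cumulative relative frequency)."""
    total = sum(frequencies)
    running = accumulate(frequencies)  # running totals of the class frequencies
    return [(label, cf, cf / total) for label, cf in zip(classes, running)]

# Hypothetical classes and frequencies; not the exercise's data.
classes = ["10-19", "20-29", "30-39", "40-49"]
frequencies = [4, 6, 7, 3]
rows = cumulative_distributions(classes, frequencies)

for label, cf, crf in rows:
    print(f"up to end of {label}: {cf:3d}  {crf:.2f}")
```

The last row always has a cumulative frequency equal to the sample size and a cumulative relative frequency of 1, which is a quick check on the arithmetic.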

Based on the total passenger traffic, the airports in the following list are the 20 busiest airports in North America in 2018 (The World Almanac).

a. Which is the busiest airport in terms of total passenger traffic? Which is the least busy airport in terms of total passenger traffic?

b. Using a class width of 10, develop a frequency distribution of the data starting with 30–39.9, 40–49.9, 50–59.9, and so on.

c. Prepare a histogram. Interpret the histogram.
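The class construction in part (b) can be sketched as follows, assuming every value is at least the starting class limit. The traffic figures are hypothetical placeholders, not the airport data from the exercise.

```python
import math

def frequency_distribution(values, start, width):
    """Count values in classes [start, start+width), [start+width, start+2*width), ..."""
    n_classes = math.floor((max(values) - start) / width) + 1
    counts = [0] * n_classes
    for v in values:
        counts[math.floor((v - start) / width)] += 1
    # Pair each class's lower limit with its count, keeping empty classes.
    return [(start + i * width, c) for i, c in enumerate(counts)]

# Hypothetical values for illustration only.
traffic = [31.5, 44.2, 47.9, 52.0, 63.8, 39.9, 40.0, 107.4]
distribution = frequency_distribution(traffic, start=30, width=10)

for lower, count in distribution:
    # The row of '#' characters is a rough text histogram.
    print(f"{lower}-{lower + 10 - 0.1:.1f}: {'#' * count}")
```

Keeping the empty classes matters for the histogram in part (c), because gaps between bars are part of the shape being interpreted.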

The Flying Pig is a marathon (26.2 miles long) running race held every year in Cincinnati, Ohio. Suppose that the following data show the ages for a sample of 40 marathon runners.

a. Construct a stretched stem-and-leaf display.

b. Which age group had the largest number of runners?

c. Which age occurred most frequently?
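A stretched stem-and-leaf display splits each stem into two rows, one for leaves 0 to 4 and one for leaves 5 to 9. The sketch below shows one way to build it for two-digit integer ages; the ages are hypothetical placeholders, not the sample of 40 runners.

```python
from collections import defaultdict

def stretched_stem_and_leaf(values):
    """Return display lines with each stem split into a low (0-4) and high (5-9) row."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[(stem, leaf >= 5)].append(leaf)  # False = low half, True = high half
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for (stem, _), leaves in sorted(rows.items())
    ]

# Hypothetical ages for illustration only.
ages = [23, 27, 31, 34, 34, 38, 42, 45, 49, 51]
display = stretched_stem_and_leaf(ages)
print("\n".join(display))
```

This sketch omits half-stems that have no leaves; a display intended for presentation would normally print those rows empty so the shape is not distorted.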

The U.S. Department of Energy’s Fuel Economy Guide provides fuel efficiency data for cars and trucks (Fuel Economy website). A portion of the data from 2018 for 341 compact, midsize, and large cars is shown in Table 2.13. The data set contains the following variables:

Size: Compact, Midsize, and Large

Displacement: Engine size in liters

Cylinders: Number of cylinders in the engine

Drive: All wheel (A), front wheel (F), and rear wheel (R)

Fuel Type: Premium (P) or regular (R) fuel

City MPG: Fuel efficiency rating for city driving in terms of miles per gallon

Hwy MPG: Fuel efficiency rating for highway driving in terms of miles per gallon

The complete data set is contained in the file FuelData2018.

a. Prepare a crosstabulation of the data on Size (rows) and Hwy MPG (columns). Use classes of 20–24, 25–29, 30–34, 35–39, and 40–44 for Hwy MPG.

b. Comment on the relationship between Size and Hwy MPG.

c. Prepare a crosstabulation of the data on Drive (rows) and City MPG (columns). Use classes of 10–14, 15–19, 20–24, 25–29, and 30–34 for City MPG.

d. Comment on the relationship between Drive and City MPG.

e. Prepare a crosstabulation of the data on Fuel Type (rows) and City MPG (columns). Use classes of 10–14, 15–19, 20–24, 25–29, and 30–34 for City MPG.

f. Comment on the relationship between Fuel Type and City MPG.
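The crosstabulations in parts (a), (c), and (e) all pair a categorical row variable with a binned MPG column variable. A minimal sketch of that counting step is shown below; the records are hypothetical placeholders, not rows from FuelData2018, and the helper name `crosstab` is illustrative.

```python
from collections import Counter

def crosstab(row_labels, values, class_start, class_width):
    """Count (row label, class lower limit) pairs for integer-valued data."""
    counts = Counter()
    for label, v in zip(row_labels, values):
        lower = class_start + class_width * ((v - class_start) // class_width)
        counts[(label, lower)] += 1
    return counts

# Hypothetical records for illustration only.
sizes = ["Compact", "Compact", "Midsize", "Large", "Midsize", "Large"]
hwy_mpg = [33, 28, 31, 24, 36, 27]
table = crosstab(sizes, hwy_mpg, class_start=20, class_width=5)

for size in ["Compact", "Midsize", "Large"]:
    cells = [table[(size, lower)] for lower in range(20, 45, 5)]
    print(f"{size:8s}", cells)  # columns: 20-24, 25-29, 30-34, 35-39, 40-44
```

Because `Counter` returns 0 for missing keys, empty cells in the table print as zeros without any special handling.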

The SAT is a standardized test used by many colleges and universities in their admission decisions. More than one million high school students take the SAT each year. The current version of the SAT includes three parts: reading comprehension, mathematics, and writing. A perfect combined score for all three parts is 2400. A sample of SAT scores for the combined three-part SAT is as follows:

a. Show a frequency distribution and histogram. Begin with the first class starting at 800 and use a class width of 200.

b. Comment on the shape of the distribution.

c. What other observations can be made about the SAT scores based on the tabular and graphical summaries?

Consumer complaints are frequently reported to the Better Business Bureau (BBB). Some industries against whom the most complaints are reported to the BBB are banks; cable and satellite television companies; collection agencies; cellular phone providers; and new car dealerships (USA Today). The results for a sample of 200 complaints are contained in the file BBB.

a. Show the frequency and percent frequency of complaints by industry.

b. Construct a bar chart of the percent frequency distribution.

c. Which industry had the highest number of complaints?

d. Comment on the percentage frequency distribution for complaints.

Electric plug-in vehicle sales have been increasing worldwide. The table below displays data collected by the U.S. Department of Energy on electric plug-in vehicle sales in the world’s top markets in 2013 and 2015. (Data compiled by Argonne National Laboratory, U.S. Department of Energy website, https://www.energy.gov/eere/vehicles/fact-918-march-28-2016-global-plug-light-vehicle-sales-increased-about-80-2015)

a. Construct a side-by-side bar chart with year as the variable on the horizontal axis. Comment on any trend in the display.

b. Convert the above table to percentage allocation for each year. Construct a stacked bar chart with year as the variable on the horizontal axis.

c. Is the display in part (a) or part (b) more insightful? Explain.

University endowments are financial assets that are donated by supporters to be used to provide income to universities. There is a large discrepancy in the size of university endowments. The following table provides a listing of many of the universities that have the largest endowments as reported by the National Association of College and University Business Officers in 2017.

Summarize the data by constructing the following:

a. A frequency distribution (classes 0–1.9, 2.0–3.9, 4.0–5.9, 6.0–7.9, and so on).

b. A relative frequency distribution.

c. A cumulative frequency distribution.

d. A cumulative relative frequency distribution.

e. What do these distributions tell you about the endowments of universities?

f. Show a histogram. Comment on the shape of the distribution.

g. What is the largest university endowment and which university holds it?
