# Question

R&D Expenses (introduced in Chapter 19). This data file contains a variety of accounting and financial values that describe companies operating in the information and professional services sectors of the economy. One column gives the expenses on research and development (R&D), and another gives the total assets of the companies. Both of these columns are reported in millions of dollars. This data table expands the version introduced in Chapter 19 by adding data for professional services. To estimate regression models, we need to transform both expenses and assets to a log scale.

(a) Plot the log of R&D expenses on the log of assets for both sectors together in one scatterplot. Use color-coding or distinct symbols to distinguish the groups. Does it appear that the relationship is different in these two sectors, or can you capture the association with a single simple regression?

A common question asked when fitting models to subsets is “Do the equations for the two groups differ from each other?” For example, does the equation for the information sector differ from the equation for professional services? We’ve been answering this question informally, using the t-statistics for the slopes of the dummy variable and interaction. There’s just one small problem: We’re using two tests to answer one question. What’s the chance for a false-positive error? If you’ve got one question, better to use one test.

To see if there’s any difference, we can use a variation on the F-test for $R^2$. The idea is to test both slopes at once rather than separately. The method uses the change in the size of $R^2$. If the $R^2$ of the model increases by a statistically significant amount when we add both the dummy variable and interaction to the model, then something changed and the model is different. The form of this incremental, or partial, F-test is

$$F = \frac{(\text{Change in } R^2) / (\text{number of added slopes})}{(1 - R^2_{\text{full}}) / (n - k_{\text{full}} - 1)}$$

In this formula, $k_{\text{full}}$ denotes the number of variables in the model with the extra features, including dummy variables and interactions, and $R^2_{\text{full}}$ is the $R^2$ for that model. As usual, an F-statistic larger than about 4 counts as a big value.
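The arithmetic in the incremental F formula can be sketched as a small Python function. This is a minimal illustration, not part of the exercise: the function name `incremental_f`, its argument names, and the numbers passed to it are all hypothetical, chosen only to show how the pieces of the formula fit together.

```python
def incremental_f(r2_reduced, r2_full, n, k_full, added):
    """Incremental (partial) F-statistic for adding `added` slopes to a model.

    r2_reduced : R^2 of the model without the extra slopes
    r2_full    : R^2 of the model with the extra slopes
    n          : number of cases
    k_full     : number of explanatory variables in the full model
    added      : number of slopes added to get from reduced to full
    """
    if not 0 <= r2_reduced <= r2_full < 1:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    # Change in R^2 per added slope.
    numerator = (r2_full - r2_reduced) / added
    # Unexplained share of variation per residual degree of freedom.
    denominator = (1 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Hypothetical values (not from the data file), used only to show the call.
f_stat = incremental_f(r2_reduced=0.60, r2_full=0.62, n=104, k_full=3, added=2)
print(round(f_stat, 2))
```

With these made-up inputs the statistic falls below the rough threshold of 4, so the added slopes would not be judged to improve the fit by a statistically significant amount.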

(b) Add a dummy variable (coded as 0 for information companies and 1 for those in professional services) and its interaction with Log Assets to the model. Does the fit of this model meet the conditions for the MRM? Comment on the consequences of any problem that you identify.

(c) Assuming that the model meets the conditions for the MRM, use the incremental F-test to assess the size of the change in $R^2$. Does the test agree with your visual impression? (The value of $k_{\text{full}}$ for the model with dummy and interaction is 3, with 2 slopes added. You will need to fit the simple regression of Log R&D Expenses on Log Assets to get the $R^2$ from this model.)

(d) Summarize the fit of the model that best captures what is happening in these two sectors.
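The two models compared in parts (b) through (d) can be sketched in Python with NumPy. Everything below is an assumption made for illustration: the data are synthetic stand-ins generated from made-up coefficients (the real data file is not reproduced here), and the helper `fit` and the variable names are hypothetical. The sketch shows how the dummy variable and its interaction with log assets enter the design matrix, and how the fitted coefficients translate into one equation per sector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the data file (NOT the real data): log assets,
# a sector dummy (0 = information, 1 = professional services), log R&D.
n = 200
log_assets = rng.normal(6.0, 1.5, size=n)
dummy = rng.integers(0, 2, size=n)
log_rd = 0.5 + 0.8 * log_assets - 0.4 * dummy + rng.normal(0.0, 0.5, size=n)


def fit(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, R^2)."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return coefs, r2


# Simple regression: one equation shared by both sectors.
_, r2_simple = fit([log_assets], log_rd)

# Full model: the dummy shifts the intercept, the interaction shifts the slope.
coefs, r2_full = fit([log_assets, dummy, dummy * log_assets], log_rd)
b0, b1, b2, b3 = coefs

print(f"R^2 simple = {r2_simple:.3f}, R^2 full = {r2_full:.3f}")
print(f"information:           intercept {b0:.2f}, slope {b1:.2f}")
print(f"professional services: intercept {b0 + b2:.2f}, slope {b1 + b3:.2f}")
```

Because the simple regression is nested inside the full model, the full model's $R^2$ can never be smaller; the question the incremental F-test answers is whether the increase is large enough to matter. Checking the MRM conditions in part (b) would still require residual plots, which this sketch omits.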

