Statistical Methods For The Social Sciences 5th Edition Alan Agresti - Solutions
True or false? (a) Adjusted R^2 can possibly decrease when an explanatory variable is added to a regression model. (b) Possible effects of an influential observation include changing a correlation from positive to negative, a P-value from 0.01 to 0.99, and R^2 from 0.01 to 0.99. (c) When
Evidence of multicollinearity exists in a multiple regression fit when (a) Strong intercorrelations occur among explanatory variables. (b) The R^2-value is very large. (c) The F test of H0: β1 = · · · = βk = 0 has a small P-value, but the individual t tests of H0: β1 = 0, . . . , H0: βk = 0 do
Forward selection and stepwise regression are similar in the sense that, if they have the same α-level for testing a term, (a) They always select the same final regression model. (b) They always select the same initial regression model (when they enter the first explanatory variable). (c) Any variable
The log transformation of the mean response in regression is useful when (a) E(y) is approximately a logarithmic function of x. (b) E(y) is approximately an exponential function of x. (c) log E(y) is approximately a linear function of x. (d) Unit changes in x have a multiplicative, rather than additive,
In the model E(y) = α + β1x + β2x^2, the coefficient β2 (a) Is the mean change in y as x^2 is increased one unit with x held constant. (b) Is a curvature coefficient that describes whether the regression equation is bowl shaped or mound shaped. (c) Equals 0 if the relationship between y and x is
You invest $1000 in an account with interest compounded annually at 10%. (a) How much money do you have after x years? (b) How long does it take your savings to double in size?
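The compounding in this exercise can be sketched numerically. The following is a minimal illustration, not the textbook's solution; the function names `balance` and `doubling_time` are hypothetical helpers introduced here.

```python
import math

def balance(principal: float, rate: float, years: float) -> float:
    """Balance after `years` with interest compounded annually at `rate`."""
    return principal * (1 + rate) ** years

def doubling_time(rate: float) -> float:
    """Solve (1 + rate)^x = 2 for x, treating x as continuous."""
    return math.log(2) / math.log(1 + rate)

# $1000 at 10% per year, as in the exercise
for x in range(0, 9):
    print(x, round(balance(1000, 0.10, x), 2))

print(round(doubling_time(0.10), 2))
```

Treating x as continuous gives a doubling time of a little over 7 years; because interest is credited only once a year, the balance first exceeds $2000 at the end of year 8.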
Example 14.8 showed a predicted U.S. population size (in millions) x decades after 1890 of ŷ = 73.175(1.130)^x. (a) Show this is equivalent to 1.23% predicted growth per year. [Hint: (1.0123)^10 = 1.130.] (b) Explain why the predicted U.S. population size x years after 1890 is 73.175(1.0123)^x.
A recent newspaper article quoted a planner in a Florida city as saying, “This city has been growing at the rate of 4.2% per year. That’s not slow growth by any means. It corresponds to 42% growth per decade.” Explain what is incorrect about this statement. If, in fact, the current population
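The planner's arithmetic can be checked with a short calculation. This is a sketch under the assumption that the 4.2% annual rate compounds multiplicatively; `compound_growth` is a hypothetical helper name.

```python
def compound_growth(annual_rate: float, years: int) -> float:
    """Total proportional growth after `years` of compounding at `annual_rate`."""
    return (1 + annual_rate) ** years - 1

decade = compound_growth(0.042, 10)   # compounded growth over ten years
naive = 0.042 * 10                    # the planner's additive figure
print(f"compounded: {decade:.1%}, additive: {naive:.1%}")
```

Compounded growth over a decade comes to roughly 51%, noticeably more than the 42% obtained by multiplying the annual rate by 10.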
Using the formula s / [s_j √((n − 1)(1 − R_j^2))] for the standard error of the estimator of βj in multiple regression, explain how precision of estimation is affected by (a) Multicollinearity. (b) The conditional variability of the response variable. (c) The variability of the explanatory variables. (d)
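To see how each ingredient of the standard error formula moves the result, here is a small sketch. It assumes the formula reads s / [s_j √((n − 1)(1 − R_j^2))], where s is the residual standard deviation, s_j is the sample standard deviation of x_j, and R_j^2 comes from regressing x_j on the other explanatory variables. All numeric inputs are made up for illustration.

```python
import math

def se_slope(s: float, s_xj: float, n: int, r2_j: float) -> float:
    """Standard error of the estimate of beta_j under the assumed formula."""
    return s / (s_xj * math.sqrt((n - 1) * (1 - r2_j)))

# Hypothetical values: same data spread, with and without multicollinearity
base = se_slope(s=10.0, s_xj=2.0, n=101, r2_j=0.0)
collinear = se_slope(s=10.0, s_xj=2.0, n=101, r2_j=0.75)
print(base, collinear, collinear / base)
```

With these inputs, raising R_j^2 from 0 to 0.75 doubles the standard error, since 1/√(1 − 0.75) = 2.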
You plan to model coital frequency in the previous month as a function of age, for a sample of subjects with ages between 20 and 90. For the ordinary bivariate model, explain what might be inappropriate about the (a) constant standard deviation assumption, (b) straight-line assumption. State a model
Give an example of two variables that you expect to have a nonlinear relationship. Describe the pattern you expect for the relationship. Explain how to model that pattern.
A sociologist’s first reaction upon studying automated variable selection routines was that they had the danger of leading to “crass empiricism” in theory building. From a theoretical perspective, describe the dangers with such methods. What guidelines would you suggest for avoiding these
Give an example of a response variable and a pair of explanatory variables for which an automated variable selection procedure would probably produce a model with only one explanatory variable. Explain.
For the UN2 data file at the text website, using methods of this chapter, (a) Find a good model relating x = per capita GDP to y = life expectancy. (Hint: What does a plot of the data suggest?) (b) Find a good prediction equation for y = fertility. Explain how you selected variables for the model.
Table 14.15 shows the population size of Florida, by decade from 1830 to 2010. Analyze these data, which are the data file FloridaPop at the text website. Explain why a linear model is reasonable for the restricted period 1970–2010.
For the Mental data file at the text website and the model predicting mental impairment using life events and SES, conduct an analysis of residuals and influence diagnostics.
Analyze the Crime data set at the text website, deleting the observation for D.C., with y = violent crime rate. Use methods of this chapter. Prepare a report describing the analyses and diagnostic checks that you conducted, and indicate how you selected a model. Interpret results.
Refer to the data file the class created in Exercise 1.12. Select a response variable, pose a research question, and build a model using other variables in the data set. Interpret and summarize your findings.
Refer to the Students data file (Exercise 1.11). (a) Conduct and interpret a regression analysis using y = political ideology, selecting predictors from the variables in that file. Prepare a report describing the research question(s) posed and analyses and diagnostic checks that you conducted, and
Consider the fertility and GDP data in Table 14.6, from the FertilityGDP data file. (a) Using GLM software, fit the exponential regression model, assuming fertility rate has a (i) normal, (ii) gamma distribution. Interpret the effect of GDP on fertility rate for the gamma fit. (b) What advantages
For white men in the United States, Table 14.14 presents the number of deaths per thousand individuals of a fixed age within a period of a year. (a) Plot x = age against y = death rate and against log y. What do these plots suggest about a good model for the relationship? (b) Find the correlation
Consider the formula ŷ = 4(2)^x. (a) Plot ŷ for integer x between 0 and 5. (b) Plot log_e(ŷ) against x. Report the intercept and slope of this line.
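Taking natural logs of ŷ = 4(2)^x gives log_e(ŷ) = log_e(4) + x log_e(2), which is a straight line in x. The sketch below tabulates both scales rather than plotting them; the variable names are chosen here for illustration.

```python
import math

xs = list(range(0, 6))
y_hat = [4 * 2 ** x for x in xs]          # exponential curve
log_y = [math.log(y) for y in y_hat]      # natural log of each prediction

intercept = log_y[0]                      # value at x = 0, i.e. log_e(4)
slope = log_y[1] - log_y[0]               # change per unit x, i.e. log_e(2)

for x, y, ly in zip(xs, y_hat, log_y):
    print(x, y, round(ly, 3))
print(round(intercept, 3), round(slope, 3))
```

The successive differences in the log column are constant, which is the numerical signature of exponential growth.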
Draw rough sketches of the following mathematical functions on the same set of axes, for x between 0 and 35. (a) ŷ = 6(1.02)^x. (ŷ = predicted world population size in billions x years after 2000, if there is a 2% rate of growth every year.) (b) ŷ = 6(0.95)^x. What does this represent? (c) Use
For United Nations data on y = world population size (billions) between 1900 and 2010, the exponential regression model with x = number of years since 1900 gives ŷ = 1.4193(1.014)^x. (a) Explain why the model fit corresponds to a rate of growth of 1.4% per year. (b) Show that the predicted population
For data shown in the article “Wikipedia: Modelling Wikipedia’s growth” at en.wikipedia.org, the number of English language articles in Wikipedia was well approximated from 2001 to 2008 by ŷ = 22,700(2.1)^x, where x is the time (in years) since January 1, 2001. (a) Interpret the values 22,700
For data from 2005 to 2011 from Facebook on y = number of people (in millions) worldwide using Facebook, the prediction equation ŷ = 2.13(2.72)^x fits well, where x = number of years since January 1, 2005. (a) Predict the number using the Internet at the beginning of (i) 2005 (take x = 0), (ii)
The Crime2 data file at the text website illustrates how a single observation can be highly influential in determining whether the model should allow nonlinearity. (a) With all 51 observations, fit the quadratic model between y = murder rate and x = percentage in poverty. Test whether the quadratic
Refer to the previous exercise. (a) Using size as a straight-line predictor, r^2 = 0.695, whereas R^2 = 0.704 for the quadratic model. Is the degree of nonlinearity major, or minor? Is the linear association strong, or weak? (b) Test whether the quadratic model gives a significantly better fit than the
For the Houses data file, Table 14.13 shows results of fitting a quadratic regression model with s = size as the predictor. (a) Interpret the coefficients of this equation. What shape does it have? (b) Find the predicted selling price for homes with (i) s = 1000 square feet, (ii) s = 2000 square feet,
Sketch the following mathematical functions on the same set of axes, for values of x between 0 and 4. Use these curves to describe how the coefficients of x and x^2 affect their shape. (a) ŷ = 10 + 4x (b) ŷ = 10 + 4x + x^2 (c) ŷ = 10 + 4x − x^2 (d) ŷ = 10 − 4x (e) ŷ = 10 − 4x + x^2 (f) ŷ
Table 14.12 shows the results of fitting two models to 54 observations on y = mental health score, x1 = degree of social interaction, and x2 = SES. The variables x1 and x2 are measured on scales of 0–100, and larger y-scores represent better mental health. The variable symbol x1**2 represents
Refer to the data from Example 14.7 on fertility rates and GDP (page 441). To allow for greater variation at higher values of mean fertility, fit a quadratic GLM with a gamma distribution for fertility rate and the identity link function. Find the GDP value at which predicted fertility rate takes its
Refer to the plot of residuals in Figure 14.13 for Exercise 14.6. (a) Explain why a more valid fit may result from assuming that income has a gamma distribution, rather than a normal distribution. (b) Table 14.11 shows results for the normal GLM and the gamma GLM. Summarize how results differ for the
For a data set for 100 adults on y = height, x1 = length of left leg, and x2 = length of right leg, the model E(y) = α + β1x1 + β2x2 is fitted. Neither H0: β1 = 0 nor H0: β2 = 0 has a P-value below 0.05. (a) Does this imply that length of leg is not a good predictor of height? Why? (b) Does this
Three variables have population correlations ρ_x1x2 = 0.85, ρ_yx1 = 0.65, and ρ_yx2 = 0.65. For these, the partial correlations are ρ_yx1·x2 = ρ_yx2·x1 = 0.244. In a sample, r_x1x2 = 0.90, r_yx1 = 0.70, and r_yx2 = 0.60, not far from the population values. For these, the sample partial correlations are
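The stated population value can be reproduced with the standard first-order partial correlation formula, (r_y1 − r_y2 r_12) / √((1 − r_y2^2)(1 − r_12^2)). The sketch below applies it to the correlations given in the exercise; `partial_corr` is a hypothetical helper name, not a library function.

```python
import math

def partial_corr(r_y1: float, r_y2: float, r_12: float) -> float:
    """Partial correlation between y and x1, controlling for x2."""
    return (r_y1 - r_y2 * r_12) / math.sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))

# Population correlations from the exercise
pop = partial_corr(0.65, 0.65, 0.85)

# Sample correlations from the exercise
samp_y1 = partial_corr(0.70, 0.60, 0.90)   # y and x1, controlling for x2
samp_y2 = partial_corr(0.60, 0.70, 0.90)   # y and x2, controlling for x1

print(round(pop, 3), samp_y1, samp_y2)
```

Running this shows how small shifts in the correlations can move the two sample partial correlations far apart when x1 and x2 are highly correlated.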
For the Houses2 data file, fit the model to y = selling price using house size, whether the house is new, and their interaction. (a) Show that the interaction term is highly significant. (b) Show that observation 5 is highly influential in affecting the fit in (a). (c) Show that the interaction effect
In Exercise 14.3, backward elimination and forward selection choose the model with explanatory variables SIZE, BATHS, and NEW. (a) Fit this model with the Houses2 data set. Inspect the leverages and the DFFIT and DFBETA values for SIZE. Refit the model without the three highly influential
For the Crime2 data file at the text website, fit the linear regression model with y = violent crime rate and x = percentage living in metropolitan areas, for all 51 observations. (a) Plot the studentized residuals. Are there any clear outliers? (b) Identify any observations with noticeable
For the data for 21 nations in the UN2 data file at the text website that are not missing observations on literacy, Table 14.10 shows various diagnostics from fitting the multiple regression model relating fertility (mean number of births per woman) to literacy rate and women’s economic
Figure 14.13 is a plot of the residuals versus the predicted y-values for the model discussed in Example 13.1 (page 390) relating income to education and racial–ethnic group. What does this plot suggest?
Use software with the Crime2 data file at the text website, excluding the observation for D.C. Let y = murder rate. For the five explanatory variables in that data file (excluding violent crime rate), with α = 0.10 in tests, (a) Use backward elimination to select a model. Interpret the result. (b)
Refer to the previous exercise. Using software with these four predictors, find the model that would be selected using the criterion (a) adjusted R^2, (b) PRESS, (c) AIC.
For the Houses2 data file at the text website, Table 14.9 shows a correlation matrix and a model fit using four predictors of selling price. With these four predictors, (a) For backward elimination, which variable would be deleted first? Why? (b) For forward selection, which variable would be added
Table 11.23 (page 347) showed results of a multiple regression using nine predictors of the quality of life in a country. (a) In backward elimination with these nine predictors, can you predict which variable would be deleted (i) first? (ii) second? Explain. (b) In forward selection with these nine
For Example 11.2 (page 312) on y = mental impairment, x1 = life events, and x2 = SES, the multiple regression model has output
             Coef.    Std. Error   t         Sig.
(Constant)   28.230   2.174        12.984    .000
LIFE         .103     .032         3.177     .003
SES          -.097    .029         -3.351    .002
and the model allowing interaction has output Coef. Std.
A recent study9 examined the role of family structure in the financial support parents provide for their children’s college education. Using data for 5070 children from 1519 families from the Health and Retirement Study, one aspect of the study modeled the parents’ financial support of tuition
Summarize advantages of using a linear mixed model to analyze repeated-measures data, compared to using standard repeated-measures ANOVA.
Explain what is meant by the term mixed model, and explain the distinction between a fixed effect and a random effect.
Explain the reason for entering random effects into a regression model. Describe a study in which it would be helpful to use this approach.
13.33.* Using the graphical representation in Figure 13.10, explain why ȳ'_i = ȳ_i + b(x̄ − x̄_i), where b is the estimated slope. So, when b > 0, ȳ_i is adjusted upward if x̄ > x̄_i and adjusted downward if x̄ < x̄_i.
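The direction of the adjustment can be illustrated with made-up numbers. The sketch below assumes the adjusted mean for group i is the group mean plus b times (overall covariate mean minus group covariate mean); the inputs are hypothetical and do not come from the text.

```python
def adjusted_mean(ybar_i: float, xbar_i: float, xbar: float, b: float) -> float:
    """Group mean of y adjusted to the overall covariate mean xbar."""
    return ybar_i + b * (xbar - xbar_i)

# Hypothetical groups with the same unadjusted mean and slope b = 1.5
below = adjusted_mean(ybar_i=20.0, xbar_i=10.0, xbar=12.0, b=1.5)  # group mean of x below overall mean
above = adjusted_mean(ybar_i=20.0, xbar_i=14.0, xbar=12.0, b=1.5)  # group mean of x above overall mean
print(below, above)
```

With a positive slope, the group whose covariate mean lies below the overall mean is adjusted upward, and the group whose covariate mean lies above it is adjusted downward.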
13.32.* Suppose we use a centered variable for the covariate and express the interaction model when the categorical factor has two categories as E(y) = α + β1(x − μx) + β2z + β3(x − μx) × z. Explain how to interpret β2, and explain how this differs from the interpretation for the model
Summarize the differences in purpose of a one-way analysis of variance and an analysis of covariance.
In the United States, the mean annual income for blacks (μ1) is smaller than for whites (μ2), the mean number of years of education is smaller for blacks than for whites, and annual income is positively related to number of years of education. Assuming that there is no interaction, the difference
In the model E(y) = α + β1x + β2z, where z = 1 for females and z = 0 for males, (a) The categorical factor has two categories. (b) One line has slope β1 and the other has slope β2. (c) β2 is the difference between the mean of y for females and males. (d) β2 is the difference between the mean of y
For a regression model fitted to annual income (thousands of dollars) using predictors age and marital status, Table 13.23 shows the sample mean incomes and the adjusted means. How could the adjusted means be so different from the unadjusted means? Draw a sketch to help explain.
Draw a scatterplot with sets of points representing two groups such that H0: equal means would be rejected in a one-way ANOVA but not in an analysis of covariance.
Let y = death rate and x = mean age of residents, measured for each county in Louisiana and in Florida. Sketch a hypothetical scatterplot, identifying points for each state, when the mean death rate is higher in Florida than in Louisiana when mean age is ignored but lower when it is controlled.
In analyzing GSS data relating y = frequency of having sex in the past year to frequency of going to bars, DeMaris (2004, p. 62) noted that the slope for unmarried subjects is more than double the slope for married subjects. Introducing notation, state a model that you think would be appropriate.
You have two groups, and you want to compare their regressions of y on x, to test the hypothesis that the true slopes are identical for the two groups. Explain how to do this using regression modeling.
For the Crime2 data file at the text website, let z be a dummy variable for whether a state is in the South, with z = 1 for AL, AR, FL, GA, KY, LA, MD, MS, NC, OK, SC, TN, TX, VA, WV. (a) Not including the observation for D.C., analyze the relationship between y = violent crime rate and z, both
Analyze the Houses2 data file at the text website by modeling selling price in terms of size of house and whether it is new. (a) Fit the model allowing interaction, and test whether the interaction term is needed in the model. (b) Construct a scatterplot, identifying the points by whether the home
You plan a study of factors associated with fertility (a woman’s number of children) in a Latin American city. Of particular interest is whether migrants from other cities or migrants from rural areas differ from natives of the city in their family sizes. The groups to be compared are urban natives,
Table 13.21 shows output for GSS data with y = index of attitudes toward premarital, extramarital, and homosexual sex, for which higher scores represent more permissive attitudes. The categorical explanatory variables are race (0 for whites, 1 for blacks), gender (0 for males, 1 for females), region
For the 2014 GSS, Table 13.20 shows estimates (with se values in parentheses) for four regression models for y = political party identification in the United States, scored from 1 = strong Democrat to 7 = strong Republican. The explanatory variables are sex (0 = male, 1 = female), race (0 = white, 1
An article8 on predicting attitudes toward homosexuality modeled a response variable with a four-point scale in which homosexual relations were scaled from 1 = always wrong to 4 = never wrong, with x1 = education (in years), x2 = age, x3 = political conservative (1 = yes, 0 = no), x4 = religious
Refer to the OECD data file at the text website, shown in Table 3.13 (page 58). Pose a research question about how the human development index and whether a nation is in Europe relate to carbon dioxide emissions. Conduct appropriate analyses to address that question, and prepare a report summarizing
Refer to the data file your class created in Exercise 1.12. For variables chosen by your instructor, use regression analysis as the basis of descriptive and inferential statistical analyses. Summarize your findings in a report in which you state the research question posed and describe and
Refer to the Students data file (Exercise 1.11). Using software, prepare a report presenting graphical, descriptive, and inferential analyses with (a) y = political ideology and the predictors religiosity and whether a vegetarian. (b) y = college GPA with predictors high school GPA, gender, and
Refer to the regression modeling of the family-clustered data in Table 13.13. Add to the Family data file the data for family 9, who had (y, x1, x2) values (0, 2, 0) and (1, 2, 1). Fit the linear mixed model to all the data, and interpret results.
Exercise 13.1 reported the regression equation relating y = education to race (z = 1 for whites) and to father’s education (x) of E(y) = 3 + 0.8x − 0.6z. The means ȳ = 11 for nonwhites, ȳ = 13 for whites, and overall ȳ = 12. (a) Find the adjusted mean educational levels for whites and
Table 13.1 did not report the observations for 10 Asian Americans. Their (x, y) values were
Subject     1   2   3   4   5   6   7   8   9   10
Education   16  14  12  18  13  12  16  16  14  10
Income      70  42  24  56  32  38  58  82  36  20
(a) Conduct the analyses for the no-interaction model shown in Sections 13.2 and 13.4, after adding
Refer to the previous exercise. The means of percentage registered for the three categories are x̄1 = 76.2, x̄2 = 49.5, and x̄3 = 39.7. The overall mean x̄ = 60.4. (a) Find the adjusted mean of the percentage voting for Anglos. Compare it to the unadjusted mean of 52.3, and interpret. (b) Sketch a
The software outputs in Table 13.19 show results of fitting two models to data from a study of the relationship between y = percentage of adults voting, percentage of adults registered to vote, and racial–ethnic representation, for a random sample of 40 precincts in the state of Texas for a
Using software, replicate all the analyses shown in Sections 13.1 and 13.2 using the Income data file at the text website.
For the previous exercise, Table 13.18 shows results of fitting the model allowing interaction. (a) Report the lines relating the predicted selling price to the size for homes that are (i) new, (ii) not new. (b) Find the predicted selling price for a home of 3000 square feet that is (i) new, (ii) not
For the Houses data file at the text website, Table 13.17 shows results of modeling y = selling price (in dollars) in terms of size of home (in square feet) and whether the home is new (1 = yes; 0 = no). (a) Report and interpret the prediction equation, and form separate equations relating selling
Consider the results in the previous exercise. (a) Marital status has three estimates. Dividing the coefficient of the divorced dummy variable by its standard error yields a t statistic. What hypothesis does it test? (b) What would you need to do to test the effect of marital status (all categories
Based on a national survey, Table 13.16 shows results of a prediction equation for y = alcohol consumption, measured as the number of alcoholic drinks the subject drank during the past month. (a) For x = alcohol consumption three years ago and dummy variables f for whether father died in the past
For 2014 data, the GSS website yields the prediction equation ŷ = 9.59 + 0.166x1 + 0.347x2 for y = highest year of school completed, x1 = sex (1 = male, 2 = female), and x2 = highest year of mother’s education completed. (a) Interpret the estimated partial effects. (b) A more usual dummy coding for
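How the coding of sex affects the equation can be sketched in code. Part (b) is cut off in this listing, so the recoding below is an assumption: it supposes a dummy z = x1 − 1 (0 = male, 1 = female). Under that assumption only the intercept changes, and the predictions are identical.

```python
def predict_original(x1: int, x2: float) -> float:
    """Prediction with sex coded 1 = male, 2 = female, as in the exercise."""
    return 9.59 + 0.166 * x1 + 0.347 * x2

def predict_dummy(z: int, x2: float) -> float:
    """Same fit with the assumed recoding z = x1 - 1 (0 = male, 1 = female)."""
    return (9.59 + 0.166) + 0.166 * z + 0.347 * x2

# Compare the two codings at a few hypothetical values of mother's education
for x1 in (1, 2):
    for x2 in (8, 12, 16):
        print(x1, x2, round(predict_original(x1, x2), 3), round(predict_dummy(x1 - 1, x2), 3))
```

The slope for sex stays at 0.166 in both versions; the recoding just absorbs one unit of it into the intercept.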
A regression analysis for the 100th Congress predicted the proportion of each representative’s votes on abortion issues that took the “pro-choice” position.7 The prediction equation was ŷ = 0.350 + 0.011id + 0.094r + 0.005nw + 0.005inc + 0.063s − 0.167p, where r = religion (1 for
Table 3.9 on page 53 showed data for several nations on y = CO2 emissions (in metric tons per capita) and x = per capita GDP (in thousands of dollars). Let z = whether the nation is in Europe (1 = yes, 0 = no). (a) The prediction equation for the effect of z is ŷ = 10.61 − 2.48z. Interpret the
The regression equation relating y = education (number of years completed) to race (z = 1 for whites, z = 0 for nonwhites) in a certain country is E(y) = 11 + 2z. The regression equation relating education to race and to father’s education (x) is E(y) = 3 + 0.8x − 0.6z. (a) Ignoring father’s
12.44.* This exercise motivates the formula for the between-groups variance estimate in one-way ANOVA. Suppose the sample sizes all equal n and the population means all equal μ. The sampling distribution of each ȳ_i then has mean μ and variance σ^2/n. The sample mean of the ȳ_i values is ȳ. (a)
Use the ANOVA applet at www.artofstat.com/webapps.html to illustrate how between-groups and within-groups variability affect the result of the ANOVA F test. Print results of two scenarios that result in relatively large and relatively small P-values.
Interaction terms are needed in a two-way ANOVA model when (a) Each pair of variables is associated. (b) Both explanatory variables have significant effects in the model without interaction terms. (c) The difference in means between two categories of one explanatory variable varies greatly among the
For four means, a multiple comparison method provides 95% confidence intervals for the differences between the six pairs. Then (a) For each confidence interval, there is a 0.95 chance that it contains the population difference. (b) P(all six confidence intervals are correct) = 0.70. (c) P(all six
One-way ANOVA provides relatively more evidence that H0: μ1 = · · · = μg is false (a) The smaller the between-groups variation and the larger the within-groups variation. (b) The smaller the between-groups variation and the smaller the within-groups variation. (c) The larger the between-groups
Analysis of variance and regression are similar in the sense that (a) They both assume a quantitative response variable. (b) They both have F tests for testing that the response variable is statistically independent of the explanatory variable(s). (c) For inferential purposes, they both assume that
True or false? Suppose that for subjects aged under 50, there is little difference in mean annual medical expenses for smokers and nonsmokers, but for subjects aged over 50 there is a large difference. Then, there is no interaction between smoking status and age in their effects on annual medical
Refer to Exercise 12.20. The students were also asked about their attitudes toward abortion. Each received a score according to how many from a list of eight possible reasons for abortion she would accept as a legitimate reason for a woman to seek abortion. Table 12.34 displays the scores,
The 25 women faculty in the humanities division of a college have a mean salary of $76,000, and the five women in the science division have a mean salary of $90,000. The 20 men in the humanities division have a mean salary of $75,000, and the 30 men in the science division have a mean salary of
Construct a numerical example of means for a two-way classification under the following conditions: (a) Main effects are present only for the row variable. (b) Main effects are present for each variable, with no interaction. (c) Interaction effects are present. (d) No effects of any type are present.
Table 7.29 (page 212) summarized a study that reported the mean number of dates in the past three months. For men, the mean was 9.7 for the more attractive and 9.9 for the less attractive. For women, the mean was 17.8 for the more attractive and 10.6 for the less attractive. Identify the response
For a two-way classification of means by factors A and B, at each level of B the means are equal for the levels of A. Does this imply that the overall means are equal at the various levels of A, ignoring B? Explain the implications, in terms of how results may differ between two-way ANOVA and
(a) Explain carefully the difference between a probability of Type I error of 0.05 for a single comparison of two means and a multiple comparison error rate of 0.05 for comparing all pairs of means. (b) In multiple comparisons following a one-way ANOVA with equal sample sizes, the margin of error
A study6 compared verbal memory of men and women for abstract words and for concrete words. It found a gender main effect in favor of women. It also reported, “There was no sex × word-type interaction (F = 0.408, P = 0.525), indicating that women were equally advantaged on the two kinds of
A study5 described an experiment that randomly assigned participants to receive $3 to spend on themselves (self-interest), or to receive $3 to donate to a nonprofit charity (imposed charity), or to receive $3 that they could either spend on themselves or donate to charity (choice). After receiving or
Go to the GSS website sda.berkeley.edu/GSS. (a) Analyze the change over time (GSS variable YEAR) in the mean of political ideology (POLVIEWS) by political party identification (PARTYID). Compare strong Republicans to strong Democrats in 1974 and in the latest survey, and summarize the rather dramatic
For y = number of times used public transportation in previous week and x = number of cars in family (which takes value 0, 1, or 2 for the given sample), explain the difference between conducting a test of independence of the variables using the ANOVA F test for comparing three means and using a
Refer to the Students data file (Exercise 1.11 on page 9), with response variable the number of weekly hours engaged in sports and other physical exercise. Using software, conduct an analysis of variance and follow-up estimation, and prepare a report summarizing your analyses and interpretations