Statistical Methods For The Social Sciences 5th Edition Alan Agresti - Solutions
The text website has a data file Houses that lists recent selling prices of 100 homes in Gainesville, Florida. Software reports ȳ = $155,331, s = $101,262, and a five-number summary of minimum = $21,000, Q1 = $91,875, median = $132,600, Q3 = $173,875, and maximum = $587,000. (a) Does the
Table 3.17 shows part of software output for analyzing the murder rates (per 100,000) in the Crime2 data file at the text website (to be analyzed in Chapter 9). The first column refers to the entire data set, and the second column deletes the observation for D.C. For each statistic reported,
Refer to the previous exercise. (a) Sketch a box plot. (b) Based on (a), predict the direction of skew for this distribution. Explain. (c) If the distribution, although skewed, is approximately bell shaped, which value is most plausible for the standard deviation: (i) 100, (ii) 1000, (iii) 7000, (iv)
A recent study of the effect of work hours and commuting time on political participation estimated that for those engaged in paid work in the United States, the time it takes on a typical day to get to work has a mean of 19.8 minutes and standard deviation of 13.6 minutes. What shape do you expect
For all homes in Gainesville, Florida, the annual residential electrical consumption recently had a mean of 10,449 and a standard deviation of 7489 kilowatt-hours (kWh). The maximum usage was 336,240 kWh. (a) What shape do you expect this distribution to have? Why? (b) Do you expect this distribution
According to the U.S. Census Bureau, the U.S. nationwide mean selling price of new homes sold in 2014 was $345,800. Which of the following is the most plausible value for the standard deviation: (i) −15,000, (ii) 1500, (iii) 15,000, (iv) 150,000, (v) 1,500,000? Why?
Grade point averages of graduating seniors at the University of Rochester must fall between 2.0 and 4.0. Consider the possible standard deviation values: −10.0, 0.0, 0.4, 2.0, 6.0. (a) Which is the most realistic value? Why? (b) Which value is impossible? Why?
The first exam in your Statistics course is graded on a scale of 0 to 100, and the mean is 76. Which value is most plausible for the standard deviation: −20, 0, 10, or 50? Why?
For GSS data on “the number of people you know who have committed suicide,” 88.8% of the responses were 0, 8.8% were 1, and the other responses took higher values. The mean equals 0.145, and the standard deviation equals 0.457. (a) What percentage of observations fall within one standard
Excluding the United States, the national mean number of holiday and vacation days in a year for OECD nations (see Exercise 3.6) is approximately bell shaped with a mean of 35 days and standard deviation of 3 days. (a) Use the Empirical Rule to describe the variability. (b) The observation for the
A report indicates that teacher’s total annual pay (including bonuses) in Toronto, Ontario, has a mean of $61,000 and standard deviation of $10,000 (Canadian dollars). Suppose the distribution has approximately a bell shape. (a) Give an interval of values that contains about (i) 68%, (ii) 95%, (iii)
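The first two intervals in part (a) can be sketched in Python as follows. This is only an illustration under the stated bell-shape assumption, using the Empirical Rule multipliers of 1 and 2 standard deviations; the function and variable names are hypothetical, and part (iii) is left out because the listing truncates it.

```python
# Empirical Rule sketch for the Toronto pay exercise (names are illustrative).
mean_pay = 61_000   # mean annual pay in Canadian dollars, from the exercise
sd_pay = 10_000     # standard deviation, from the exercise


def empirical_rule_interval(mean, sd, k):
    """Return the interval mean ± k standard deviations."""
    return (mean - k * sd, mean + k * sd)


# About 68% of a bell-shaped distribution falls within 1 sd, about 95% within 2 sd.
intervals = {
    "68%": empirical_rule_interval(mean_pay, sd_pay, 1),
    "95%": empirical_rule_interval(mean_pay, sd_pay, 2),
}

for label, (low, high) in intervals.items():
    print(f"about {label}: ${low:,} to ${high:,}")
```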
The Human Development Report 2014, published by the UN, showed life expectancies by country. For Western Europe, the values reported were Denmark 79, Portugal 80, Netherlands 81, Finland 81, Greece 81, Ireland 81, United Kingdom 81, Belgium 81, France 82, Germany 81, Norway 82, Italy 82, Spain 82,
The Human Development Index (HDI) is an index the United Nations uses to give a summary rating for each nation based on life expectancy at birth, educational attainment, and income. In 2014, the 10 nations (in order) with the highest HDI rating, followed by the percentage of seats in their parliament
National Geographic Traveler magazine recently presented data on the annual number of vacation days averaged by residents of eight different countries. They reported 42 days for Italy, 37 for France, 35 for Germany, 34 for Brazil, 28 for Britain, 26 for Canada, 25 for Japan, and 13 for the United
As of May 2015, an article in en.wikipedia.org on “Minimum wage” reported (in U.S. dollars) the minimum wage per hour for five nations: $15.61 in Australia, $12.52 in France, $9.85 in Canada, $7.25 in the United States, and $0.62 in Mexico. Find the mean, range, and standard deviation (a)
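As a rough sketch, the three summary statistics for all five nations can be computed with Python's standard `statistics` module. The dictionary name is illustrative, and the sample standard deviation (divisor n − 1) is an assumption here, since the listing truncates the exercise before its parts.

```python
import statistics

# Minimum wage per hour in U.S. dollars, as listed in the exercise.
min_wage = {
    "Australia": 15.61,
    "France": 12.52,
    "Canada": 9.85,
    "United States": 7.25,
    "Mexico": 0.62,
}

values = list(min_wage.values())
mean_wage = statistics.mean(values)
wage_range = max(values) - min(values)
sd_wage = statistics.stdev(values)  # sample standard deviation (n - 1 divisor), an assumption

print(f"mean = {mean_wage:.2f}, range = {wage_range:.2f}, sd = {sd_wage:.2f}")
```

Dropping one entry from `min_wage` and rerunning shows how sensitive each statistic is to a single extreme value.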
The General Social Survey has asked, “During the past 12 months, how many people have you known personally that were victims of homicide?” Table 3.16 shows software output from analyzing responses. (a) Is the distribution bell shaped, skewed to the right, or skewed to the left? (b) Does the
According to the U.S. Bureau of the Census, in 2013 in the United States the median family income was $72,624 for white families, $41,505 for black families, and $42,269 for Hispanic families. (a) Identify the response variable and the explanatory variable for this analysis. (b) Is enough information
According to the U.S. Bureau of the Census, the 2013 median personal earnings in the past 12 months were $22,063 for females and $35,228 for males, whereas the mean was $31,968 for females and $50,779 for males. (a) Does this suggest that the distribution of income for each gender is symmetric, or
The 2014 GSS asked respondents how many days a week they read a newspaper. The possible responses were (every day, a few times a week, once a week, less than once a week, never), and the counts in those categories were (417, 260, 246, 271, 481), for percentages (24.9, 15.5, 14.7, 16.2, 28.7). (a)
Table 3.15 summarizes responses of 2223 subjects in the 2014 GSS to the question “About how often did you have sex during the last 12 months?” (a) Report the median and the mode. Interpret. (b) Treat this scale in a quantitative manner by assigning the scores 0, 0.1, 1.0, 2.5, 4.3, 10.8, and 17
According to Statistics Canada, for the Canadian population having income in 2010, the median was $29,878 and the mean was $40,650. What would you predict about the shape of the distribution? Why?
Table 3.14 shows 2012 female economic activity (FEA) for countries in Eastern Europe. Construct plots and find summary statistics to compare these values with those from the Middle East in Table 3.4. Interpret.
Access the GSS at sda.berkeley.edu/GSS. Entering TVHOURS for the variable and year(2014) in the selection filter, you obtain data on hours per day of TV watching in the United States in 2014. (a) Construct the relative frequency distribution for the values 0, 1, 2, 3, 4, 5, 6, 7 or more. (b) How
A researcher in an alcoholism treatment center, to study the length of stay in the center for first-time patients, randomly selects 10 records of individuals institutionalized within the previous two years. The lengths of stay, in days, were 11, 6, 20, 9, 13, 4, 39, 13, 44, and 7. For a similar
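A short Python sketch of summary statistics for the ten lengths of stay is given below. Because the listing cuts the exercise off, which statistics it actually asks for is unknown; the mean, median, and sample standard deviation are shown only as an illustration, and the variable names are hypothetical.

```python
import statistics

# Lengths of stay in days for the 10 sampled records, as listed in the exercise.
stays = [11, 6, 20, 9, 13, 4, 39, 13, 44, 7]

mean_stay = statistics.mean(stays)
median_stay = statistics.median(stays)
sd_stay = statistics.stdev(stays)  # sample standard deviation (n - 1 divisor)

print(f"mean = {mean_stay}, median = {median_stay}, sd = {sd_stay:.1f}")
```

The two long stays (39 and 44 days) pull the mean above the median, which is the usual signature of a right-skewed sample.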
A Roper organization survey asked, “How far have environmental protection laws and regulations gone?” For the possible responses (not far enough, about right, too far), the percentages of responses were 51%, 33%, and 16%. (a) Which response is the mode? (b) Can you compute a mean or a median for
Global warming seems largely a result of human activity that produces carbon dioxide emissions and other greenhouse gases. From data.worldbank.org, emissions (per capita) in 2010–2014 for the eight largest countries in population size were in metric tons (1000 kilograms) per person: Bangladesh 0.4,
Refer to the prison values in the previous exercise. (a) Find the mean and the median. (b) Based on a histogram or box plot for these data, why would you expect the mean to be larger than the median? (c) Identify an outlier. Investigate how it affects the mean and the median by recalculating them
The OECD (Organization for Economic Cooperation and Development) consists of advanced, industrialized countries that accept the principles of representative democracy and a free market economy. Table 3.13 shows part of the OECD data file at the text website that has data on several variables for
Create a data file with your software for the Crime data file from the text website. Use the variable murder, which is the murder rate (per 100,000 population). Using software, (a) Construct a relative frequency distribution. (b) Construct a histogram. How would you describe the shape of the
According to the 2015 American Community Survey, in 2012 the United States had 30.1 million households with one person, 37.1 million with two persons, 17.8 million with three persons, 15.0 million with four persons, and 10.4 million with five or more persons. (a) Construct a relative frequency
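A relative frequency distribution divides each category count by the total. The following Python sketch applies that to the household counts above; the dictionary keys and variable names are illustrative choices, not part of the exercise.

```python
# Households in millions by household size, as listed in the exercise.
households = {
    "1 person": 30.1,
    "2 persons": 37.1,
    "3 persons": 17.8,
    "4 persons": 15.0,
    "5 or more": 10.4,
}

total = sum(households.values())

# Relative frequency = category count / total count.
relative_freq = {size: count / total for size, count in households.items()}

for size, proportion in relative_freq.items():
    print(f"{size:<10} {proportion:.3f}")
```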
A teacher shows her class the scores on the midterm exam in the stem-and-leaf plot:
6 | 5 8 8
7 | 0 1 1 3 6 7 7 9
8 | 1 2 2 3 3 3 4 6 7 7 7 8 9
9 | 0 1 1 2 3 4 4 5 8
(a) Identify the number of students and the minimum and maximum scores. (b) Sketch a corresponding histogram with four intervals.
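One way to check a reading of the stem-and-leaf plot is to expand it back into individual scores. The Python sketch below does that and tallies the four histogram intervals; it assumes the stems are tens digits and the leaves are units digits, and the interval choice (60s, 70s, 80s, 90s) is one natural option rather than the only one.

```python
# Stem-and-leaf plot from the exercise: stem = tens digit, leaves = units digits.
stem_leaf = {
    6: [5, 8, 8],
    7: [0, 1, 1, 3, 6, 7, 7, 9],
    8: [1, 2, 2, 3, 3, 3, 4, 6, 7, 7, 7, 8, 9],
    9: [0, 1, 1, 2, 3, 4, 4, 5, 8],
}

# Expand each (stem, leaf) pair into a score.
scores = [10 * stem + leaf for stem, leaves in stem_leaf.items() for leaf in leaves]

# Tally scores into intervals 60-69, 70-79, 80-89, 90-99, keyed by lower bound.
histogram = {}
for score in scores:
    lower = (score // 10) * 10
    histogram[lower] = histogram.get(lower, 0) + 1

print(len(scores), min(scores), max(scores))
for lower, count in sorted(histogram.items()):
    print(f"{lower}-{lower + 9}: {'*' * count}")
```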
According to the 2013–2014 edition of The World Factbook, the number of followers of the world’s four largest religions was 2.2 billion for Christianity, 1.6 billion for Islam, 1.0 billion for Hinduism, and 0.5 billion for Buddhism. (a) Construct a relative frequency distribution. (b) Sketch a
Table 3.12 shows the number (in millions) of the foreign-born population of the United States, by place of birth. (a) Construct a relative frequency distribution. (b) Sketch the data in a bar graph. (c) Is “place of birth” quantitative, or categorical? (d) Use whichever of the following measures is
A logistic regression model describes how the probability of voting for the Republican candidate in a U.S. presidential election depends on x = voter’s total family income (in thousands of dollars) in the previous year. The sample prediction equation is (a) Identify β̂ and interpret its sign. (b)
A topic not covered in this book is meta-analysis, which refers to quantitative summaries of the relevant research studies on a particular topic. With an Internet search, find a published meta-analysis about a topic in the social sciences. Describe the purpose of the meta-analysis, the statistical
Using your software, attempt to replicate the Bayesian analysis shown in Table 16.9. Perform Bayesian statistical inference for the SES effect.
Write a 100-word summary of the difference between the frequentist and Bayesian approaches to statistical inference.
A variable is measured at three times, y1 at time 1, y2 at time 2, and y3 at time 3. Suppose the chain relationship holds, with y1 affecting y2, which in turn affects y3. Does this sequence of observations satisfy Markov dependence? Explain.
What is wrong with this statement: “For a Markov chain model, yt is independent of yt−2”?
Construct a diagram representing a covariance structure model for the following: A religiosity factor is based on two indicators from the GSS about frequency of church attendance and frequency of praying. An education factor is based on two indicators from the GSS about educational attainment and
Construct a diagram representing a covariance structure model for the following: In the measurement model, a single factor represents violent crime rate and murder rate and a single factor represents percentage of high school graduates, percentage in poverty, and percentage of single-parent
Construct a diagram representing the following covariance structure model, for variables measured for each state. The latent response variable is based on two observed indicators, violent crime rate and murder rate. The two explanatory variables for that latent variable are the observed values of
Refer to the previous exercise. The authors also formulated a “daily grind model” as a structural equations model. Describe this model, in terms of the study’s latent and observed variables.
A recent study analyzed the effect of work hours and commuting time on political participation. Read the Data and Method section of the article at http://apr.sagepub.com/content/42/1/141 and describe how the authors used factor analysis to construct a response variable measuring political
Refer to Example 11.1 (page 308) on data for the 67 counties in Florida on y = crime rate, x1 = percentage of high school graduates, and x2 = percentage living in an urban environment. Consider the spurious causal model for the association between crime rate and percentage of high school graduates,
The Crime2 data file at the text website has data on murder rate, percentage urban, percentage of high school graduates, and percentage in poverty. Construct a realistic path diagram for these variables. By fitting the appropriate models for these data (deleting the observation for D.C.), estimate
UN data are available for most nations on B = birth rate, G = per capita gross domestic product, L = percentage literate, T = percentage of homes having a television, and C = percentage using contraception. Draw a path diagram relating these variables. Specify the regression models you would need to
Let I = annual income, E = attained educational level, J = number of years of experience in job, M = motivation, A = age, G = gender, and P = parents’ attained educational level. Construct a path diagram showing your opinion about the likely relationships among those variables. Specify the
In studying the effect of race on job dismissals in the federal bureaucracy, a study used event history analysis to model the hazard rate regarding termination of employment. In modeling involuntary terminations using a sample of size 2141, they reported P < 0.001 in significance tests for the
A study of recidivism takes a sample of records of people who were released from prison in 2010. The response variable, measured when records are reviewed in 2017, is the number of months until the person was rearrested. In the context of this study, explain what is meant by a censored observation.
For Table 16.5 on page 505, interpret the estimated effect of gender on the hazard rate. Test the effect of race, and interpret.
A recent study used multilevel models to analyze life-course changes in contact between parents and their adult children. You can access the article at http://roa.sagepub.com/content/36/5/568. Prepare a one-page summary of the multilevel model formulated in their Data and Methods section. In your
Using software, replicate the results for the multilevel analysis of the smoking prevention study in Example 16.2, and interpret.
Explain the purpose of using a multilevel model. Illustrate with an example.
For Example 16.1 on mental impairment, give an example of an amended data file with missing data that would suggest that those data were not completely missing at random.
For Example 16.1 on mental impairment, take the complete Mental data file and randomly select 10 observations for which you act as if the life events values are actually missing. Fit the multiple regression model to the 30 observations with no missing data, and then use multiple imputation to fit
In Example 13.11 (page 410) on quality of life with treatments for alcohol dependence, suppose that subjects who drop out of the study become, over time, less financially satisfied. (a) Explain why the missing at random assumption would be violated. (b) Explain why the time effect may be
15.40.* Logistic regression has infinite maximum likelihood estimates when the cases with y = 1 are separate from the cases with y = 0 in the space of explanatory variable values. When this happens, most software merely reports large estimates with huge standard errors. Check what your software does
15.39.* For a two-way contingency table, let ri denote the ith row total, let cj denote the jth column total, and let n denote the total sample size. Section 8.2 (page 218) stated that the cell in row i and column j has fe = (ri × cj)/n for the independence model. Show that the log of the expected
15.38.* For the logistic regression model, from the linear approximation β/4 for the rate of change in the probability at the x-value for which P(y = 1) = 0.50, show that 1/|β| is the approximate distance between the x-values at which P(y = 1) = 1/4 (or P(y = 1) = 3/4) and at which P(y = 1) =
State the symbols for the loglinear models for categorical variables that are implied by the causal diagrams in Figure 15.5.
For a person, let y = 1 represent death during the next year and y = 0 represent survival. For adults in the United Kingdom and in the United States, the probability of death is well approximated by the model, logit[P(y = 1)] = −10.5 + 0.1x, where x = age in years. Show how the probability of
For Table 15.4 (page 467), show that the association between the defendant’s race and the death penalty verdict satisfies Simpson’s paradox. What causes this?
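To see how the model logit[P(y = 1)] = −10.5 + 0.1x behaves, the logit can be inverted to give a probability at any age. The Python sketch below is only an illustration; the function names and the ages printed are hypothetical choices, since the listing truncates what the exercise asks.

```python
import math


def death_probability(age):
    """Invert logit[P(y = 1)] = -10.5 + 0.1 * age to get P(y = 1)."""
    logit = -10.5 + 0.1 * age
    return 1.0 / (1.0 + math.exp(-logit))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


for age in (20, 40, 60, 80, 100):
    print(age, round(death_probability(age), 4))

# Each additional year of age multiplies the odds of death by exp(0.1).
yearly_odds_ratio = odds(death_probability(51)) / odds(death_probability(50))
print(round(yearly_odds_ratio, 3), round(math.exp(0.1), 3))
```

Under this model the probability reaches 0.50 where the logit equals zero, that is, at x = 10.5/0.1 = 105 years.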
Analyze the data in Exercise 8.16 (page 241) on happiness and marital status using a cumulative logit model. Interpret the results in a report of about 200 words.
A report (www.oas.samhsa.gov) by the Office of Applied Studies for the Substance Abuse and Mental Health Services Administration about factors that predict marijuana use stated, “Multiple logistic regression also confirmed that the risk of recent marijuana initiation increased with increasing age
A study compared the relative frequency of mental health problems of various types among U.S. Army members before deployment to Iraq, U.S. Army members after serving in Iraq, U.S. Army members after serving in Afghanistan, and U.S. Marines after serving in Iraq. The study stated, “Potential
Explain how to interpret the results in this table. The study abstract stated, “Men are 10 times more likely to hunt wildlife than females.” Comment on how this conclusion was reached, and whether it is correct. Which explanatory variables other than sex seem as if they are important?
A Canadian survey of factors associated with whether a person is a hunter of wildlife showed the results in Table
In a study of whether an educational program makes sexually active adolescents more likely to obtain condoms, adolescents were randomly assigned to two experimental groups. The educational program, involving a lecture and videotape about transmission of the HIV virus, was provided to one group but
One year, the Metropolitan Police in London, England, reported 30,475 people as missing in the year ending March 1993. For those of age 13 or less, 33 of 3271 missing males and 38 of 2486 missing females were still missing a year later. For ages 14–18, the values were 63 of 7256 males and 108
The data shown in Exercise 10.14 in Chapter 10 came from an early study on the death penalty and racial characteristics. Analyze those data using methods of this chapter. Summarize your main findings in a way that you could present to the general public, using as little technical jargon as possible.
In a one-page report, analyze Table 15.7 by treating party affiliation as the response variable and political ideology as a quantitative explanatory variable. Fit an appropriate model, conduct statistical inference, and interpret results. Attach annotated software output to your report.
Refer to the Students data file (Exercise 1.11). Using software, conduct and interpret a logistic regression analysis using y = opinion about abortion with explanatory variables (a) Political ideology. (b) Sex and political ideology.
Refer to the survey data for high school seniors in Table 15.12 and the goodness-of-fit statistics reported in Table 15.18 (page 487). Use these results to illustrate (a) when a model fits well and when a model fits poorly, (b) how G² decreases as the model becomes more complex.
For Table 15.3 on the death penalty, the logistic model that has an effect of victims’ race but assumes that the death penalty is independent of defendant’s race (given victims’ race) has a Pearson goodness-of-fit statistic equal to 5.81 with df = 2 (P-value 0.055). Specify H0 for this test,
For a four-way cross-classification of variables w, x, y, and z, state the symbol for the loglinear model in which (a) All pairs of variables are independent. (b) x and y are associated, but other pairs of variables are independent. (c) All pairs of variables are associated, but the conditional
Refer to the loglinear model analyses reported in Examples 15.7 and 15.8 for use of marijuana, alcohol, and cigarettes. Use software to replicate all the analyses shown there.
Consider the fit of the loglinear model (AC, AM, CM) to Table 15.12 for the survey of high school seniors. (a) Use the estimated expected frequencies in Table 15.14 to estimate the conditional odds ratios between A and M at each level of C. (b) Show how to obtain the estimated odds ratio in (a) from the
Using software, replicate the results in Example 15.6 (page 478) on belief in an afterlife, sex, and race.
Refer to the 3×7 table in Table 12.1 (page 352) on party identification and political ideology. (a) Fit a baseline-category logit model, treating party affiliation as the response and political ideology as a quantitative explanatory variable. Interpret the political ideology effect for the choice
For a sample of people in Ithaca, New York, for the most recent time each person shopped for clothes, you plan to model the choice to shop downtown (the Ithaca Common), at the Pyramid/Triphammer mall, or on the Internet. Explanatory variables include annual income, whether a student, and distance
A baseline-category logit model fit predicting preference for U.S. President (Democrat, Republican, Independent) using x = annual income (in $10,000) is log(π̂D/π̂I) = 3.3 − 0.2x and log(π̂R/π̂I) = 1.0 + 0.3x. (a) For each equation, interpret the sign of the estimated effect of x. (b) Find
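The two baseline-category equations determine all three estimated probabilities, because the probabilities must sum to 1. The Python sketch below shows that conversion, with Independent as the baseline category as in the equations above. The function name and the income values printed are hypothetical.

```python
import math


def preference_probabilities(income):
    """Estimated probabilities at a given income (in units of $10,000).

    Uses log(pi_D / pi_I) = 3.3 - 0.2x and log(pi_R / pi_I) = 1.0 + 0.3x.
    """
    ratio_d = math.exp(3.3 - 0.2 * income)  # pi_D / pi_I
    ratio_r = math.exp(1.0 + 0.3 * income)  # pi_R / pi_I
    denom = 1.0 + ratio_d + ratio_r         # the baseline category contributes 1
    return {
        "Democrat": ratio_d / denom,
        "Republican": ratio_r / denom,
        "Independent": 1.0 / denom,
    }


for income in (0, 5, 10):
    probs = preference_probabilities(income)
    print(income, {party: round(p, 3) for party, p in probs.items()})
```

The negative coefficient in the first equation and the positive one in the second show up as a falling Democrat probability and a rising Republican probability as income increases.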
Explain why the cumulative logit model is not valid with a nominal response variable, but a baseline-category logit model is valid with an ordinal response variable.
Table 15.25 refers to passengers in autos and light trucks involved in accidents in the state of Maine. The table, available as the Accidents data file at the text website, classifies subjects by sex, location of accident, seat belt use, and a response variable having categories (1) not injured, (2)
Using software with Table 8.16, replicate the results shown in the previous exercise for the cumulative logit model. Indicate whether the sign for β̂ agrees with the negative sign for β̂ in Table 15.24, according to how your software parameterizes the model.
Consider Table 8.16 on page 233, treating happiness as the response variable. Table 15.24 shows results of fitting the cumulative logit model logit[P(y ≤ j)] = αj + βx, using scores (1, 2, 3) for income, and the chi-squared test of independence. (a) Why does the table report two intercept
Table 15.23 refers to individuals who applied for admission into graduate school at the University of California in Berkeley. Data are presented for five of the six largest graduate departments at the university. The variables are A: Whether admitted (yes, no). S: Sex of applicant (male,
A sample of inmates being admitted to the Rhode Island Department of Corrections were asked whether they ever injected drugs and were tested for hepatitis C virus (HCV). The numbers who reported injecting drugs were 306 of the 887 men who tested HCV positive, 61 of the 3044 men who tested HCV
For Table 15.12 on page 481, Table 15.22 shows output for a logistic model treating marijuana use as the response variable and alcohol use and cigarette use as explanatory variables. (a) Set up dummy variables and report the prediction equation. Interpret the signs of the effects of alcohol use and
Table 15.21 summarizes logistic regression results from a study of how family transitions relate to first home purchase by young married households. The response variable is whether the subject owns a home (1 = yes, 0 = no). Explanatory variables include a categorical variable for marital status
Let P(y = 1) denote the probability that a randomly selected respondent supports current laws legalizing abortion, estimated using sex of respondent (s = 0, male; s = 1, female), religious affiliation (r1 = 1, Protestant, 0 otherwise; r2 = 1, Catholic, 0 otherwise; r1 = r2 = 0, Jewish), and
A multination study of whether a country transitioned from autocracy to democracy during the study period reported the prediction equation logit[P̂(y = 1)] = −3.30 + 0.55t + 1.12(OECD) − 1.16m − 0.01f − 0.07g, where y = 1 if the nation made that transition, t = number of past transitions,
Table 12.1 in Chapter 12 reported GSS data on political ideology (scaled 1 to 7, with 1 being most liberal) by party affiliation:
Ideology      1   2   3   4   5   6   7
Democrat      5  18  19  25   7   7   2
Republican    1   3   1  11  10  11   1
Use logistic regression to describe the effect of political ideology on the probability of being a
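The exercise expects statistical software, but the fit itself can be sketched in plain Python. The code below is a minimal illustration, assuming the response is "Democrat" (the listing truncates the exercise before naming it) and using Newton-Raphson on the grouped binomial log-likelihood for a model with an intercept and one slope. The function name is hypothetical, and this is not the textbook's own procedure.

```python
import math

# Counts by political ideology score (1 = most liberal), from the exercise.
ideology = [1, 2, 3, 4, 5, 6, 7]
democrat = [5, 18, 19, 25, 7, 7, 2]
republican = [1, 3, 1, 11, 10, 11, 1]


def fit_grouped_logistic(x, successes, failures, iterations=25):
    """Fit logit[P(success)] = a + b*x by Newton-Raphson on grouped counts."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, s, f in zip(x, successes, failures):
            n = s + f
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            resid = s - n * p          # observed minus fitted successes
            w = n * p * (1.0 - p)      # binomial variance weight
            g0 += resid
            g1 += resid * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 information system for the update.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


intercept, slope = fit_grouped_logistic(ideology, democrat, republican)
print(f"logit[P(Democrat)] = {intercept:.3f} + {slope:.3f} * ideology")
```

A negative slope would indicate that the estimated probability of being a Democrat falls as ideology moves toward the conservative end of the scale.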
For first-degree murder convictions in East Baton Rouge Parish, Louisiana, between 1990 and 2008, the death penalty was given in 3 out of 25 cases in which a white killed a white, in 0 out of 3 cases in which a white killed a black, in 9 out of 30 cases in which a black killed a white, and in 11
Table 15.19, the data file Credit at the text website, shows data for a sample of 100 adults randomly selected for an Italian study on the relation between annual income and having a travel credit card, such as American Express or Diners Club. At each level of annual income (in thousands of euros),
A sample of 54 elderly men take a psychiatric examination to determine whether symptoms of senility are present. A subtest of the Wechsler Adult Intelligence Scale (WAIS) is the explanatory variable. The WAIS scores range from 4 to 20, with a mean of 11.6. Higher values indicate more effective
Refer to the previous exercise. When the explanatory variables are x1 = family income, x2 = number of years of education, and s = sex (1 = male, 0 = female), the prediction equation is logit[P̂(y = 1)] = −2.40 + 0.02x1 + 0.08x2 + 0.20s. For this sample, x1 ranges from 6 to 157 with a standard
Forward selection is used with 10 potential explanatory variables for y. In reality, none are truly correlated with y or with each other. For a random sample, show that the probability equals 0.40 that at least one is entered into the regression model when the criterion for admission is a P-value
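The stated probability of 0.40 can be checked with a one-line calculation. The sketch below assumes an entry cutoff of 0.05, which is a guess because the listing truncates the exercise at "P-value", and it treats the 10 significance tests as independent since the variables are uncorrelated with each other.

```python
# Probability that at least one of 10 unrelated variables passes the entry test.
n_candidates = 10
alpha = 0.05  # ASSUMED cutoff; the source text is truncated before giving it

# Each test wrongly rejects with probability alpha, so all 10 fail to enter
# with probability (1 - alpha) ** 10 when the tests are independent.
p_at_least_one = 1 - (1 - alpha) ** n_candidates

print(round(p_at_least_one, 2))
```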
Show that using a cross-product term to model interaction assumes that the slope of the relationship between y and x1 changes linearly as x2 changes. How would you suggest modeling interaction if, instead, the slope of the linear relationship between y and x1 first increases as x2 changes from low
Select the best response for each of the following terms (not every response is used): Heteroscedasticity, Multicollinearity, Forward selection, Interaction, Exponential model, Stepwise regression, Studentized residual, Generalized linear model. (a) The mean of y multiplies by β for each unit increase in