Statistical Models Theory And Practice 2nd Edition David A. Freedman - Solutions
This continues question 18; different notation is used: part (b) might be a little tricky. Garrett’s model includes a dummy variable for each of the 14 countries. The growth rate for country i in year t is modeled as $\alpha_i + Z_{it}\gamma + \varepsilon_{it}$, where $Z_{it}$ is a 1×10 vector of explanatory variables,
Yule used a regression model to conclude that outrelief causes pauperism (section 1.4). He presented his paper at a meeting of the Royal Statistical Society on 21 March 1899. Sir Robert Giffen, Knight Commander of the Order of the Bath, was in the chair. There was a lively discussion, summarized in
There is a statistical model with a parameter θ. You need to estimate θ. Which is a better description of the bootstrap? Explain briefly. (i) The bootstrap will help you find an estimator for θ. (ii) Given an estimator $\hat\theta$ for θ, the bootstrap will help you find the bias and SE of $\hat\theta$.
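Description (ii) can be made concrete with a minimal sketch. It assumes, purely for illustration, an IID sample and uses the divisor-n variance as the estimator; the data, the seed, and the function names are hypothetical and do not come from the textbook.

```python
import random
import statistics

def plug_in_variance(xs):
    """The estimator theta-hat: sample variance with divisor n."""
    return statistics.pvariance(xs)

def bootstrap_bias_and_se(xs, estimator, n_boot=2000, seed=0):
    """Resample xs with replacement; return (estimated bias, estimated SE)."""
    rng = random.Random(seed)
    theta_hat = estimator(xs)
    reps = []
    for _ in range(n_boot):
        resample = rng.choices(xs, k=len(xs))   # same size as the data
        reps.append(estimator(resample))
    bias = statistics.fmean(reps) - theta_hat   # mean of replicates minus theta-hat
    se = statistics.stdev(reps)                 # spread of the replicates
    return bias, se

data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(30)]  # hypothetical data
bias, se = bootstrap_bias_and_se(sample, plug_in_variance)
print(f"theta-hat = {plug_in_variance(sample):.3f}, bias ~ {bias:.3f}, SE ~ {se:.3f}")
```

The estimator is supplied by the user; the resampling loop only assesses it.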
Which terms in equation (6) are observable, and which are unobservable? Which are parameters?
Does the model reflect the idea that energy consumption in 1975 might have been different from what it was? If so, how?
In table 1, at the end of column A, you will find the number 0.281. How is this number related to equation (6)?
To what extent are the one-step GLS estimates biased in this application? Which numbers in the table prove your point? How?
Are plug-in SEs biased in this application? Which numbers in the table prove your point? How?
Are bootstrap standard errors biased in this application? Which numbers in the table prove your point? How?
Paula has observed values on four independent random variables with common density $f_{\alpha,\beta}(x) = c(\alpha,\beta)(\alpha x - \beta)^2 \exp[-(\alpha x - \beta)^2]$, where α > 0, −∞
(Hard.) In example 3, if 1 ≤ i < n, show that E(i|X) = i .
In equation (1a), should a1 be positive or negative? What about a2, a3?
In equation (1b), should b1 be positive or negative? What about b2, b3?
In the butter model of this section: (a) Does the law of supply and demand hold true? (b) Is the supply curve concave? strictly concave? (c) Is the demand curve convex? strictly convex? (Economists prefer log linear specifications....)
An economist wants to use the butter model to determine how farmers will respond to price controls. Which of the following equations is the most relevant: (2a), (2b), (3a), (3b)? Explain briefly.
By assumptions (i)-(ii), $Z'X$ is q×p of rank p, and $Z'Z$ is q×q of rank q. Show that: (a) $Z'Z$ is positive definite and invertible; the inverse has a square root. (b) $X'Z(Z'Z)^{-1}Z'X$ is positive definite, hence invertible. Hint. Suppose c is p×1. Can $c'X'Z(Z'Z)^{-1}Z'Xc \le 0$? Note. Without assumptions
Let $U_i$ be IID random variables. Let $\bar U = \frac{1}{n}\sum_{i=1}^n U_i$. True or false, and explain: (a) $E(U_i)$ is the same for all i. (b) $\mathrm{var}(U_i)$ is the same for all i. (c) $E(U_i) = \bar U$. (d) $\mathrm{var}(U_i) = \frac{1}{n}\sum_{i=1}^n (U_i - \bar U)^2$. (e) $\mathrm{var}(U_i) = \frac{1}{n-1}\sum_{i=1}^n (U_i - \bar U)^2$.
An economist is specifying a model for the butter market in Illinois. She likes the model that we used for Wisconsin. She is willing to assume that the determinants of supply (wage rates and hay prices) are exogenous; also that the determinants of demand (prices of bread and olive oil) are
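Let $e = Y - X\hat\beta_{\mathrm{IVLS}}$ be the residuals from IVLS. True or false, and explain: (a) $\sum_i e_i = 0$. (b) $e \perp X$. (c) $\|Y\|^2 = \|X\hat\beta_{\mathrm{IVLS}}\|^2 + \|e\|^2$. (d) $\hat\sigma^2 = \|e\|^2/(n - p)$.
Which is smaller, $\|Y - X\hat\beta_{\mathrm{IVLS}}\|^2$ or $\|Y - X\hat\beta_{\mathrm{OLS}}\|^2$? Discuss briefly.
Is $\hat\beta_{\mathrm{IVLS}}$ biased or unbiased? What about $\hat\sigma^2 = \|Y - X\hat\beta_{\mathrm{IVLS}}\|^2/(n - p)$ as an estimator for $\sigma^2$?
(Hard.) Verify that $\hat\beta_{\mathrm{IVLS}} = (Z'X)^{-1}Z'Y$ in the just-identified case (q = p). In particular, OLS is a special case of IVLS, with Z = X.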
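A numerical check can make the just-identified identity plausible before it is proved. The sketch below takes p = q = 1 and assumes the general IVLS estimator has the usual form $[X'Z(Z'Z)^{-1}Z'X]^{-1}X'Z(Z'Z)^{-1}Z'Y$, which this page does not state explicitly. The simulated data and all names are hypothetical.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def ivls_general(x, z, y):
    """Scalar case of [X'Z (Z'Z)^-1 Z'X]^-1 X'Z (Z'Z)^-1 Z'Y (assumed form)."""
    xz, zz, zy = dot(x, z), dot(z, z), dot(z, y)
    return (xz * (1.0 / zz) * xz) ** -1 * (xz * (1.0 / zz) * zy)

def ivls_just_identified(x, z, y):
    """Scalar case of (Z'X)^-1 Z'Y."""
    return dot(z, y) / dot(z, x)

rng = random.Random(0)
n, beta = 500, 2.0
z = [rng.gauss(0, 1) for _ in range(n)]          # instrument
u = [rng.gauss(0, 1) for _ in range(n)]          # shock shared by x and the error
x = [zi + ui for zi, ui in zip(z, u)]            # endogenous regressor
y = [beta * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

b_general = ivls_general(x, z, y)
b_simple = ivls_just_identified(x, z, y)
b_ols = ivls_just_identified(x, x, y)            # setting Z = X gives OLS
print(b_general, b_simple, b_ols)
```

The two IVLS values agree up to rounding, while the OLS value sits above them because x is correlated with the error in this simulated setup.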
(Hard.) Pretend $Z'X$ is constant. To motivate definition (13), show that $\mathrm{cov}(\hat\beta_{\mathrm{IVLS}} \mid Z) = \sigma^2 [X'Z(Z'Z)^{-1}Z'X]^{-1}$.
Using the data in table 2 of Schneider et al, estimate the probability that a respondent with the following characteristics will be a PTA member: (i) active chooser, (ii) lives in district 1, (iii) dissatisfied, (iv) child attends a school which has 300 students, (v) black, (vi) lived in district 1
Repeat, for a respondent who is not an active chooser but has otherwise the same characteristics as the respondent in exercise 1.
What is the difference between the numbers for the two respondents in exercises 1 and 2? How do Schneider et al interpret the difference?
Given the model, the numbers you have computed for the two respondents in exercises 1 and 2 are best interpreted as ______. Options: probabilities; estimated probabilities; estimated expected probabilities.
What is it in the data that makes the coefficient of school size so close to 0? (For instance, would −0.3 be feasible?)
Do equations (1) and (2) in the paper state the model?
(a) Does table 1 in Schneider et al show the sample is representative or unrepresentative? (b) What percentage of the sample had incomes below $20,000? (c) Why isn’t there an income variable in table 2? table B1? (d) To what extent have Schneider et al stated the model? the statistical
A chance for bonus points. Three investigators are studying the following model: $Y_i = X_i\beta + \varepsilon_i$ for i = 1,...,n. The random variables are all scalar, as is the unknown parameter β. The unobservable $\varepsilon_i$ are IID with mean 0 and finite variance, but X is endogenous. Fortunately, the investigators also
Another chance for bonus points. Suppose that $(X_i, Y_i, Z_i, \varepsilon_i)$ are independent four-tuples of scalar random variables for i = 1,...,n, with a common jointly normal distribution. All means are 0 and n is large. Suppose further that $Y_i = X_i\beta + \varepsilon_i$. The variables $X_i$, $Y_i$, $Z_i$ are observable, and every
Last chance for bonus points. In the over-identified case, we could estimate $\sigma^2$ by fitting (6) to the data, and dividing the sum of the squared residuals by q − p. What’s wrong with this idea?
An advertisement for a cancer treatment center starts with the headline “Celebrating Life with Cancer Survivors.” The text continues, “Did you know that there are more cancer survivors now than ever before? .... This means that life after a cancer diagnosis can be a reality.... we’re proud to
CT (computerized tomography) scans can detect lung cancer very early, while the disease is still localized and treatable by a surgeon—although the efficacy of treatment is unclear. Henschke et al (2006) found 484 lung cancers in a large-scale screening program, and estimated the 5-year survival
Pisano et al (2005) studied the “diagnostic performance of digital versus film mammography for breast cancer screening.” About 40,000 women participated in the trial; each subject was screened by both methods. “[The trial] did not measure mortality endpoints. The assumption inherent in the
Headlined “False Conviction Study Points to the Unreliability of Evidence,” the New York Times ran a story about the study, which “examined 200 cases in which innocent people served an average of 12 years in prison. A few types of unreliable trial evidence predictably supported wrongful
The New York Times ran a story headlined “Study Shows Marathons Aren’t Likely To Kill You,” claiming that the risk of dying on a marathon is twice as high if you drive it than if you run it. The underlying study (Redelmeier and Greenwald 2007) estimated risks for running marathons and for
Prostate cancer is the most common cancer among American men, with 200,000 new cases diagnosed each year. Patients will usually consult a urological surgeon, who recommends one of three treatment plans: surgical removal of the prostate, radiation that destroys the prostate, or watchful waiting (do
In 2004, as part of a program to monitor its first presidential election, 25 villages were selected at random in a certain area of Indonesia. In total, there were 25,000 registered voters in the sample villages, of whom 13,000 voted for Megawati: 13,000/25,000 = 0.52. True or false, and explain:
(Partly hypothetical.) Psychologists think that older people are happier, as are married people; moreover, happiness increases with income. To test the theory, a psychologist collects data on a sample of 1500 people, and fits a regression model: $\mathrm{Happiness}_i = a + bU_i + cV_i + dW_i + \varepsilon_i$, with the usual
Yule ran a regression of changes in pauperism on changes in the out-relief ratio, with changes in population and changes in the population aged 65+ as control variables. He used data from three censuses and four strata of unions, the small geographical areas that administered poor-law relief. He made
King, Keohane and Verba (1994) discuss the use of multiple regression to estimate causal effects in the social sciences. According to them, “Random error in an explanatory variable produces bias in the estimate of the relationship between the explanatory and the dependent variable. That bias takes
Ansolabehere and Konisky (2006) want to explain voter turnout $Y_{i,t}$ in county i and year t. Let $X_{i,t}$ be 1 if county i in year t required registration before voting, else 0; let $Z_{i,t}$ be a 1×p vector of control variables. The authors consider two regression models. The first is (24) $Y_{i,t} = \alpha + \beta X_{i,t} +$
An investigator fits a regression model $Y = X\beta + \varepsilon$ to the data, and draws causal inferences from $\hat\beta$. A critic suggests that β may vary from one data point to another. According to a third party, the critique, even if correct, only means there is “unmodeled heterogeneity.” (a) Why would
A prominent social scientist describes the process of choosing a model specification as follows. “We begin with a specification that is suggested by prior theory and the question that is being addressed. Then we fit the model to the data. If this produces no useful results, we modify the
A ______ is assumed going into the data analysis; a ______ is estimated from the data analysis. Options: (i) response schedule (ii) regression equation
Causation follows from the ______; estimated effects follow from fitting the ______ to the data. Options: (i) response schedule (ii) regression equation
True or false: the causal effect of X on Y is demonstrated by doing something to the data with the computer. If true, what is the something? If false, what else might you need? Explain briefly.
What is the exogeneity assumption?
Suppose the exogeneity assumption holds. Can you use the data to show that a response schedule is false? Usually? Sometimes? Hardly ever? Explain briefly.
Suppose the exogeneity assumption holds. Can you use the data to show that a response schedule is true? Usually? Sometimes? Hardly ever? Explain briefly.
How would you answer questions 18 and 19 if the exogeneity assumption itself were doubtful?
Gilens (2001) proposes a logit model to explain the effect of general political knowledge on policy preferences. The equation reported in the paper is prob(Yi = 1) = α + βGi + Xiγ + Ui, where i indexes subjects; Yi = 1 if subject i favors a certain policy and Yi = 0 otherwise; Gi measures
Mamaros and Sacerdote (2006) look at variables determining volume of email. Their study population consists of students and recent graduates of Dartmouth; the study period year is one academic year. Let Yij be the number of emails exchanged between person i and person j , while Xi is a 1×p vector
Suppose $Y_i = a + bZ_i + cW_i + \varepsilon_i$, where the $\varepsilon_i$ are IID with expectation 0 and variance $\sigma^2$. However, $W_i$ may be endogenous. Assume that $Z_i$ = 0 or 1 has been assigned at random, so the Z’s are independent of the W’s and ε’s. Let $\hat b$ be the coefficient of Z when the equation is estimated by OLS.
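Which of the following matrices are positive definite? non-negative definite? $\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$. Hint: work out $(u \ v)\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} = (u \ v)\left[\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix}\right]$.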
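The quadratic form in the hint can also be examined numerically. The sketch below reads the sixteen numbers in the exercise as four 2×2 matrices in row-major order, which is an assumption about the original layout. It treats definiteness as a property of the quadratic form $x'Mx$; whether the textbook also requires symmetry is not settled here. Because $x'Mx = x'\,\tfrac12(M + M')\,x$, the eigenvalues of the symmetric part decide the sign. Function names are hypothetical.

```python
import math

def symmetric_part_eigenvalues(m):
    """Eigenvalues of (M + M')/2 for a 2x2 matrix M = [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    off = (b + c) / 2.0                       # off-diagonal entry of the symmetric part
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, off)
    return mean - radius, mean + radius

def classify(m, tol=1e-12):
    """Classify the quadratic form x'Mx by its smallest eigenvalue."""
    smallest, _ = symmetric_part_eigenvalues(m)
    if smallest > tol:
        return "positive definite"
    if smallest >= -tol:
        return "non-negative definite"
    return "neither"

# Assumed reading of the four matrices in the exercise.
matrices = [
    [[2, 0], [0, 1]],
    [[2, 0], [0, 0]],
    [[0, 1], [1, 0]],
    [[0, 0], [1, 0]],
]
for m in matrices:
    print(m, classify(m))
```

A positive definite matrix is also non-negative definite, so the first label implies the second.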
Suppose X is an n×p matrix with rank p ≤ n. (a) Show that $X'X$ is p×p positive definite. Hint: if c is p×1, what is $c'X'Xc$? (b) Show that $XX'$ is n×n non-negative definite. For exercises 3–6, suppose R is an n×n orthogonal matrix and D is an n×n diagonal matrix, with $D_{ii} > 0$ for all i. Let G =
Show that $\|Rx\| = \|x\|$ for any n×1 vector x.
Show that D and G are positive definite.
Let $\sqrt{D}$ be the n×n matrix whose ij-th element is $\sqrt{D_{ij}}$. Show that $\sqrt{D}\sqrt{D} = D$. Show also that $R\sqrt{D}R'\,R\sqrt{D}R' = G$.
Let $D^{-1}$ be the matrix whose ij-th element is 0 for i ≠ j, while the ii-th element is $1/D_{ii}$. Show that $D^{-1}D = I_{n\times n}$ and $RD^{-1}R'G = I_{n\times n}$.
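The identities in these exercises can be spot-checked on a small case. The sketch below assumes $G = RDR'$, which is inferred from the identities themselves because the definition is cut off on this page. It uses a 2×2 rotation for R and hypothetical diagonal entries; all helper names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def norm(v):
    """Euclidean length of a column vector stored as a list of one-element rows."""
    return math.sqrt(sum(row[0] ** 2 for row in v))

theta = 0.7                                      # hypothetical rotation angle
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]         # orthogonal: R'R = I
Rt = transpose(R)
d = [4.0, 9.0]                                   # hypothetical positive diagonal entries
G = matmul(matmul(R, diag(d)), Rt)               # assumed definition G = R D R'

root = matmul(matmul(R, diag([math.sqrt(v) for v in d])), Rt)   # R sqrt(D) R'
inverse = matmul(matmul(R, diag([1.0 / v for v in d])), Rt)     # R D^-1 R'
x = [[3.0], [-1.0]]

print(close(matmul(root, root), G))              # square root property
print(close(matmul(inverse, G), diag([1.0, 1.0])))
print(abs(norm(matmul(R, x)) - norm(x)) < 1e-12) # R preserves length
```

A check on one example is not a proof, but it catches a misplaced transpose quickly.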
Suppose G is positive definite. Show that: (a) G is invertible and $G^{-1}$ is positive definite. (b) G has a positive definite square root $G^{1/2}$. (c) $G^{-1}$ has a positive definite square root $G^{-1/2}$.
Let U be a random 3×1 vector. Show that cov(U) is non-negative definite, and positive definite unless there is a 3×1 fixed (i.e., nonrandom) vector c such that $c'U = c'E(U)$ with probability 1. Hints. Can you compute $\mathrm{var}(c'U)$ from cov(U)? If that hint isn’t enough, try the case $E(U) = 0_{3\times 1}$.
Suppose G is n×n non-negative definite, and α is n×1. (a) Find an n×1 vector U of normal random variables with mean 0 and cov(U) = G. Hint: let V be an n×1 vector of independent N(0, 1) variables, and let $U = G^{1/2}V$. (b) How would you modify the construction to get E(U) = α?
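The construction in the hint can be simulated. The sketch below is limited to n = 2 so that $G^{1/2}$ has a closed form, and it shifts by α as one natural way to handle part (b). The matrix G, the vector α, the seed, and the function names are hypothetical choices for illustration.

```python
import math
import random

def sqrt_2x2(g):
    """Symmetric square root of a 2x2 non-negative definite matrix:
    G^(1/2) = (G + s I) / t, with s = sqrt(det G) and t = sqrt(trace G + 2 s).
    Needs t > 0, so G must not be the zero matrix."""
    (a, b), (c, d) = g
    s = math.sqrt(max(a * d - b * c, 0.0))
    t = math.sqrt(a + d + 2.0 * s)
    return [[(a + s) / t, b / t], [c / t, (d + s) / t]]

def draw(g_root, alpha, rng):
    """One draw of U = alpha + G^(1/2) V, where V has independent N(0, 1) entries."""
    v = [rng.gauss(0.0, 1.0) for _ in alpha]
    return [alpha[i] + sum(g_root[i][k] * v[k] for k in range(len(v)))
            for i in range(len(alpha))]

G = [[2.0, 1.0], [1.0, 2.0]]        # hypothetical covariance matrix
alpha = [1.0, -1.0]                 # hypothetical mean vector
rng = random.Random(0)
root = sqrt_2x2(G)
draws = [draw(root, alpha, rng) for _ in range(20000)]

n = len(draws)
means = [sum(u[i] for u in draws) / n for i in range(2)]
cov = [[sum((u[i] - means[i]) * (u[j] - means[j]) for u in draws) / n
        for j in range(2)] for i in range(2)]
print(means)
print(cov)
```

The sample mean and sample covariance should land near α and G, up to simulation error.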
Suppose R is an orthogonal n×n matrix. If U is an n×1 vector of IID $N(0, \sigma^2)$ variables, show that RU is an n×1 vector of IID $N(0, \sigma^2)$ variables. Hint: what is E(RU)? cov(RU)? (“IID” is shorthand for “independent and identically distributed.”)
Suppose ξ and ζ are two random variables. If E(ξζ) = E(ξ)E(ζ), are ξ and ζ independent? What about the converse: if ξ and ζ are independent, is E(ξζ) = E(ξ)E(ζ)?
If U and V are random variables, show that cov(U, V) = cov(V, U) and var(U + V) = var(U) + var(V) + 2cov(U, V). Hint: what is $[(U - \alpha) + (V - \beta)]^2$?
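The hint can be followed through in a few lines. This is a sketch that takes α = E(U) and β = E(V), which the exercise leaves implicit, and assumes both variances are finite.

```latex
\begin{align*}
[(U-\alpha)+(V-\beta)]^2 &= (U-\alpha)^2 + (V-\beta)^2 + 2(U-\alpha)(V-\beta),\\
\operatorname{var}(U+V) &= E\{[(U-\alpha)+(V-\beta)]^2\}\\
&= \operatorname{var}(U) + \operatorname{var}(V) + 2\operatorname{cov}(U,V).
\end{align*}
```

Symmetry of the covariance follows because $(U-\alpha)(V-\beta) = (V-\beta)(U-\alpha)$.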
Suppose ξ and ζ are jointly normal variables, with E(ξ) = α, var(ξ) = $\sigma^2$, E(ζ) = β, var(ζ) = $\tau^2$, and cov(ξ, ζ) = ρστ. Find the mean and variance of ξ + ζ. Is ξ + ζ normal?
A coin is tossed 1000 times. Use the central limit theorem to approximate the chance of getting 475–525 heads (inclusive).
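The approximation can be carried out with the standard library. The sketch below assumes a fair coin and applies a continuity correction, which the exercise does not mention explicitly. It also computes the exact binomial chance for comparison.

```python
import math
from statistics import NormalDist

n, p = 1000, 0.5                       # fair coin assumed
mean = n * p                           # expected number of heads
sd = math.sqrt(n * p * (1 - p))        # SE for the number of heads

# Normal curve matched to the binomial; 475..525 inclusive becomes 474.5 to 525.5.
curve = NormalDist(mean, sd)
approx = curve.cdf(525.5) - curve.cdf(474.5)

# Exact binomial chance, for comparison.
exact = sum(math.comb(n, k) for k in range(475, 526)) / 2 ** n

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

Dropping the continuity correction (using 475 and 525 directly) gives a slightly smaller value.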
A box has red marbles and blue marbles. The fraction p of reds is unknown. 250 marbles are drawn at random with replacement, and 102 turn out to be red. Estimate p. Attach a standard error to your estimate.
Let pˆ be the estimator in exercise 7.(a) About how big is the difference between pˆ and p?(b) Can you find an approximate 95% confidence interval for p?
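The arithmetic for the marble exercises fits in a few lines. The sketch below uses the plug-in standard error $\sqrt{\hat p(1-\hat p)/n}$ and a normal-curve multiplier of 1.96 for the interval. Both are standard choices, though the textbook may round the multiplier to 2.

```python
import math

draws, reds = 250, 102

p_hat = reds / draws                               # estimated fraction of reds
se = math.sqrt(p_hat * (1 - p_hat) / draws)        # plug-in SE, draws made with replacement
ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)        # approximate 95% confidence interval

print(f"p-hat = {p_hat:.3f}, SE = {se:.3f}")
print(f"approximate 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The SE describes the likely size of the difference between the estimate and p, which is the point of part (a).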
The “error function” Φ is defined as follows: $\Phi(x) = \frac{2}{\sqrt{\pi}}\int_0^x \exp(-u^2)\,du$. Show that Φ is the distribution function of |W|, where W is $N(0, \sigma^2)$. Find $\sigma^2$. If Z is N(0, 1), how would you compute P(Z < x) from Φ?
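A numerical check can support the derivation. The sketch below uses Python's `math.erf` for the error function and tries the candidate value σ² = 1/2. That value is put forward here for testing and is not given in the exercise. The helper name is hypothetical.

```python
import math
from statistics import NormalDist

W = NormalDist(mu=0.0, sigma=math.sqrt(0.5))   # trial value: sigma^2 = 1/2
Z = NormalDist()                               # standard normal

def cdf_from_erf(x):
    """P(Z < x) written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for x in (0.25, 1.0, 2.5):
    prob_abs_w = W.cdf(x) - W.cdf(-x)          # P(|W| <= x)
    print(x, math.erf(x), prob_abs_w, cdf_from_erf(x), Z.cdf(x))
```

In each row the second and third columns agree, as do the fourth and fifth, up to rounding.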
If U,V are IID N (0, 1), show that (U + V )/√2, (U − V )/√2 are IID N (0, 1).
In the regression model of section 1, one of the following is always true and the other is usually false. Which is which, and why? (i) $\varepsilon \perp X$ (ii) $\varepsilon \perp\!\!\!\perp X$
In the regression model of section 1, one of the following is always true and the other is usually false. Which is which, and why? (i) $e \perp X$ (ii) $e \perp\!\!\!\perp X$
Does e ⊥ X help validate assumption (5)?
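Suppose the first column of X is all 1’s, so the regression equation has an intercept. (a) Show that $\sum_i e_i = 0$. (b) Does $\sum_i e_i = 0$ help validate assumption (4)? (c) Is $\sum_i \varepsilon_i = 0$? Or is $\sum_i \varepsilon_i$ around $\sigma\sqrt{n}$ in size?
Show that (i) $E(\varepsilon'\varepsilon \mid X) = n\sigma^2$ and (ii) $\mathrm{cov}(\varepsilon \mid X) = E(\varepsilon\varepsilon' \mid X) = \sigma^2 I_{n\times n}$.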
How is column 2 in table 2.1 related to the regression model for Hooke’s law? (Cross-references: table 2.1 is table 1 in chapter 2.)
Yule’s regression model (1.1) for pauperism can be translated into matrix notation: $Y = X\beta + \varepsilon$. We assume (3)-(4)-(5). For the metropolitan unions and the period 1871–81: (a) What are X and Y? (Hint: look at table 1.3.) (b) What are the observed values of $X_{41}$? $X_{42}$? $Y_4$? (c) Where do we look in
True or false: E(Yi|X) = Xiβ.
True or false: the sample mean of the $Y_i$’s is $\bar Y = n^{-1}\sum_{i=1}^n Y_i$. Is $\bar Y$ a random variable?
True or false: $\mathrm{var}(Y_i|X) = \sigma^2$.
True or false: the sample variance of the $Y_i$’s is $n^{-1}\sum_{i=1}^n (Y_i - \bar Y)^2$. (If you prefer to divide by n − 1, that’s OK too.) Is this a random variable?
Conditionally on X, show that the joint distribution of the random vectors $(\hat\beta - \beta, e)$ is the same for all values of β. Hint: express $(\hat\beta - \beta, e)$ in terms of X and ε.
Can you put standard errors on the estimated coefficients in Yule’s equation (1.2)? Explain briefly. Hint: see exercise A7.
In section 2.3, we estimated the intercept and slope for Hooke’s law. Can you put standard errors on these estimates? Explain briefly.
Here are two equations: (i) $Y = X\beta + \varepsilon$ (ii) $Y = X\hat\beta + e$. Which is the regression model? Which equation has the parameters and which has the estimates? Which equation has the random errors? Which has the residuals?
We use the OLS estimator $\hat\beta$ in the usual regression model, and the unbiased estimator of variance $\hat\sigma^2$. Which of the following statements are true, and why? (i) $\mathrm{cov}(\beta) = \sigma^2(X'X)^{-1}$. (ii) $\mathrm{cov}(\hat\beta) = \sigma^2(X'X)^{-1}$. (iii) $\mathrm{cov}(\hat\beta|X) = \sigma^2(X'X)^{-1}$. (iv) $\mathrm{cov}(\hat\beta|X) = \hat\sigma^2(X'X)^{-1}$. (v) cov
True or false, and explain. (a) If you fit a regression equation to data, the sum of the residuals is 0. (b) If the equation has an intercept, the sum of the residuals is 0.
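The contrast between parts (a) and (b) shows up on a small example. The sketch below fits a line by least squares twice, once with an intercept and once through the origin, and sums the residuals each time. The five data points and the function names are hypothetical.

```python
def residuals_with_intercept(x, y):
    """Residuals from the least-squares fit of y on (1, x)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
             / sum((xi - xbar) ** 2 for xi in x))
    intercept = ybar - slope * xbar
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]

def residuals_through_origin(x, y):
    """Residuals from the least-squares fit of y on x alone (no intercept)."""
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    return [yi - slope * xi for xi, yi in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical data
y = [2.1, 3.9, 6.2, 7.8, 10.1]

sum_with = sum(residuals_with_intercept(x, y))
sum_without = sum(residuals_through_origin(x, y))
print(f"with intercept:    {sum_with:.2e}")
print(f"through the origin: {sum_without:.2e}")
```

With an intercept the sum is zero up to rounding error. Without one, nothing forces the residuals to sum to zero, and here they do not.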
True or false, and explain. (a) In the regression model, $E(\hat Y|X) = X\hat\beta$. (b) In the regression model, $E(\hat Y|X) = X\beta$. (c) In the regression model, $E(Y|X) = X\beta$.
If X is n×n with rank n, show that $X(X'X)^{-1}X' = I_{n\times n}$, so $\hat Y = Y$. Hint: is X invertible?
Suppose there is an intercept in the regression model (1), so the first column of X is all 1’s. Let $\bar Y$ be the mean of Y. Let $\bar X$ be the mean of X, column by column. Show that $\bar Y = \bar X\hat\beta$.
Let $\hat\beta$ be the OLS estimator in (1), where the design matrix X has full rank p
(Hard.) Suppose $Y_i = a + bX_i + \varepsilon_i$ for i = 1,...,n, the $\varepsilon_i$ being IID with mean 0 and variance $\sigma^2$, independent of the $X_i$. (Reminder: IID stands for “independent and identically distributed.”) Equation (2.5) expressed $\hat a$, $\hat b$ in terms of five summary statistics: two means, two SDs, and r. Derive
In the OLS regression model: (a) Is it the residuals that are independent from one subject to another, or the random errors? (b) Is it the residuals that are independent of the explanatory variables, or the random errors? (c) Is it the vector of residuals that is orthogonal to the column space of
In the OLS regression model, do the residuals always have mean 0? Discuss briefly.
True or false, and explain. If, after conditioning on X, the disturbance terms in a regression equation are correlated with each other across subjects, then: (a) the OLS estimates are likely to be biased; (b) the estimated standard errors are likely to be biased.
An OLS regression model is defined by equation (2), with assumptions (4) and (5) on the ε’s. Are the $Y_i$ independent? identically distributed? Discuss briefly.