Question:
I need help with this question. Thank you.

This question will discuss the consequences of non-representative sampling (recall the Dewey versus Truman example) and how we can correct for it in certain cases by reweighting the sample population. Suppose each individual $i$ in the population of interest is characterized by an outcome of interest $Y_i$ (e.g. did you vote for Truman), some observable characteristics $X_i$ (e.g. income), and an indicator $S_i$ for whether they are part of the surveyed population (e.g. do you have a phone in 1948). We are interested in $E[Y_i]$, but even if we took an infinitely large survey, we would only directly measure $E[Y_i \mid S_i = 1]$, since people with $S_i = 0$ are never surveyed.

(a) Suppose that $S_i \perp Y_i$. Argue that $E[Y_i \mid S_i = 1] = E[Y_i]$. Why might independence of $Y_i$ and $S_i$ be a bad assumption, for example in the Dewey versus Truman setting?

(b) An alternative assumption is that $S_i \perp Y_i \mid X_i$. Describe, intuitively, what this assumption means in the context of Dewey versus Truman (1-2 sentences). In the Dewey versus Truman example, why might this assumption be more plausible than full independence?

(c) Suppose that $S_i \perp Y_i \mid X_i$. For simplicity, assume that $E[Y_i \mid X_i = x] = \beta x$. Derive an expression for the bias from the sampling design, $\text{bias} = E[Y_i \mid S_i = 1] - E[Y_i]$, as a function of $\beta$, $E[X_i \mid S_i = 1]$, and $E[X_i]$. Describe in words what your answer tells us about when the bias will be large or small in the Dewey versus Truman example.
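One way to sanity-check the algebra in part (c) is a small simulation. Under conditional independence and a linear conditional mean, the law of iterated expectations gives E[Y|S=1] = β·E[X|S=1] and E[Y] = β·E[X], so the bias should equal β·(E[X|S=1] − E[X]). The sketch below is only an illustration under assumptions that are not in the question: X is uniform on [0, 1], Y is a binary outcome with P(Y=1|X=x) = βx, the survey probability is P(S=1|X=x) = 0.2 + 0.6x, and the function name `simulate_bias` is hypothetical.

```python
import random


def simulate_bias(beta=0.8, n=200_000, seed=0):
    """Return (direct_bias, formula_bias) for one simulated population.

    Toy assumptions (not from the question):
      X ~ Uniform(0, 1)                 observable characteristic
      P(Y = 1 | X = x) = beta * x       so E[Y | X = x] = beta * x
      P(S = 1 | X = x) = 0.2 + 0.6 * x  selection depends on X only
    Y and S are drawn independently given X, so S is independent of Y given X.
    """
    rng = random.Random(seed)
    sum_x = sum_y = 0.0      # totals over the whole population
    sum_x_s = sum_y_s = 0.0  # totals over surveyed individuals (S = 1)
    n_s = 0

    for _ in range(n):
        x = rng.random()
        y = 1.0 if rng.random() < beta * x else 0.0
        surveyed = rng.random() < 0.2 + 0.6 * x

        sum_x += x
        sum_y += y
        if surveyed:
            sum_x_s += x
            sum_y_s += y
            n_s += 1

    # Direct estimate of E[Y | S = 1] - E[Y].
    direct_bias = sum_y_s / n_s - sum_y / n
    # Formula from part (c): beta * (E[X | S = 1] - E[X]).
    formula_bias = beta * (sum_x_s / n_s - sum_x / n)
    return direct_bias, formula_bias


direct_bias, formula_bias = simulate_bias()
print(f"direct bias:  {direct_bias:.4f}")
print(f"formula bias: {formula_bias:.4f}")
```

In this toy setup E[X] = 0.5 and E[X|S=1] = 0.6, so both numbers should land near 0.8 × 0.1 = 0.08, up to sampling noise. The bias shrinks toward zero either when β is small (the characteristic barely predicts the vote) or when the surveyed group resembles the population on X.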
