Question:
Generate data where selection bias is present using the following code.

    set.seed(1)
    n = 1000
    error = rnorm(n)
    x = rnorm(n)
    t = ifelse(error+x<0,1,0)
    y = 2*t + x + error

Here n is the sample size, x is a predictor variable, t is a binary variable that indicates the presence of a treatment, and y is a continuous dependent variable.

(a) What is the true value of the treatment effect parameter used to generate your data?
(b) In your data, does selection bias overestimate or underestimate the true treatment effect?
(c) Explain your answer to part (b) by examining the relationship between the treatment variable and the error term. (Hint: plot t against the error term.)
(d) Which assumption about regression models is being violated here?
(e) Is there a difference in the average value of x between the treatment and control
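
One way to start on parts (a) through (e) is to run the simulation and compute a few diagnostic quantities, as in the sketch below. The data-generating lines are taken from the question. The names naive_est, reg_est, cor_t_error, and mean_x_by_group are illustrative choices and not part of the original exercise. Comparing a raw difference in means with a regression of y on t and x is only one reasonable way to look at the bias, not necessarily the approach the exercise intends.

```r
# Reproduce the data-generating process given in the question.
set.seed(1)
n <- 1000
error <- rnorm(n)
x <- rnorm(n)
t <- ifelse(error + x < 0, 1, 0)   # treatment depends on both x and the error
y <- 2 * t + x + error             # the coefficient on t in this line is 2

# Naive estimate (illustrative): raw difference in mean outcomes by group.
naive_est <- mean(y[t == 1]) - mean(y[t == 0])

# Regression estimate (illustrative): coefficient on t, controlling for x.
fit <- lm(y ~ t + x)
reg_est <- unname(coef(fit)["t"])

# Part (c): association between treatment and the error term.
cor_t_error <- cor(t, error)
plot(error, t, xlab = "error", ylab = "t",
     main = "Treatment indicator against the error term")

# Part (e): average value of x in the control (0) and treatment (1) groups.
mean_x_by_group <- tapply(x, t, mean)

print(c(naive = naive_est, regression = reg_est))
print(cor_t_error)
print(mean_x_by_group)
```

Comparing the printed estimates with the coefficient on t in the line that generates y shows the direction of the bias. The correlation and the plot show whether treatment assignment is related to the error term, which is the relationship that parts (c) and (d) ask about.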