7. We talked in class about selection bias in the context of hypothesis tests, but it affects many other aspects as well. For example, the least squares estimate, $\hat{\sigma}^2$, of the error variance $\sigma^2$ is unbiased under the classical setting, but it turns out that bias is created when data is used to first select a set of predictors.

(a) Let $n = 100$ and $p = 10$, and generate an $n \times p$ matrix $X$ filled with independent standard normal random variables. The columns of $X$ are your $p = 10$ available predictor variables; any or all of them can be used as part of a regression model. Suppose the true relationship between response and predictors is given by

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i, \quad i = 1, \dots, n,$$

that is, only the first three predictor variables matter. Generate a vector of response variables $y$ from the above model, where $\varepsilon_i \sim N(0, 1)$, using $\beta_0 = 0$, $\beta_1 = 1$, $\beta_2 = 2$, and $\beta_3 = 3$. Fit this true model to your simulated data using least squares and extract the estimate $\hat{\sigma}^2_{\text{true}}$ of $\sigma^2$.

(b) Select a subset of the 10 available predictor variables with lowest AIC. Fit the model with your selected set of predictor variables using least squares and extract the estimate $\hat{\sigma}^2_{\text{AIC}}$ of $\sigma^2$. Compare the values of $\hat{\sigma}^2_{\text{true}}$ and $\hat{\sigma}^2_{\text{AIC}}$.

(c) Write a loop that will repeat the above experiment 500 times. That is, for each experiment, you simulate the pair $(X, y)$ exactly as described above and then you produce the pair of estimates $(\hat{\sigma}^2_{\text{true}}, \hat{\sigma}^2_{\text{AIC}})$. Draw a scatterplot of the 500 pairs of estimates. What is the relationship between $\hat{\sigma}^2_{\text{true}}$ and $\hat{\sigma}^2_{\text{AIC}}$?

(d) What do you think are the implications of the relationship in Part (c)?
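The experiment in parts (a) to (c) can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the course's intended solution: the question does not say how the lowest-AIC subset should be found, so the sketch assumes an exhaustive search over all 2^10 subsets (a stepwise search would be a different choice), and it assumes the Gaussian AIC written as n log(RSS/n) + 2(k + 1), which ranks models the same way as the full expression. All function names (`simulate`, `fit_rss`, `sigma2_hat`, `best_aic_subset`, `one_experiment`, `run_experiments`) are hypothetical helpers introduced for illustration.

```python
import itertools
import numpy as np

N, P = 100, 10                      # sample size and number of candidate predictors
BETA = np.array([1.0, 2.0, 3.0])    # true slopes on the first three columns; intercept is 0


def simulate(rng):
    """Draw one (X, y) pair from the model in part (a)."""
    X = rng.standard_normal((N, P))
    y = X[:, :3] @ BETA + rng.standard_normal(N)
    return X, y


def fit_rss(X, y, cols):
    """Least squares fit of y on an intercept plus the given columns; returns the RSS."""
    design = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def sigma2_hat(X, y, cols):
    """Usual variance estimate RSS / (n - k - 1), with k predictors plus an intercept."""
    return fit_rss(X, y, cols) / (len(y) - len(cols) - 1)


def best_aic_subset(X, y):
    """Assumed selection rule: exhaustive search over all subsets for the lowest AIC."""
    n = len(y)
    best_cols, best_aic = (), np.inf
    for k in range(P + 1):
        for cols in itertools.combinations(range(P), k):
            # Gaussian AIC up to an additive constant that is the same for every subset
            aic = n * np.log(fit_rss(X, y, cols) / n) + 2 * (k + 1)
            if aic < best_aic:
                best_cols, best_aic = cols, aic
    return best_cols


def one_experiment(rng):
    """Return (true-model estimate, AIC-selected-model estimate) for one simulated data set."""
    X, y = simulate(rng)
    s2_true = sigma2_hat(X, y, (0, 1, 2))
    s2_aic = sigma2_hat(X, y, best_aic_subset(X, y))
    return s2_true, s2_aic


def run_experiments(n_reps=500, seed=0):
    """Repeat the experiment n_reps times; rows are (s2_true, s2_aic) pairs."""
    rng = np.random.default_rng(seed)
    return np.array([one_experiment(rng) for _ in range(n_reps)])
```

Calling `run_experiments()` returns a 500 by 2 array whose columns can be plotted against each other, with the 45-degree line added for reference. Each repetition fits 1,024 models, so the full run may take a minute or so. Points falling on or below that line are where a downward bias in the selected-model estimate would show up.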

