Question: Homework 7-F2024.pdf
Problem 1: (50 pts)
Use the neth.csv data file for Problem 1. Your goal is to estimate a linear regression model of the number of weekly trips per household as a function of the remaining variables (to the extent possible). Interpret each result in your final model specification. When creating indicator variables, feel free to be as creative as you want, and likewise when deciding whether to log-transform any variables. Variable creation will be considered during grading. When fitting your linear regression model, complete the following steps:
a) Get familiar with your data (e.g., what variables are included, what format do the variables take, are there variables that need to have indicators created, etc.).
b) Prepare data for modeling by creating indicator variables and/or log-transforming variables. You can use some creativity here, and are encouraged to. Explain why you chose to create the indicators you did.
c) Using a forward stepwise process, estimate your best fit linear regression model. Your Notebook file should show each of your steps. As you go through the process, you should also be considering the following:
Is heteroskedasticity a concern? Do you need to test for it? Can you estimate your model in such a way that you do not have to worry about heteroskedasticity?
Ensure the correlation among explanatory variables in your model meets the required thresholds. You can use a correlation matrix or compute variance inflation factors after each step.
Do the signs of the parameters make sense? In other words, does having a negative or positive effect on y make sense (i.e., practical significance)?
d) Once you've arrived at your best fit model specifications, check the following assumptions:
Is the mean of the residuals zero? Compute the mean of the error and plot a distribution of the errors. What can you say based on the mean value and the distribution?
Plot the residuals vs. fitted values. What does this plot tell you? How did you account for it?
e) Plot the actual values vs. the fitted values. What does this plot tell you about your model?
f) Interpret all of the variables in your final model specifications and provide a brief plausible explanation for the effects you find.
g) Provide a summary that includes the logical process that led you to your final model specifications.
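As a rough illustration of step c), the sketch below runs a forward stepwise search that adds one variable at a time by adjusted R^2 and skips any candidate that would push a variance inflation factor above a threshold. It is only a sketch under stated assumptions: the data are synthetic (not neth.csv), the helper names fit_ols, vif, and forward_stepwise are hypothetical, and the VIF cutoff of 5 and the minimum gain of 0.001 are placeholder choices, not thresholds given in the assignment.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y on X plus an intercept; return (coefficients, residuals, adjusted R^2)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    adj_r2 = 1.0 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))
    return beta, resid, adj_r2

def vif(X):
    """Variance inflation factor of each column, from regressing it on the others."""
    out = []
    for j in range(X.shape[1]):
        _, resid, _ = fit_ols(np.delete(X, j, axis=1), X[:, j])
        r2 = 1.0 - float(resid @ resid) / float(((X[:, j] - X[:, j].mean()) ** 2).sum())
        out.append(1.0 / max(1.0 - r2, 1e-12))
    return np.array(out)

def forward_stepwise(X, y, names, max_vif=5.0, min_gain=1e-3):
    """Add variables one at a time while adjusted R^2 improves by at least min_gain."""
    selected = []
    best = 0.0  # adjusted R^2 of the intercept-only model
    while True:
        candidates = []
        for j in range(X.shape[1]):
            if j in selected:
                continue
            cols = selected + [j]
            # Skip candidates that are too collinear with the current model.
            if len(cols) > 1 and vif(X[:, cols]).max() > max_vif:
                continue
            candidates.append((fit_ols(X[:, cols], y)[2], j))
        if not candidates:
            break
        score, j = max(candidates)
        if score - best < min_gain:
            break
        selected.append(j)
        best = score
    return [names[j] for j in selected], best

# Synthetic stand-in data (assumption): trips depend on hhsize and ncar only.
rng = np.random.default_rng(0)
n = 400
hhsize = rng.integers(1, 7, n)
ncar = rng.integers(0, 3, n)
unrelated = rng.normal(size=n)
X = np.column_stack([hhsize, ncar, unrelated]).astype(float)
y = 5 + 3 * hhsize + 4 * ncar + rng.normal(scale=2.0, size=n)
names = ["hhsize", "ncar", "unrelated"]

selected, adj_r2 = forward_stepwise(X, y, names)
_, resid, _ = fit_ols(X[:, [names.index(s) for s in selected]], y)
print("selected:", selected)
print("adjusted R^2:", round(adj_r2, 3))
print("mean residual:", resid.mean())
```

Because every fit includes an intercept, the mean residual is zero up to floating-point error, so the first check in step d) mostly confirms that the model was fitted as intended. The residual distribution and the residuals vs. fitted plot carry the real diagnostic information.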
Variable definitions are given on the following page.
| Variable | Definition |
| --- | --- |
| hhsize | Household size |
| ncar | Number of cars in household |
| workers | Number of workers in household |
| students | Number of students in household |
| wktrips | Number of weekly trips per household |
| child112 | Number of children less than 12 years old in household |
| city | Household residence in city (1 if yes, 0 otherwise) |
| suburb | Household residence in suburb (1 if yes, 0 otherwise) |
| rural | Household residence in rural area (1 if yes, 0 otherwise) |
| income | Household income |
| childg12 | Number of children greater than or equal to 12 years old in household |
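One possible way to approach step b) with these variables is sketched below. The rows are made-up placeholders, not values from neth.csv, and the new columns has_car, has_young_child, and log_income are hypothetical examples of indicators and transformations, not ones the assignment prescribes. The sketch also assumes that city, suburb, and rural are mutually exclusive and that income is strictly positive; both should be checked against the real data.

```python
import numpy as np
import pandas as pd

# Placeholder rows (assumption); the real data would come from pd.read_csv("neth.csv").
df = pd.DataFrame({
    "hhsize":   [1, 4, 2, 5],
    "ncar":     [0, 2, 1, 1],
    "workers":  [1, 2, 0, 2],
    "students": [0, 1, 0, 2],
    "wktrips":  [6, 22, 9, 30],
    "child112": [0, 1, 0, 2],
    "city":     [1, 0, 0, 0],
    "suburb":   [0, 1, 0, 1],
    "rural":    [0, 0, 1, 0],
    "income":   [18000, 52000, 27000, 61000],
    "childg12": [0, 1, 0, 1],
})

# If the three location dummies sum to 1 in every row, including all of them
# alongside an intercept is perfectly collinear, so one is dropped as the reference.
one_location = (df[["city", "suburb", "rural"]].sum(axis=1) == 1).all()
print("exactly one location per household:", one_location)

y = df["wktrips"]
X = df.drop(columns=["wktrips", "rural"])  # rural becomes the reference category

# Hypothetical indicators: any car at all, any child under 12.
X["has_car"] = (df["ncar"] > 0).astype(int)
X["has_young_child"] = (df["child112"] > 0).astype(int)

# Log-transform income (assumes strictly positive values) and drop the raw column.
X["log_income"] = np.log(df["income"])
X = X.drop(columns=["income"])

print(X.columns.tolist())
```

Keeping both ncar and has_car in the candidate pool is deliberate here: the stepwise search and the collinearity check can then decide which form, if either, belongs in the final model.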