Question:

1. Dummy code the "Private" variable. (2 points)

2. Generate box plots of the following attributes and identify the cutoff values for outliers. [4 points: remove outliers]
   - accept (number of applications accepted) (2 points)
   - top10perc (% of new students from the top 10% of their high school class) (2 points)
   - enroll (number of new students enrolled) (2 points)

3. Try to fit an MLR to this dataset, with ENROLL as the dependent variable.
   - P_UNDERGRAD has a somewhat long tail, so take a log transform, LP_UNDERGRAD = log(P_UNDERGRAD), and use LP_UNDERGRAD as one of the predictors. (6 points)
   - Keep the first 544 records as a training set (call it ENROLLTRAIN), which you will use to fit the model; the remaining 233 records will be used as a test set (ENROLLTEST). (6 points)
   - Use only the following variables in your model: ENROLL = ACCEPT + TOP10PERC + F_UNDERGRAD + LP_UNDERGRAD + ROOM_BOARD + GRADE_RATE + PRIVATEDUMMY (6 points)

   (a) Report the coefficients obtained by your model. Would you drop any of the variables used in your model (based on the t-scores or p-values)? (10 points)
   (b) Report the MSE obtained on ENROLLTRAIN. How much does this increase when you score your model on ENROLLTEST? (10 points)
   (c) Do you think your MLR model is reasonable for this problem? You may look at the distribution of residuals to provide an informed answer. (Bonus 2 points)
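The steps in the question can be sketched roughly as follows. This is only an illustration under stated assumptions, not the expected answer: it uses NumPy alone, the helper names (`dummy_code`, `iqr_fences`, `fit_ols`, `mse`) are hypothetical, the "Yes"/"No" encoding of the Private column is assumed, and the outlier cutoffs use the conventional 1.5 × IQR box-plot rule, which may or may not be the rule the course intends. The dataset is not included here, so the demo runs on synthetic placeholder values with only three of the seven predictors; any numbers it prints say nothing about the real data.

```python
import numpy as np

def dummy_code(private):
    """Map 'Yes'/'No' labels to 1.0/0.0 (the label encoding is an assumption)."""
    return (np.asarray(private) == "Yes").astype(float)

def iqr_fences(x, k=1.5):
    """Box-plot outlier cutoffs: (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, q3 = np.percentile(np.asarray(x, dtype=float), [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, t-scores)."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = A.shape[0] - A.shape[1]              # n minus number of coefficients
    sigma2 = resid @ resid / dof               # unbiased residual variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return beta, beta / se

def mse(X, y, beta):
    """Mean squared error of the fitted model on (X, y)."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    return float(np.mean((y - A @ beta) ** 2))

# Synthetic placeholder data (NOT the real dataset), sized 544 + 233 = 777.
rng = np.random.default_rng(0)
n = 777
accept = rng.lognormal(7, 1, n)
p_undergrad = rng.lognormal(6, 1.2, n)
private = rng.choice(["Yes", "No"], n)
enroll = 0.35 * accept + rng.normal(0, 50, n)

print("accept outlier cutoffs:", iqr_fences(accept))

# Predictors: ACCEPT, LP_UNDERGRAD = log(P_UNDERGRAD), PRIVATEDUMMY.
X = np.column_stack([accept, np.log(p_undergrad), dummy_code(private)])
X_train, y_train = X[:544], enroll[:544]       # ENROLLTRAIN
X_test, y_test = X[544:], enroll[544:]         # ENROLLTEST

beta, t_scores = fit_ols(X_train, y_train)
print("coefficients (intercept first):", beta)
print("t-scores:", t_scores)
print("train MSE:", mse(X_train, y_train, beta))
print("test MSE:", mse(X_test, y_test, beta))
```

With the real data, the remaining predictors would be added as extra columns of `X` in the same way. A coefficient whose t-score is small in absolute value (commonly below about 2) is a candidate for dropping, and a histogram of the residuals `y_train - A @ beta` is the usual starting point for part (c).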
