Question: Some variables used in a multiple regression model are, in fact, not associated with the response. Including such irrelevant variables leads to unnecessary complexity in the resulting model. There are several major approaches to mitigating high dimensionality by removing these variables; among them, Best Subset Selection is used for variable selection. Please explain why we use adjusted $R^2$ instead of the $R^2$ statistic to assess the performance of models trained on different combinations of variables.

$$R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}$$

$$\text{Adjusted } R^2 = 1 - \frac{\mathrm{RSS}/(n-d-1)}{\mathrm{TSS}/(n-1)}$$

Here RSS is the residual sum of squares, TSS is the total sum of squares, n is the number of observations, and d is the number of variables in the model.
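The difference between the two statistics can be checked numerically. The following is a minimal sketch on synthetic data, not a reference implementation: the helper name `r2_and_adjusted_r2`, the sample size, the coefficients, and the split into two relevant and eight irrelevant variables are all assumptions made for illustration. It fits nested least-squares models of increasing size and reports both statistics for each.

```python
import numpy as np

def r2_and_adjusted_r2(X, y):
    """Fit OLS with an intercept and return (R^2, adjusted R^2)."""
    n, d = X.shape
    design = np.column_stack([np.ones(n), X])       # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = np.sum((y - design @ beta) ** 2)          # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)               # total sum of squares
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (rss / (n - d - 1)) / (tss / (n - 1))
    return r2, adj_r2

# Synthetic data (assumed setup): 2 relevant variables, 8 irrelevant ones.
rng = np.random.default_rng(0)
n = 50
signal = rng.normal(size=(n, 2))
noise = rng.normal(size=(n, 8))                     # unrelated to the response
y = 3.0 * signal[:, 0] - 2.0 * signal[:, 1] + rng.normal(size=n)

# Nested models: the first d columns, for d = 1, ..., 10.
X_all = np.column_stack([signal, noise])
results = [r2_and_adjusted_r2(X_all[:, :d], y)
           for d in range(1, X_all.shape[1] + 1)]

for d, (r2, adj) in enumerate(results, start=1):
    print(f"d={d:2d}  R2={r2:.4f}  adjusted R2={adj:.4f}")
```

Because each larger model contains the smaller one, RSS cannot increase when a variable is added, so $R^2$ never decreases and would always favor the model with the most variables. Adjusted $R^2$ divides RSS by n - d - 1, so it rises only when the added variable lowers RSS by enough to offset the lost degree of freedom; an irrelevant variable typically fails to do so, although on a particular sample it may still produce a small increase by chance.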
