Question: using rstudio 4. Using the Boston data set of library MASS, fit classification models in order to predict whether a given suburb has a house

4. Using the Boston data set from the MASS library, fit classification models to predict whether a given suburb has a house value (medv) above or below the median (median(medv)). In particular, explore various KNN models.

(a) Create a medv01 variable, where medv01 = 0 if medv <= median(medv) and medv01 = 1 if medv > median(medv). Add it as a column in the Boston data frame, while disposing of the original medv variable (via Boston$medv = NULL).

(b) Decide on three values of K and three subsets of predictors (including the full set of all 13 predictors). This is entirely your judgment call. When picking predictor subsets, you could use your own logical considerations (which variables appear most important for predicting the price) or the correlation matrix (to determine whether there are groups of correlated variables, so that you could retain just one of them in the model and drop the rest).

(c) What stable method was introduced in class to compare the predictive quality of different models? Use this method to obtain test error estimates for all 3 x 3 = 9 models. Which model (the combination of K and predictor subset) won?

(d) Are all the variables in the Boston data set on the same scale? If not, how do we deal with it?

(e) Apply scaling to the predictors in the Boston data, and repeat part (c) for the standardized data set. Did the results (the test errors and the winning model) change?
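One possible way to approach parts (a) through (e) is sketched below in R. It is only an illustration under stated assumptions: the question does not say which "stable method" was taught, so 10-fold cross-validation is assumed here, and the three K values and the two smaller predictor subsets are arbitrary placeholders rather than recommended choices. The helper names cv_error and run_grid are hypothetical.

```r
library(MASS)   # provides the Boston data set
library(class)  # provides knn()

# (a) Binary response: 1 if medv is above its median, 0 otherwise; then drop medv.
boston <- Boston
boston$medv01 <- as.integer(boston$medv > median(boston$medv))
boston$medv <- NULL

# (b) Illustrative choices only (assumptions, not prescribed by the exercise).
all_preds <- setdiff(names(boston), "medv01")
k_values <- c(1, 5, 15)
subsets <- list(
  all13  = all_preds,
  small  = c("lstat", "rm", "ptratio"),
  medium = c("lstat", "rm", "ptratio", "nox", "crim", "dis")
)

# (c) Assumed method: 10-fold cross-validation. The same folds are reused
# for every model so that the nine error estimates are comparable.
set.seed(1)
folds <- sample(rep(1:10, length.out = nrow(boston)))

# Cross-validated misclassification rate for one (K, predictor subset) pair.
cv_error <- function(data, predictors, k, folds) {
  wrong <- logical(nrow(data))
  for (f in unique(folds)) {
    test <- folds == f
    pred <- knn(train = data[!test, predictors, drop = FALSE],
                test  = data[test, predictors, drop = FALSE],
                cl    = factor(data$medv01[!test]),
                k     = k)
    wrong[test] <- as.character(pred) != as.character(data$medv01[test])
  }
  mean(wrong)
}

# Evaluate all 3 x 3 = 9 combinations of K and predictor subset.
run_grid <- function(data) {
  grid <- expand.grid(k = k_values, subset = names(subsets),
                      stringsAsFactors = FALSE)
  grid$cv_error <- mapply(function(k, s) cv_error(data, subsets[[s]], k, folds),
                          grid$k, grid$subset)
  grid
}

res_raw <- run_grid(boston)

# (d)/(e) Standardize each predictor to mean 0 and standard deviation 1,
# then repeat the comparison on the scaled data.
boston_std <- boston
boston_std[all_preds] <- as.data.frame(scale(boston[all_preds]))
res_std <- run_grid(boston_std)

# The row with the smallest estimated test error in each setting.
res_raw[which.min(res_raw$cv_error), ]
res_std[which.min(res_std$cv_error), ]
```

Because KNN relies on Euclidean distance, predictors measured on large scales (such as tax or black) dominate the distance on the raw data, which is why the comparison is repeated after standardizing. The winning model can differ between the two runs and between random fold assignments, so the error tables should be read as estimates rather than fixed answers.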

