In this question, we work with another dataset from the textbook "An Introduction to Statistical Learning."

(A) Read the dataset file Credit.csv and assign it to a Pandas DataFrame.

(B) Check out the dataset. The Credit dataset includes a balance column (average credit card debt for a number of individuals) as the target, as well as several features: age, cards (number of credit cards), education (years of education), income (in thousands of dollars), limit (credit limit), marital status, and rating (credit rating).

(C) Generate the feature matrix and target vector (the target is balance in this dataset). Then normalize (scale) the features (note: don't normalize the target vector!). To normalize the data, you can simply use preprocessing.scale(X) from sklearn.

(D) Split the dataset into testing and training sets with the following parameters: test_size=0.24, random_state=4.

(E) Use Linear Regression to train a linear model on the training set. Check the coefficients of the linear regression model. Which feature is the most important? Which feature is the least important?

(F) Predict balance for the users in the testing set. Then compare the predicted balance with the actual balance by calculating and reporting the RMSE (as we saw in lab tutorial 4).

(G) Now use 10-fold Cross-Validation to evaluate the performance of a linear regression in predicting the balance. That is, rather than splitting the dataset into testing and training sets, use Cross-Validation to evaluate the regression performance. What is the RMSE when you use cross-validation?

See the Credit.csv file at the following link:

https://drive.google.com/file/d/1uOfDxpwbnYs2JZpB0TUSyG2HQ8H6I9mL/view
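
A minimal sketch of how parts (C) through (G) might fit together is shown below. It is not a worked answer: no numbers are reported, since they depend on the actual file. The function name evaluate_linear_model, the column-name arguments, and the returned dictionary keys are hypothetical. Two further assumptions are made: any non-numeric column (marital status may be stored as text) is one-hot encoded with pandas.get_dummies before scaling, and the cross-validated RMSE is taken as the mean of the per-fold RMSE values, which may differ from the convention used in the lab tutorial.

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split


def evaluate_linear_model(df, feature_cols, target_col):
    """Hypothetical helper covering parts (C)-(G) for a loaded DataFrame."""
    # (C) Feature matrix and target vector. Text columns are one-hot
    # encoded (an assumption; numeric columns pass through unchanged).
    X = pd.get_dummies(df[feature_cols], drop_first=True, dtype=float)
    y = df[target_col]
    # Scale the features only; the target is left untouched.
    X_scaled = preprocessing.scale(X)

    # (D) Train/test split with the parameters given in the question.
    X_train, X_test, y_train, y_test = train_test_split(
        X_scaled, y, test_size=0.24, random_state=4
    )

    # (E) Fit the model and rank features by absolute coefficient size.
    # On standardized features this is a rough importance measure.
    model = LinearRegression().fit(X_train, y_train)
    coefficients = pd.Series(model.coef_, index=X.columns)
    importance = coefficients.abs().sort_values(ascending=False)

    # (F) RMSE on the held-out test set.
    y_pred = model.predict(X_test)
    rmse_test = float(np.sqrt(mean_squared_error(y_test, y_pred)))

    # (G) 10-fold cross-validation on the full scaled data. scikit-learn
    # returns negative MSE, so flip the sign before taking the root.
    neg_mse = cross_val_score(
        LinearRegression(), X_scaled, y, cv=10,
        scoring="neg_mean_squared_error",
    )
    rmse_cv = float(np.sqrt(-neg_mse).mean())

    return {
        "coefficients": coefficients,
        "importance": importance,
        "rmse_test": rmse_test,
        "rmse_cv": rmse_cv,
    }
```

For parts (A) and (B), the file would first be loaded with pd.read_csv("Credit.csv") and inspected with df.head() and df.info(). The column names passed as feature_cols and target_col should be taken from that inspection rather than assumed. The first entry of result["importance"] is then the feature with the largest absolute coefficient and the last entry the smallest.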
