R language

In this assignment, download train.csv from https://www.kaggle.com/code/gadigevishalsai/credit-score-classification-eda-classification/data (you do NOT need the test.csv file).

1. Coding:

(a) Read the .csv file into R.

(b) Use the str() function to obtain variable types.

(c) Create a new object that is a subset of this dataset. The subset should include only sample units whose Credit_Score variable equals Poor or Good.

(d) Create a new object that contains a random sample of 80% of the sample units from 5.1c.

(e) For a non-numerical variable that you identified in 5.1b, count the number of unique values in this variable.
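
The five coding steps above could be sketched in base R roughly as follows. This is a minimal sketch, not a definitive solution: so that it runs without the download, it first writes a small made-up data frame to a temporary file and reads that back. The toy rows, the `csv_path` variable, and the `Occupation`, `ID`, and `Annual_Income` column names are assumptions for illustration; only `Credit_Score` and its `Poor` and `Good` values come from the assignment text. With the real data, `csv_path` would point to the downloaded train.csv.

```r
# Hypothetical stand-in for train.csv so the sketch runs on its own.
# With the real data, set csv_path to the downloaded file instead.
set.seed(42)
toy <- data.frame(
  ID = sprintf("id_%02d", 1:20),
  Occupation = rep(c("Engineer", "Teacher", "Doctor", "Lawyer"), times = 5),
  Annual_Income = round(runif(20, 20000, 90000), 2),
  Credit_Score = rep(c("Good", "Standard", "Poor", "Good", "Poor"), times = 4),
  stringsAsFactors = FALSE
)
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# (a) Read the .csv file into R.
credit <- read.csv(csv_path, stringsAsFactors = FALSE)

# (b) Inspect the variable types.
str(credit)

# (c) Keep only rows whose Credit_Score is "Poor" or "Good".
credit_pg <- credit[credit$Credit_Score %in% c("Poor", "Good"), ]

# (d) Randomly sample 80% of those rows, without replacement.
n_keep <- floor(0.8 * nrow(credit_pg))
keep_idx <- sample(seq_len(nrow(credit_pg)), size = n_keep)
credit_80 <- credit_pg[keep_idx, ]

# (e) Count the unique values of a non-numerical variable
#     (Occupation is an assumed example column).
n_occupations <- length(unique(credit_80$Occupation))
n_occupations
```

Calling `set.seed()` before `sample()` makes the 80% draw reproducible. Using `%in%` rather than two `==` comparisons joined by `|` also drops rows where `Credit_Score` is `NA`.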

Answer the following questions:

(a) If you were to use all variables in this dataset to explain customers' credit scores (the Credit_Score variable), which variables should NOT be included in the analysis, and why?

(b) Based on the output from 5.1b, which variable(s) have types that went against your expectation (e.g., you thought a variable was categorical, but according to str(), R stored it as numerical)?

(c) From a modeling perspective, what is the potential consequence of treating a numerical variable as a categorical variable? What about treating a categorical variable as a numerical variable?
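
One way to explore question (c) is a small simulation, sketched below under stated assumptions. The data are simulated rather than taken from train.csv, and the names `age`, `group`, `group_effect`, and `y` are hypothetical. The sketch fits linear models with `lm()` to show two effects: wrapping a numerical predictor in `factor()` creates one coefficient per distinct value, and coding a categorical predictor as integers forces its levels onto an equally spaced scale.

```r
set.seed(1)

# Simulated data (hypothetical, not drawn from train.csv).
n <- 200
age <- sample(20:60, n, replace = TRUE)               # truly numerical
group <- sample(c("A", "B", "C"), n, replace = TRUE)  # truly categorical
group_effect <- c(A = 0, B = 5, C = 1)                # effects have no linear order
y <- 0.3 * age + unname(group_effect[group]) + rnorm(n)

# Numerical treated as numerical: an intercept plus one slope.
fit_num <- lm(y ~ age)

# Numerical treated as categorical: one dummy per distinct age value,
# so the model spends many more parameters and cannot interpolate
# to ages that were not observed.
fit_num_as_cat <- lm(y ~ factor(age))

# Categorical treated as categorical: a separate effect per level.
fit_cat <- lm(y ~ age + factor(group))

# Categorical treated as numerical: codes 1, 2, 3 impose the ordering
# A < B < C with equal spacing, which the simulated effects do not follow.
group_code <- as.integer(factor(group))
fit_cat_as_num <- lm(y ~ age + group_code)

# Compare model size and fit.
n_coef_num <- length(coef(fit_num))
n_coef_num_as_cat <- length(coef(fit_num_as_cat))
r2_cat <- summary(fit_cat)$r.squared
r2_cat_as_num <- summary(fit_cat_as_num)$r.squared
c(n_coef_num, n_coef_num_as_cat)
c(r2_cat, r2_cat_as_num)
```

Because the integer-coded model is a restricted version of the factor model, its R-squared can never exceed that of `fit_cat`. The size of the gap depends on how far the true level effects depart from a straight line.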
