Question: Task 2: Import Train.csv into your Jupyter notebook. 2.1 Check the total number of observations and print a few records.

Task 2:
Import Train.csv into your Jupyter notebook.
2.1 Check the total number of observations and print a few records. Please note the variable conversion required in the raw data. Hint: Convert the relevant variables, such as the payment variables (Pay0-Pay6) and the customer-related (demographic) variables, to categorical variables as appropriate.
2.2 Fit a logistic regression after making the dataset balanced. Hint: Use the class weight parameter.
2.3 Remove the variable(s) that would cause multicollinearity. Explicitly state the variable(s) that you are dropping in a markdown cell in your Jupyter notebook. Hint: To remove a variable, use the drop function.
Import Test.csv into your Jupyter notebook.
2.4 Test the model on the test dataset. Please note the variable conversion required in the raw data. Hint: Convert the relevant variables, such as the payment variables (Pay0-Pay6) and the customer-related (demographic) variables, to categorical variables as appropriate.
2.5 Plot the confusion matrix.
2.6 Provide your insights on accuracy, precision and F1 Score in a markdown cell in your Jupyter notebook.
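The preparation steps in 2.1 and 2.3 could be sketched roughly as follows. This is a minimal sketch, not the expected solution: the column names come from the data dictionary that follows, but the helper names (`convert_categoricals`, `load_and_convert`, `highly_correlated`), the grouping of columns into "categorical", and the 0.9 correlation threshold are all assumptions made for illustration. A pairwise correlation screen is used here as one simple way to flag multicollinearity; variance inflation factors would be an alternative.

```python
import pandas as pd

# Column groups taken from the data dictionary; treating exactly these
# columns as categorical is an assumption, not something the task fixes.
PAY_STATUS_COLS = ["PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6"]
DEMOGRAPHIC_COLS = ["SEX", "EDUCATION", "MARRIAGE"]
CATEGORICAL_COLS = PAY_STATUS_COLS + DEMOGRAPHIC_COLS


def convert_categoricals(df):
    """Return a copy with the assumed categorical columns cast to 'category'."""
    out = df.copy()
    for col in CATEGORICAL_COLS:
        if col in out.columns:
            out[col] = out[col].astype("category")
    return out


def load_and_convert(path):
    """Read a CSV file (e.g. Train.csv or Test.csv) and apply the conversion."""
    return convert_categoricals(pd.read_csv(path))


def highly_correlated(df, threshold=0.9, ignore=("ID", "Default")):
    """List numeric columns whose absolute Pearson correlation with an
    earlier, still-kept numeric column exceeds `threshold` (hypothetical cutoff)."""
    numeric = df.select_dtypes(include="number")
    numeric = numeric.drop(columns=[c for c in ignore if c in numeric.columns])
    corr = numeric.corr().abs()
    cols = list(corr.columns)
    to_drop = []
    for i, col in enumerate(cols):
        for earlier in cols[:i]:
            if earlier not in to_drop and corr.loc[earlier, col] > threshold:
                to_drop.append(col)
                break
    return to_drop
```

In a notebook, `train = load_and_convert("Train.csv")` followed by `train.shape[0]` and `train.head()` would cover 2.1, and `train = train.drop(columns=highly_correlated(train))` would cover the drop in 2.3. The columns actually flagged depend on the data, so the markdown cell should name whatever the screen returns.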
ID A numerical value assigned to each credit card customer
LIMIT_BAL The remaining credit a customer can use
SEX 1 = male; 2 = female
EDUCATION 1 = graduate school; 2 = university; 3 = high school; 4 = others; 5 = unknown; 6 = unknown
MARRIAGE 0 = unknown; 1 = married; 2 = single; 3 = others
AGE A customer's age in years
PAY_0 Repayment status in September 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_2 Repayment status in August 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_3 Repayment status in July 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_4 Repayment status in June 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_5 Repayment status in May 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_6 Repayment status in April 2005: 0 or less = paid duly; 1 or greater = payment was delayed
BILL_AMT1 The amount in the bill statement for September 2005 in NT dollar
BILL_AMT2 The amount in the bill statement for August 2005 in NT dollar
BILL_AMT3 The amount in the bill statement for July 2005 in NT dollar
BILL_AMT4 The amount in the bill statement for June 2005 in NT dollar
BILL_AMT5 The amount in the bill statement for May 2005 in NT dollar
BILL_AMT6 The amount in the bill statement for April 2005 in NT dollar
PAY_AMT1 The amount paid in NT dollar in September 2005
PAY_AMT2 The amount paid in NT dollar in August 2005
PAY_AMT3 The amount paid in NT dollar in July 2005
PAY_AMT4 The amount paid in NT dollar in June 2005
PAY_AMT5 The amount paid in NT dollar in May 2005
PAY_AMT6 The amount paid in NT dollar in April 2005
Default Shows whether a customer defaulted on their payment in the following month: 1 = yes; 0 = no
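With the columns described above, the modelling steps in 2.2 and 2.4 to 2.6 might look something like the sketch below. It assumes the categorical conversion has already been applied to both datasets and that `Default` is the target. The helper names (`encode`, `fit_balanced_logit`, `evaluate`) are hypothetical, and the choices to standardise features and to raise `max_iter` to 1000 are assumptions made so the solver converges, not requirements of the task.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def encode(train_X, test_X):
    """One-hot encode categorical columns and align the test columns to train.

    The reference level is dropped on the training side only; reindexing the
    test frame to the training columns then discards the same level there.
    """
    train_enc = pd.get_dummies(train_X, drop_first=True, dtype=float)
    test_enc = pd.get_dummies(test_X, dtype=float)
    test_enc = test_enc.reindex(columns=train_enc.columns, fill_value=0.0)
    return train_enc, test_enc


def fit_balanced_logit(X, y):
    """Fit a logistic regression that reweights classes inversely to frequency."""
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )
    return model.fit(X, y)


def evaluate(model, X, y):
    """Return the confusion matrix plus accuracy, precision and F1 on (X, y)."""
    pred = model.predict(X)
    return {
        "confusion_matrix": confusion_matrix(y, pred, labels=[0, 1]),
        "accuracy": accuracy_score(y, pred),
        "precision": precision_score(y, pred, zero_division=0),
        "f1": f1_score(y, pred, zero_division=0),
    }

# Possible notebook usage, assuming `train` and `test` are already converted:
#   X_tr, X_te = encode(train.drop(columns=["ID", "Default"]),
#                       test.drop(columns=["ID", "Default"]))
#   model = fit_balanced_logit(X_tr, train["Default"])
#   results = evaluate(model, X_te, test["Default"])
#   # For 2.5 (needs matplotlib):
#   # from sklearn.metrics import ConfusionMatrixDisplay
#   # ConfusionMatrixDisplay.from_predictions(test["Default"], model.predict(X_te))
```

When writing up 2.6, it may help to remember that `class_weight="balanced"` typically trades some accuracy and precision for better recall on the minority (default) class, so accuracy alone can be a misleading summary on imbalanced data.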
