Question: Assignment Questions: 1. Explore the data: What is the proportion of "Good" to "Bad" cases? Are there any missing values, and how do you handle these?


Assignment Questions:

1. Explore the data: What is the proportion of "Good" to "Bad" cases? Are there any missing values, and how do you handle these? Obtain descriptions of the predictor (independent) variables: mean, standard deviations, etc. for real-valued attributes, and frequencies of different category values. Examine variable plots. Do you notice "bad" credit cases to be more prevalent in certain value ranges of specific variables, and is this what one might expect (or is it more of a surprise)? What are certain interesting variables and relationships (why "interesting")? From the data exploration, which variables do you think will be most relevant for the outcome of interest, and why?

2. We will first focus on a descriptive model, i.e. assume we are not interested in prediction.
(a) Develop a decision tree on the full data. What decision tree node parameters do you use to get a good model (and why)?
(b) Which variables are important to differentiate "good" from "bad" cases, and how do you determine these? Does this match your expectations (from your response in Question 1)?
(c) What levels of accuracy/error are obtained? What is the accuracy on the "good" and "bad" cases? Obtain and interpret the lift chart. Do you think this is a reliable (robust?) description, and why?

3. We next consider developing a model for prediction. For this, we should divide the data into Training and Validation sets. Consider a partition of the data into 50% for Training and 50% for Test.
(a) Develop decision trees using the rpart package. What model performance do you obtain? Consider performance based on overall accuracy/error and on the "good" and "bad" credit cases; explain which performance measures, like recall, precision, sensitivity, etc., you use and why. Also consider lift, ROC and AUC. Is the model reliable (why or why not)?

In developing the models above, change decision tree options as you find reasonable (for example, complexity parameter (cp), the minimum number of cases for split and at a leaf
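The questions above ask for accuracy on the "good" and "bad" cases separately, and for measures such as recall, precision and sensitivity. As a rough illustration of how those measures relate to each other, here is a minimal sketch in plain Python. The assignment itself calls for R's rpart; Python is used here only to make the definitions concrete. The function name class_metrics, the choice of "bad" as the positive class, and the ten labels are all assumptions made up for this example and are not taken from the credit data.

```python
from collections import Counter


def class_metrics(actual, predicted, positive="bad"):
    """Confusion-matrix metrics, treating `positive` as the class of interest.

    Hypothetical helper for illustration; not part of any library.
    """
    pairs = Counter(zip(actual, predicted))
    tp = sum(n for (a, p), n in pairs.items() if a == positive and p == positive)
    fn = sum(n for (a, p), n in pairs.items() if a == positive and p != positive)
    fp = sum(n for (a, p), n in pairs.items() if a != positive and p == positive)
    tn = sum(n for (a, p), n in pairs.items() if a != positive and p != positive)
    total = tp + fn + fp + tn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        # Recall is the same quantity as sensitivity: share of actual
        # positives that the model catches.
        "recall": tp / (tp + fn) if tp + fn else nan,
        # Precision: share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if tp + fp else nan,
        # Specificity: share of actual negatives predicted as negative,
        # i.e. the accuracy on the "good" cases when "bad" is positive.
        "specificity": tn / (tn + fp) if tn + fp else nan,
    }


# Made-up labels for illustration only (not from the credit data set).
actual = ["good"] * 6 + ["bad"] * 4
predicted = ["good"] * 5 + ["bad"] + ["bad"] * 2 + ["good"] * 2

metrics = class_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On these made-up labels the overall accuracy is 0.7, yet only half of the "bad" cases are caught (recall 0.5). That gap is why the per-class measures are worth reporting alongside overall accuracy when the classes are unbalanced.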
