Question: Use the credit default data located in the homework link to build and assess decision trees. (Modules/datasets also includes the CSV file CreditCarddefault_15K_spring2021.csv and a description of the data, CreditCardDefaultDescription.docx.) A few requirements/hints:
- You will need to convert the file into a SAS Dataset first. It looks like this:
- When importing the dataset into your SAS Enterprise Miner project, verify the metadata and change it where needed so that it makes the most sense. For instance, ID should be nominal, Education should be ordinal, Pay_0, ..., Pay_6 should be ordinal, marriage should be nominal, and default_payment_next_month should be binary with a role of "Target."
- Remember to partition the data into a training set and a validation set, each accounting for 50% of the instances.
- Create a maximal tree and a pruned tree. Attach screenshots of the two trees here. Also, answer: What is a maximal tree? What is a pruned tree? Briefly describe the differences between the two trees, such as the number of leaf nodes and which branches are replaced with a leaf node (it is not necessary to list every branch replaced; two examples are sufficient).
- Attach the two Subtree Assessment Plots (one for the maximal tree, one for the pruned tree) here.
- Interpretation of the Subtree Assessment Plot:
  - What does a subtree assessment plot show in general?
  - Should we read the training line or the validation line? Why?
  - How is the optimal pruned tree picked?
  - What is the misclassification rate of the optimal pruned tree? What does the number mean?
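
As a rough illustration of the last two points, here is a minimal Python sketch of how a misclassification rate is computed and how a pruned subtree could be picked from a subtree sequence. It assumes the selection rule is "lowest validation misclassification rate, ties going to the subtree with fewer leaves"; the exact rule in SAS Enterprise Miner depends on the Subtree settings of the Decision Tree node. The function names and the numbers are hypothetical and are not taken from the credit default data.

```python
from typing import List, Sequence, Tuple


def misclassification_rate(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of cases whose predicted class differs from the actual class."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


def pick_subtree(candidates: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Return the (leaves, validation_rate) pair with the lowest validation rate.

    Assumed tie-break: prefer the subtree with fewer leaves.
    """
    return min(candidates, key=lambda c: (c[1], c[0]))


# Hypothetical validation results, NOT from the credit default data:
# (number of leaves, validation misclassification rate)
subtrees = [(1, 0.25), (2, 0.21), (4, 0.19), (7, 0.19), (12, 0.20), (20, 0.22)]

best_leaves, best_rate = pick_subtree(subtrees)

# Toy example: one wrong prediction out of five cases.
toy_rate = misclassification_rate([1, 0, 0, 1, 0], [1, 0, 1, 1, 0])
```

In this toy sequence the 4-leaf and 7-leaf subtrees tie on the validation data, so the smaller one is kept. The training rate is left out of the selection on purpose: it generally keeps improving as leaves are added, so it cannot signal overfitting.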
