Refer to the scenario described in Problem 13 and the file Cellphone. In XLMiner's Partition with Oversampling procedure, partition the data so that 50 percent of the observations in the training set are successes (churners) and 40 percent of the validation data is taken away as test data. Fit a classification tree using Churn as the output variable and all the other variables as input variables. In Step 2 of XLMiner's Classification Tree procedure, be sure to Normalize input data, and set the Minimum #records in a terminal node to 1. In Step 3 of XLMiner's Classification Tree procedure, set the maximum number of levels to 7. Generate the Full tree, Best pruned tree, and Minimum error tree. Generate lift charts for both the validation data and test data.
a. Why is partitioning with oversampling advised in this case?
b. Interpret the set of rules implied by the best pruned tree that characterize churners.
c. In the CT_Output1 sheet, why is the overall error rate of the full tree 0 percent? Explain why this does not necessarily indicate that the full tree should be used to classify future observations, and describe the role of the best pruned tree.
d. For the default cutoff value of 0.5, what are the overall error rate, Class 1 error rate, and Class 0 error rate of the best pruned tree on the test data?
e. Examine the decile-wise lift chart for the best pruned tree on the test data. What is the first decile lift? Interpret this value.
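
As a check on the quantities requested in parts d and e, the following minimal Python sketch shows how the three error rates and the first decile lift can be computed from a list of actual classes and predicted probabilities. It is not XLMiner output and does not use the Cellphone file: the function names `error_rates` and `first_decile_lift` and the 20-record data set are hypothetical, and the sketch assumes an observation is classified as Class 1 when its predicted probability is at least the cutoff.

```python
def error_rates(actual, predicted_prob, cutoff=0.5):
    """Return (overall, Class 1, Class 0) error rates at the given cutoff.

    Assumption: a record is classified as Class 1 when its predicted
    probability is greater than or equal to the cutoff.
    """
    predicted = [1 if p >= cutoff else 0 for p in predicted_prob]
    n1 = sum(1 for a in actual if a == 1)   # actual Class 1 records
    n0 = len(actual) - n1                   # actual Class 0 records
    # Class 1 records misclassified as Class 0 (false negatives)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    # Class 0 records misclassified as Class 1 (false positives)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    return (fn + fp) / len(actual), fn / n1, fp / n0


def first_decile_lift(actual, predicted_prob):
    """Ratio of the Class 1 rate in the top 10 percent of records
    (ranked by predicted probability) to the overall Class 1 rate."""
    ranked = sorted(zip(predicted_prob, actual),
                    key=lambda pair: pair[0], reverse=True)
    top_n = max(1, len(ranked) // 10)
    top_rate = sum(a for _, a in ranked[:top_n]) / top_n
    overall_rate = sum(actual) / len(actual)
    return top_rate / overall_rate


# Hypothetical toy data: 20 records, 4 of them actual churners (Class 1).
actual = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0,
          0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
prob = [0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.30, 0.20, 0.20, 0.10,
        0.10, 0.10, 0.10, 0.10, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]

overall, class1, class0 = error_rates(actual, prob, cutoff=0.5)
lift = first_decile_lift(actual, prob)
print(overall, class1, class0, lift)
```

On this toy data, one churner and one non-churner are misclassified, so the overall error rate is 2/20, the Class 1 error rate is 1/4, and the Class 0 error rate is 1/16. Both records in the top decile are churners, while the overall churn rate is 4/20, giving a first decile lift of 5: the model identifies five times as many churners in its top-ranked 10 percent as random selection would be expected to find.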