Question: An analyst working for a large retailer is given dataset containing 1,000 customer data records for analysis. The analyst is tasked to predict if customers
An analyst working for a large retailer is given dataset containing 1,000 customer data records for analysis. The analyst is tasked to predict if customers are likely to churn (i.e., move to a competitor) or remain loyal (i.e. continue as customer of the retailer). The Data Dictionary for the dataset is as follows.
| Attribute | Description | Datatype |
| Customer ID | Unique identifier for each customer | Polynomial |
| Last Transaction Date | Most recent date on which the customer purchased an item from the retailer | Date |
| Discount | Percentage discount given at the time of purchase (a value from 0.0 to 1.0 indicating 0-100% respectively) | Real |
| Metro Region | Metropolitan region in which the customer is located (East, West, North, South) | Polynomial |
| Churn | Indicates if a customer has churned (Churn = Yes) or remains loyal (Churn = No). | Polynomial |
The analyst utilised RapidMiner to analyse the dataset and obtained the following decision tree and confusion matrix to classify customer churn. K-fold cross validation (with 10 folds) was used to evaluate and validate the model.
Accuracy: 81.80% +/- 3.85% (micro average: 81.80%)
| True No | True Yes | Class precision | |
| Predicted No | 618 | 89 | 87.41% |
| Predicted Yes | 93 | 200 | 68.26% |
| Class recall | 86.92% | 69.20% |
Based on the decision tree, explain what actions should be taken to prevent customers from churning? Write your answer for a senior business manager or similar non-technical reader.
Question 15 options:
| Question 16 (4 points) Interpret the results in the above confusion matrix. Briefly explain why there is a standard deviation of 3.85% on the accuracy evaluation of the model. Question 16 options:
|
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
