Question:
Design a pipeline (i.e., a standard procedure covering preliminary data analysis, selection of ML methods, training and test data splitting, metric selection, and evaluation) that uses ML to solve a predefined task, and write a report on it.
Machine Learning Pipeline
Design a pipeline, an evaluation strategy, and a set of experiments to determine the best machine learning algorithm and its parameters, based on empirical evaluation on a dataset (to do this, you may compare several algorithms if needed). Data used for this task: bank churn dataset.
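One building block of such an evaluation strategy is the train/test split. The sketch below shows a stratified split, which keeps the share of each class roughly equal in both parts. It uses only the Python standard library; the function name `stratified_split`, the 20% test fraction, and the toy labels are illustrative assumptions, not part of the task statement.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling each class separately so that
    class proportions stay approximately equal in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment within each class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels (made up for illustration): 1 = account closed, 0 = account open.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)
```

The same idea extends to stratified k-fold cross-validation when comparing algorithms: each candidate model is trained and scored on identical folds, so differences in scores reflect the models rather than the split.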
A business manager of a consumer credit card portfolio is facing the problem of customer attrition. They want to analyze the data to find out why customers leave and use those findings to predict which customers are likely to drop off.
The dataset provides multiple attributes describing each customer's account activity, along with demographics.
Features:
- Client number: Unique identifier for the customer holding the account.
- Demographic variable: Customer's Age in Years.
- Demographic variable: Gender (M = Male, F = Female).
- Demographic variable: Number of dependents.
- Demographic variable: Educational Qualification of the account holder (example: high school, college graduate, etc.).
- Demographic variable: Marital_Status (Married, Single, Divorced, Unknown).
- Demographic variable: Annual Income Category of the account holder (< $40K, $40K - $60K, $60K - $80K, $80K - $120K, > $120K, Unknown).
- Product variable: Type of Card (Blue, Silver, Gold, Platinum).
- Months on book: Period of relationship with the bank.
- Total relationship count: Total number of products held by the customer.
- Months inactive: Number of months inactive in the last 12 months.
- Number of Contacts in the last 12 months.
- Credit Limit on the Credit Card.
- Total Revolving Balance on the Credit Card.
- Open to Buy Credit Line (Average of last 12 months).
- Change in Transaction Amount (Q4 over Q1).
- Total Transaction Amount (Last 12 months).
- Total Transaction Count (Last 12 months).
- Change in Transaction Count (Q4 over Q1).
- Average Card Utilization Ratio.
- Attrition Flag: Internal event (customer activity) variable; 1 if the account is closed, else 0.
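Several of the features above are categorical and need a numeric encoding before most ML methods can use them. The following is a minimal sketch of one possible encoding, using only the Python standard library. The dictionary keys (`age`, `gender`, `income_category`, `card_type`, `marital_status`), the exact category strings, and the example record are assumptions for illustration and should be matched to the actual file's header and values. The client number is deliberately left out because an identifier carries no predictive signal.

```python
# Assumed category orderings; the strings must match those in the data file.
INCOME_ORDER = ["< $40K", "$40K - $60K", "$60K - $80K", "$80K - $120K", "> $120K"]
CARD_ORDER = ["Blue", "Silver", "Gold", "Platinum"]
MARITAL_LEVELS = ["Married", "Single", "Divorced", "Unknown"]


def encode_customer(row):
    """Turn one raw customer record (dict of strings) into numeric features."""
    features = {
        "age": float(row["age"]),
        "is_male": 1.0 if row["gender"] == "M" else 0.0,
    }

    # Income has a natural order, so use an ordinal rank.
    # "Unknown" gets rank -1 plus a separate indicator flag.
    income = row["income_category"]
    known = income in INCOME_ORDER
    features["income_rank"] = float(INCOME_ORDER.index(income)) if known else -1.0
    features["income_unknown"] = 0.0 if known else 1.0

    # Card tiers are also ordered (Blue lowest, Platinum highest).
    features["card_rank"] = float(CARD_ORDER.index(row["card_type"]))

    # Marital status has no order, so one-hot encode it.
    for level in MARITAL_LEVELS:
        features["marital_" + level.lower()] = (
            1.0 if row["marital_status"] == level else 0.0
        )
    return features


# Made-up record for illustration only.
example = {
    "client_number": "000000001",
    "age": "45",
    "gender": "F",
    "income_category": "Unknown",
    "card_type": "Gold",
    "marital_status": "Single",
}
encoded = encode_customer(example)
```

Whether to treat income and card type as ordinal or one-hot is a design choice worth testing in the experiments; tree-based models tend to tolerate either, while linear models can be sensitive to an imposed ordering.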
Inspiration
Predict whether a customer is likely to cancel their account.
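For this prediction task, metric selection matters because customers who close their accounts are typically a minority, so accuracy alone can look good for a useless model. The sketch below computes accuracy, precision, recall, and F1 for the attrition class (label 1) in plain Python. The function name `churn_metrics` and the toy labels are assumptions for illustration, not results from the dataset.

```python
def churn_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 with label 1 (account closed) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (made up): 10 customers closed their account, 90 did not.
y_true = [1] * 10 + [0] * 90

# A model that always predicts "stays" is accurate on most rows
# but never identifies a single churner.
always_stay = [0] * 100
baseline = churn_metrics(y_true, always_stay)
```

Here `baseline["accuracy"]` is high while `baseline["recall"]` is zero, which is why recall, precision, and F1 on the attrition class are more informative for comparing candidate models than accuracy.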