Question: You will be using Churn-Modeling.csv for this one. Here is a description of the variables. Some of the Variables are self-explanatory Variable Description Surname Last

You will be using Churn-Modeling.csv for this one. Here is a description of the variables. Some of the Variables are self-explanatory

Variable	Description
Surname	Last Name of the customer
Tenure	Number of Years the customer has been with the bank
Balance	Account Balance in USD
NumOfProducts	Number of Banks products to which the customer subscribed
HasCrCard	1 if the customer has a credit card with the bank 0 If the customer doesnt have a card with the bank
IsActiveMember	Whether customer made a transaction with the bank in the last 1 year
EstimatedSalary	The annual salary of the customer in USD
Exited	1 If the customer decided to quit the bank 0 If the customer decided to stay with the bank

Load and attach the Churn-Modeling.csv dataset into a variable mydata. Conduct an exploration of what variables it contains and what the values look like. How many rows does the dataset have? Convert the variables HasCrCard, IsActiveMember ,Gender and Exited into factors. What is the Surname of the customer with the highest EstimatedSalary? (1 Point)

Use CreditScore, Age and Balance to perform K-means clustering on the trees. You need to use the preProcess() and predict() functions in caret library to normalize the data in the range 0 to 1. Use K=3, 4 and 5. Which one is better (Use the ratio of inter cluster to intra cluster distance for this one)? (2 Points)

Plot the clusters using these variables for K=3, 4 and 5 i.e., three separate plots (Using CreditScore and Balance on X and Y axes and coloring them by the cluster number to which they belong). Which value of K do you think is better now? Use the Silhouette Method for this one (1.5 Points)

Use the table() command to tabulate the relationship between cluster assignment and the variable Exited for the best choice of K from the previous question. Do you see any clusters that contain predominantly Exited clusters. If so, describe the characteristics of customers in that cluster using the values of the variables used in clustering (Hint : Compare the centroid values of this cluster to the other clusters). (1.5 Points)

Use the variables Tenure, NumOfProducts and EstimatedSalary this time to run a k-means clustering algorithm (Use the scale() function instead of preprocess() and predict() this time). Run it for K=3,4, 5 and 6. Use the elbow method to determine the appropriate number of clusters this time (3 Points)

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Perform an outlier analysis. Use the boxblot technique to present your results. Comment on the results and indicate your decision to leave or eliminate the identified outlier. HSRS HBAT Case...

Perform a missing value analysis of the results. Present the table with the results and comment on them. HSRS HBAT Case description Case Study: HBAT Industries' Background HBAT industries (HBAT), is...

I really need help with this program!!! Thanks Create a simple food delivery service soft ware system that manages customer call-in the orders for combo meals via a first in-first out method Your...

In java please.. 1. Write a project and package that contains definitions for the Pet class and the Procedure class (see below for specifications). 2. Write a project and package that unit tests the...

You must help one of your new franchise clients write a program for taking orders for their new startup ingredient delivery service. Your client needs help figuring out the total quantities of each...

Please I need help Immediately. Write a Java program for below assignment. Thank you so much. Part 1 - Class Specifications Class Studio Store in project/package: .bcs345.hwk.movies. business Member...

1. Import HBAT200 data into RStudio 2. Build an appropriate regression model that will explain "customer satisfaction" (Please note that X19-X23 are outcome variable, so do NOT use them as your...

Preview File Edit it View v Go Tools Window Help 5.75 GB 91% Mon 10:09 AM Q Joseph F., Jr. Hair, Mary F. Wolfinbarger, David J. Ortinau - Essentials of Marketing Research-McGraw-Hill (2016).pdf (page...

Bookbinders Book Club Introduction About 50,000 new titles, including new editions, are published in the United States each year, giving rise to a $20+ billion book publishing industry. About 10...

What are the investment proportions in the minimum-variance portfolio of the two risky funds, and what is the expected value and standard deviation of its rate of return? A pension fund manager is...

Find out the eligible amount for deduction towards Gross salary in the following situation for the assessment year 2015-16. (a) Mr. Sujeet Sanwlani is a manager in a company. The company deducted...

The statement cin > > num; is equivalent to the statement num > > cin True False

CT Corp Comprehensive Question Canadian Tire Corporation, Limited ( Canadian Tire ) is a family of companies that includes a retail segment and a financial services division, among others. The retail...

3. Explain the forces that influence how people handle conflict

2. Describe ways in which group size, social relationships, and communication networks affect group communication

1. List the characteristics and types of groups and explain how groups develop