Question:
USE R Code Only
DATA FILE URL: https://drive.google.com/file/d/1lP1j52IJJvyE4y13B8YVx6BXvDqiY9fD/view?usp=sharing
This file contains data from a sample of 7043 subscribers of telephone and/or Internet services for a large telco. We want to create three separate models to understand the predictors of churn of (i) subscribers of telephone services, (ii) subscribers of Internet services, and (iii) people who subscribe to both services. Analyze the data carefully (data definitions are provided in the second worksheet of the Excel file). Submit your results in a nicely formatted Word or PDF file, along with your R code file.
1. Clean, process, and partition data as necessary, using appropriate R code.
1.5 Create a table of all variables that shows the corresponding alternative hypothesis for each variable, with a one-sentence rationale for each hypothesis. Be sure to include the expected sign (positive or negative) for each hypothesis.
2. What predictors do you think contribute to the churn of (i) only telephone customers, (ii) only Internet service customers, and (iii) customers who subscribe to both phone and Internet services? List the reasoning for your answer. No points without reasoning.
3. Create training and test data sets with a 75:25 split using a random seed of 1024. Train logit models with the variables you identified in (b) and the training data. Combine the three model outputs using stargazer.
4. What are the top three predictors of churn of (i) only telephone customers, (ii) only Internet service customers, and (iii) customers who subscribe to both phone and Internet services? Explain, using marginal effects, how much each predictor contributes to churn probability. (3 points)
5. Use TWO metrics to indicate which of the three models in Question 4 has the best fit with the training data set and which has the worst fit. How do you know?
6. Fit your models using test data, and compute recall, precision, F1-score, and AUC values for each of your three models. Which model worked best for your classification analysis?
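The workflow in Questions 3 to 6 can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not a solution to the assignment: it runs on simulated stand-in data rather than the actual data file, and every column name (`Churn`, `tenure`, `MonthlyCharges`, `Phone`, `Internet`), the helper `fit_segment`, the two predictors, and the 0.5 classification cutoff are hypothetical choices made for the example. AIC and McFadden's pseudo-R-squared are used here as the two fit metrics, and marginal effects are computed as average marginal effects.

```r
# Sketch only: simulated stand-in data, NOT the assignment file.
# All column names below are hypothetical.
set.seed(1024)
n <- 2000
toy <- data.frame(
  tenure         = sample(1:72, n, replace = TRUE),
  MonthlyCharges = round(runif(n, 20, 120), 2),
  Phone          = rbinom(n, 1, 0.8),
  Internet       = rbinom(n, 1, 0.8)
)
toy <- toy[toy$Phone + toy$Internet > 0, ]        # drop rows with neither service
lin <- -0.5 - 0.04 * toy$tenure + 0.02 * toy$MonthlyCharges
toy$Churn <- rbinom(nrow(toy), 1, plogis(lin))    # simulated 0/1 churn outcome

# Partition subscribers into the three groups the question asks about
toy$segment <- ifelse(toy$Phone == 1 & toy$Internet == 1, "both",
                      ifelse(toy$Phone == 1, "phone_only", "internet_only"))

fit_segment <- function(d) {
  set.seed(1024)                                  # seed required for the split
  idx   <- sample(seq_len(nrow(d)), size = floor(0.75 * nrow(d)))
  train <- d[idx, ]
  test  <- d[-idx, ]

  model <- glm(Churn ~ tenure + MonthlyCharges, data = train,
               family = binomial(link = "logit"))

  # Average marginal effect of a continuous predictor: beta_j * mean(p * (1 - p))
  p_tr <- fitted(model)
  ame  <- coef(model)[-1] * mean(p_tr * (1 - p_tr))

  # Two training-fit metrics: AIC (lower is better), McFadden R2 (higher is better)
  fit <- c(AIC = AIC(model),
           McFadden = 1 - model$deviance / model$null.deviance)

  # Test-set classification metrics at an assumed 0.5 cutoff
  p_te <- predict(model, newdata = test, type = "response")
  pred <- as.integer(p_te >= 0.5)
  tp <- sum(pred == 1 & test$Churn == 1)
  fp <- sum(pred == 1 & test$Churn == 0)
  fn <- sum(pred == 0 & test$Churn == 1)
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  f1        <- 2 * precision * recall / (precision + recall)

  # AUC from the rank-sum (Mann-Whitney) identity
  r  <- rank(p_te)
  n1 <- sum(test$Churn == 1)
  n0 <- sum(test$Churn == 0)
  auc <- (sum(r[test$Churn == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

  list(model = model, ame = ame, fit = fit,
       test = c(Precision = precision, Recall = recall, F1 = f1, AUC = auc))
}

results <- lapply(split(toy, toy$segment), fit_segment)

# One row per segment: training-fit metrics followed by test metrics
print(round(t(sapply(results, function(r) c(r$fit, r$test))), 3))
print(round(sapply(results, function(r) r$ame), 4))   # average marginal effects

# Side-by-side coefficient table, only if the stargazer package is installed
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(results$phone_only$model, results$internet_only$model,
                       results$both$model, type = "text")
}
```

With the real data, the simulated `toy` data frame would be replaced by the cleaned subscriber file, and the model formula by the predictors chosen for each segment. Categorical predictors need discrete-change marginal effects rather than the derivative formula used above.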
