Question:
---
output:
  pdf_document: default
  html_document: default
---

---
title: 'Universal Bank Personal Loan Acceptance'
subtitle: ' BUA Assignment'
author:
- FirstName LastName
date: "`r format(Sys.time(), '%d %B %Y')`"
output: pdf_document
---
Please complete the following tasks and submit your work to earn bonus credits toward your final class grade.
*1. Import the UniversalBank dataset and remove the `PersonalLoan` attribute to create a new dataset `universalbank.clusting`.*

```{r, message=FALSE, warning=FALSE}
# install.packages("dplyr")
library(dplyr)
universalbank <- read.csv("UniversalBank.csv")
universalbank.clusting <- universalbank %>% select(-PersonalLoan)
```
*2. Using my solution file for the Module 2: Home Equity Loan Customer Profiling assignment as a reference, standardize the `Age`, `Experience`, `Income`, `CCAvg`, and `Mortgage` attributes in the dataset `universalbank.clusting` in the following chunk to create a new dataset `universalbank.clusting.std`. Show all your R code in the submission.*

```{r, message=FALSE, warning=FALSE}
```
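One possible approach to the standardization in Task 2, sketched on a small synthetic data frame rather than the real file. The data frame `toy` and all of its values are hypothetical stand-ins for `universalbank.clusting`, and the Module 2 solution file may use a different function. The sketch relies on base R's `scale()`, which subtracts each column's mean and divides by its standard deviation.

```r
# Toy stand-in for universalbank.clusting (hypothetical values, assumed column names)
set.seed(1)
n <- 200
toy <- data.frame(Age = round(runif(n, 23, 67)),
                  Income = round(rexp(n, 1 / 70) + 8),
                  CCAvg = round(rexp(n, 1 / 2), 1),
                  Mortgage = round(ifelse(runif(n) < 0.7, 0, runif(n, 75, 600))))
toy$Experience <- pmax(toy$Age - 25, 0)

num_cols <- c("Age", "Experience", "Income", "CCAvg", "Mortgage")

# scale() centers each column at 0 and rescales it to standard deviation 1
toy.std <- toy
toy.std[num_cols] <- as.data.frame(scale(toy[num_cols]))

# Sanity check: column means are ~0 and standard deviations are 1
round(colMeans(toy.std[num_cols]), 10)
sapply(toy.std[num_cols], sd)
```

Standardizing matters here because the clustering tasks are distance based, so an attribute measured on a large scale would otherwise dominate the others.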
*3. In the following chunk, create a scatterplot matrix showing the pairwise relationships among `Age`, `Experience`, `Income`, `CCAvg`, and `Mortgage`. Show all your R code in the submission.*

```{r, message=FALSE, warning=FALSE}
```

*Problem: What is the relationship between `Age` and `Experience`?*

**Your answer: ( )**
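A minimal sketch of a scatterplot matrix for Task 3, using base R's `pairs()` on synthetic data. The data frame `toy` is hypothetical, and its `Experience` column is derived from `Age` by construction, so the pattern it shows says nothing about the real dataset.

```r
# Toy stand-in for the five numeric attributes (hypothetical values)
set.seed(1)
n <- 200
toy <- data.frame(Age = round(runif(n, 23, 67)),
                  Income = round(rexp(n, 1 / 70) + 8),
                  CCAvg = round(rexp(n, 1 / 2), 1),
                  Mortgage = round(ifelse(runif(n) < 0.7, 0, runif(n, 75, 600))))
# Experience is built from Age, so these two are correlated by construction
toy$Experience <- pmax(toy$Age - 25 + round(rnorm(n)), 0)

num_cols <- c("Age", "Experience", "Income", "CCAvg", "Mortgage")

# One panel per pair of attributes
pairs(toy[num_cols], pch = 19, cex = 0.4, main = "Scatterplot matrix (toy data)")

# Pairwise Pearson correlations as a numeric companion to the plot
cor_mat <- round(cor(toy[num_cols]), 2)
cor_mat
```

Reading the plot together with the correlation matrix makes it easier to tell a strong linear relationship from an apparent one caused by overplotting.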
*4. In the following chunk, apply the elbow method to determine the optimal number of clusters, k, for conducting a clustering analysis on the `universalbank.clusting.std` dataset using the standardized attributes. Show all your R code in the submission.*

```{r, message=FALSE, warning=FALSE}
```
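The elbow method in Task 4 can be sketched as follows, assuming k-means as the clustering algorithm (the task does not name one). The data are synthetic, and the range `1:10` for k is an arbitrary choice for illustration.

```r
# Toy stand-in for the standardized attributes (hypothetical values)
set.seed(1)
n <- 200
toy <- data.frame(Age = round(runif(n, 23, 67)),
                  Income = round(rexp(n, 1 / 70) + 8),
                  CCAvg = round(rexp(n, 1 / 2), 1),
                  Mortgage = round(ifelse(runif(n) < 0.7, 0, runif(n, 75, 600))))
X <- scale(toy)

# Total within-cluster sum of squares for each candidate k
ks <- 1:10
wss <- sapply(ks, function(k) {
  kmeans(X, centers = k, nstart = 10, iter.max = 50)$tot.withinss
})

# The "elbow" is the k after which the curve flattens noticeably
plot(ks, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

The within-cluster sum of squares always falls as k grows, so the choice rests on where the improvement levels off rather than on the minimum.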
*5. Perform the clustering analysis based on the optimal k value in the following chunk. Show all your R code in the submission.*

```{r, message=FALSE, warning=FALSE}
```
*Problem: Based on the above R output, describe interesting features of some of the clusters produced by the analysis.*

**Your answer: ( )**
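A hedged sketch of the clustering step in Task 5, again assuming k-means. The value `k <- 3` is only a placeholder for whatever the elbow plot suggests, and `toy` is synthetic. Profiling the clusters on the original (unstandardized) scale usually makes them easier to describe.

```r
# Toy stand-in for the clustering attributes (hypothetical values)
set.seed(1)
n <- 200
toy <- data.frame(Age = round(runif(n, 23, 67)),
                  Income = round(rexp(n, 1 / 70) + 8),
                  CCAvg = round(rexp(n, 1 / 2), 1),
                  Mortgage = round(ifelse(runif(n) < 0.7, 0, runif(n, 75, 600))))
X <- scale(toy)

k <- 3  # placeholder: substitute the k chosen from the elbow plot

# Several random starts reduce the risk of a poor local optimum
set.seed(123)
km <- kmeans(X, centers = k, nstart = 25, iter.max = 50)

# Cluster sizes
table(km$cluster)

# Mean of each attribute per cluster, on the original scale
profile <- aggregate(toy, by = list(cluster = km$cluster), FUN = mean)
profile
```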
*6. Split the `universalbank` dataset into `training` and `test` sets using the stratified sampling method and `PersonalLoan` as the target attribute. Show all your R code in the submission.*

```{r, message=FALSE, warning=FALSE}
```

*7. Use my solution file for Module 3: Home Equity Loan Customer PreScreen Scoring assignment as a reference to build the "best" decision tree model to predict the target attribute `PersonalLoan` using the `training` and `test` subsets. Show all your R code in the submission. Also include the visualization of the model you build.*
*Note: Don't use `ID` as a predictor attribute.*

```{r, message=FALSE, warning=FALSE}
```
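Tasks 6 and 7 can be sketched together on synthetic data. Everything here is an assumption made for illustration: the `toy` data frame, the 70/30 split ratio, the use of the `rpart` package, and pruning at the lowest cross-validated error as the meaning of "best". The Module 3 solution file may define each of these differently.

```r
library(rpart)  # recommended package that ships with standard R installations

# Toy stand-in for universalbank (hypothetical values; only columns named in the tasks)
set.seed(1)
n <- 500
toy <- data.frame(ID = seq_len(n),
                  Age = round(runif(n, 23, 67)),
                  Income = round(rexp(n, 1 / 70) + 8),
                  CCAvg = round(rexp(n, 1 / 2), 1),
                  Mortgage = round(ifelse(runif(n) < 0.7, 0, runif(n, 75, 600))))
toy$PersonalLoan <- factor(ifelse(runif(n) < plogis(-8 + 0.08 * toy$Income), "yes", "no"))

# Stratified split: sample 70% of the rows within each class (70/30 is an assumption)
idx_by_class <- split(seq_len(n), toy$PersonalLoan)
train_idx <- unlist(lapply(idx_by_class, function(ix) {
  ix[sample.int(length(ix), size = round(0.7 * length(ix)))]
}), use.names = FALSE)
training <- toy[train_idx, ]
test <- toy[-train_idx, ]

# Class proportions should be nearly identical in both subsets
prop.table(table(training$PersonalLoan))
prop.table(table(test$PersonalLoan))

# Classification tree; ID is left out of the formula on purpose
tree <- rpart(PersonalLoan ~ Age + Income + CCAvg + Mortgage,
              data = training, method = "class")

# Prune at the complexity parameter with the lowest cross-validated error
best_cp <- tree$cptable[which.min(tree$cptable[, "xerror"]), "CP"]
pruned <- prune(tree, cp = best_cp)

# Draw the tree (skipped if pruning leaves only the root node)
if (nrow(pruned$frame) > 1) {
  plot(pruned, margin = 0.1)
  text(pruned, use.n = TRUE, cex = 0.8)
}

# Confusion matrix and accuracy on the held-out test set
pred <- predict(pruned, newdata = test, type = "class")
table(Predicted = pred, Actual = test$PersonalLoan)
accuracy <- mean(pred == test$PersonalLoan)
accuracy
```

Sampling within each class keeps the share of "yes" cases the same in both subsets, which matters when the target is imbalanced.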
*8. In the following chunk, calculate the importance scores of different predictor attributes. Show all your R code in the submission.*

```{r message=FALSE, warning=FALSE}
```
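A possible way to obtain importance scores for Task 8, assuming the tree was fitted with `rpart`, whose fitted object stores them in `variable.importance`. The data are synthetic and the target depends only on `Income` by construction, so the ranking shown is not informative about the real dataset.

```r
library(rpart)

# Toy stand-in for the training data (hypothetical values)
set.seed(2)
n <- 500
toy <- data.frame(Age = round(runif(n, 23, 67)),
                  Income = round(rexp(n, 1 / 70) + 8),
                  CCAvg = round(rexp(n, 1 / 2), 1),
                  Mortgage = round(ifelse(runif(n) < 0.7, 0, runif(n, 75, 600))))
toy$PersonalLoan <- factor(ifelse(runif(n) < plogis(-8 + 0.08 * toy$Income), "yes", "no"))

tree <- rpart(PersonalLoan ~ Age + Income + CCAvg + Mortgage,
              data = toy, method = "class")

# Raw importance scores stored on the fitted tree
imp <- tree$variable.importance
imp

# Rescaled to percentages for easier comparison
imp_pct <- round(100 * imp / sum(imp), 1)
imp_pct

# Horizontal bar chart of the scores
barplot(rev(imp), horiz = TRUE, las = 1, main = "Variable importance (toy data)")
```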
*9. In the following chunk, build a logistic regression model to predict `PersonalLoan01` using the five most important predictors based on their importance scores. Show all your R code in the submission including that for addressing the multicollinearity problem. If the problem appears, you need to update your model to resolve the problem.*

```{r message=FALSE, warning=FALSE}
universalbank <- universalbank %>% mutate(PersonalLoan01 = as.numeric(PersonalLoan == "yes"))
```
*Problem: Based on the above model results, interpret the meaning of the regression coefficient estimate for the most important predictor.*
**Your answer: ( )**
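A sketch of Task 9 on synthetic data, with loud assumptions: the five predictors below are placeholders for whatever the importance scores select, `vif_manual()` is a hypothetical helper that computes the classical variance inflation factor by hand (packages such as `car` provide a `vif()` function that serves the same purpose), and the cutoff of 5 is a common rule of thumb rather than a requirement stated in the assignment.

```r
# Toy stand-in for universalbank (hypothetical values, assumed predictors)
set.seed(3)
n <- 500
toy <- data.frame(Age = round(runif(n, 23, 67)),
                  Income = round(rexp(n, 1 / 70) + 8),
                  CCAvg = round(rexp(n, 1 / 2), 1),
                  Mortgage = round(ifelse(runif(n) < 0.7, 0, runif(n, 75, 600))))
# Experience tracks Age almost exactly here, which forces a collinearity problem
toy$Experience <- toy$Age - 25 + round(rnorm(n))
toy$PersonalLoan01 <- as.numeric(runif(n) < plogis(-5 + 0.04 * toy$Income + 0.3 * toy$CCAvg))

# Classical VIF: regress each predictor on the others, then 1 / (1 - R^2)
vif_manual <- function(data, predictors) {
  sapply(predictors, function(v) {
    others <- setdiff(predictors, v)
    r2 <- summary(lm(reformulate(others, response = v), data = data))$r.squared
    1 / (1 - r2)
  })
}

preds <- c("Age", "Experience", "Income", "CCAvg", "Mortgage")
vif_all <- vif_manual(toy, preds)
vif_all

# Rule of thumb (an assumption): VIF above 5 signals trouble, so drop one of the pair
preds_reduced <- setdiff(preds, "Experience")
vif_reduced <- vif_manual(toy, preds_reduced)
vif_reduced

# Refit the logistic regression without the collinear predictor
fit <- glm(reformulate(preds_reduced, response = "PersonalLoan01"),
           data = toy, family = binomial)
summary(fit)$coefficients

# exp(coefficient) is the multiplicative change in the odds per one-unit increase
odds_ratios <- exp(coef(fit))
odds_ratios
```

A coefficient is on the log-odds scale, so exponentiating it gives an odds ratio, which is usually the easier quantity to interpret in words.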
*10. In the following chunk, estimate the 95% confidence interval for the regression coefficient of the most important predictor.*

```{r message=FALSE, warning=FALSE}
```
*Problem: Based on the prediction results, interpret the generalized meaning of the regression coefficient estimate for the most important predictor.*
**Your answer: ( )**
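For Task 10, a minimal sketch of a 95% confidence interval on synthetic data. `Income` is only an assumed stand-in for the most important predictor. The sketch uses `confint.default()`, which returns Wald intervals (estimate plus or minus about 1.96 standard errors); calling `confint()` on the same model returns profile-likelihood intervals instead, and the two can differ slightly.

```r
# Toy stand-in for the modelling data (hypothetical values)
set.seed(3)
n <- 500
toy <- data.frame(Income = round(rexp(n, 1 / 70) + 8),
                  CCAvg = round(rexp(n, 1 / 2), 1))
toy$PersonalLoan01 <- as.numeric(runif(n) < plogis(-5 + 0.04 * toy$Income + 0.3 * toy$CCAvg))

fit <- glm(PersonalLoan01 ~ Income + CCAvg, data = toy, family = binomial)

# Wald interval for the Income coefficient on the log-odds scale
ci_income <- confint.default(fit, parm = "Income", level = 0.95)
ci_income

# The same interval expressed as odds ratios
exp(ci_income)
```

If the interval on the log-odds scale excludes 0 (equivalently, the odds-ratio interval excludes 1), the predictor's effect is statistically distinguishable from no effect at the 5% level.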
*11. In the following chunk, measure the performance of the logistic regression model you built. Show all your R code in the submission.*

```{r message=FALSE, warning=FALSE}
```
*Problem: According to the model performance measure in Task 11, is this model a poor/average/good/strong model?*
**Your answer: ( )**
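Task 11 does not say which performance measure to use, so the following sketch assumes two common ones: accuracy at a 0.5 cutoff and the area under the ROC curve (AUC), computed here in base R through its rank-sum identity. The data, the simple random 70/30 split, and the 0.5 cutoff are all assumptions for illustration.

```r
# Toy stand-in for the modelling data (hypothetical values)
set.seed(4)
n <- 500
toy <- data.frame(Income = round(rexp(n, 1 / 70) + 8),
                  CCAvg = round(rexp(n, 1 / 2), 1))
toy$PersonalLoan01 <- as.numeric(runif(n) < plogis(-5 + 0.04 * toy$Income + 0.3 * toy$CCAvg))

# Simple random split for brevity (the assignment itself asks for a stratified split)
train_idx <- sample.int(n, size = 350)
test <- toy[-train_idx, ]
fit <- glm(PersonalLoan01 ~ Income + CCAvg, data = toy[train_idx, ], family = binomial)

# Predicted probabilities and class labels on the held-out rows
prob <- predict(fit, newdata = test, type = "response")
pred <- as.numeric(prob >= 0.5)

# Confusion matrix and accuracy
conf_mat <- table(Predicted = pred, Actual = test$PersonalLoan01)
conf_mat
accuracy <- mean(pred == test$PersonalLoan01)
accuracy

# AUC: probability that a random positive is ranked above a random negative
pos <- test$PersonalLoan01 == 1
n1 <- sum(pos)
n0 <- sum(!pos)
auc <- (sum(rank(prob)[pos]) - n1 * (n1 + 1) / 2) / (n1 * n0)
auc
```

AUC is less sensitive than accuracy to class imbalance, which is why it is often reported alongside the confusion matrix for loan-acceptance style targets.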
