To perform k-NN classification and answer the questions, follow these steps in RStudio:
# Load required libraries
library(caret)
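# Load the UniversalBank dataset. The mlbench package does not include it, so
# read it from a local CSV instead (the file name is an assumption; adjust the
# path to wherever your copy lives). The code below expects the column names
# PersonalLoan, SecuritiesAccount and CDAccount; rename columns if yours differ.
UniversalBank <- read.csv("UniversalBank.csv")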
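# Drop identifier columns that are not predictors, if present (the names ID and
# ZIPCode are assumptions about the file; adjust them to match your data)
UniversalBank <- UniversalBank[, !(names(UniversalBank) %in% c("ID", "ZIPCode"))]
# The outcome must be a factor so that caret fits a classifier, not a regression
UniversalBank$PersonalLoan <- as.factor(UniversalBank$PersonalLoan)
# Define categorical predictors as factors before splitting so both sets share levels
categorical_cols <- c("Family", "Education", "SecuritiesAccount", "CDAccount", "Online", "CreditCard")
UniversalBank[categorical_cols] <- lapply(UniversalBank[categorical_cols], as.factor)
# Split the data into training and holdout sets (60% training, 40% holdout)
set.seed(123)
index <- createDataPartition(UniversalBank$PersonalLoan, p = 0.6, list = FALSE)
train_data <- UniversalBank[index, ]
holdout_data <- UniversalBank[-index, ]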
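# Define the new customer's data
new_customer <- data.frame(
  Age = 40,
  Experience = 10,
  Income = 84,
  Family = 2,
  CCAvg = 2,
  Education = 2,
  Mortgage = 0,
  SecuritiesAccount = 0,
  CDAccount = 0,
  Online = 1,
  CreditCard = 1
)
# Give the categorical columns the same factor type and levels as the training data
for (col in categorical_cols) {
  new_customer[[col]] <- factor(new_customer[[col]], levels = levels(train_data[[col]]))
}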
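# Perform k-NN classification with k = 1. The grid holds a single value, so the
# 5-fold cross-validation only estimates accuracy; it does not choose k.
knn_model <- train(
  PersonalLoan ~ .,
  data = train_data,
  method = "knn",
  preProcess = c("center", "scale"),
  tuneGrid = expand.grid(k = 1),
  trControl = trainControl(method = "cv", number = 5)
)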
# Classify the new customer with the fitted model (k = 1)
predicted_class <- predict(knn_model, new_customer)
predicted_prob <- predict(knn_model, new_customer, type = "prob")
# Print the predicted class and probability
print(predicted_class)
print(predicted_prob)
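# Confusion matrix for the holdout data
best_k <- knn_model$bestTune$k  # equals 1 here, the only value in tuneGrid
predicted_holdout <- predict(knn_model, holdout_data)
# positive = "1" reports sensitivity for loan acceptance (assumes 0/1 coding)
confusion_matrix <- confusionMatrix(predicted_holdout, holdout_data$PersonalLoan, positive = "1")
print(confusion_matrix)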
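To make the preProcess and k = 1 settings above more concrete, here is a minimal base R sketch of what a standardized 1-nearest-neighbor prediction does. The toy values and the names train_x, train_y and new_x are made up for illustration and are not the UniversalBank data; caret's internals may differ in detail.

```r
# Toy training data (made-up values, not the UniversalBank data)
train_x <- data.frame(
  Income = c(40, 60, 85, 150, 120),
  CCAvg  = c(1.0, 1.5, 2.1, 6.0, 4.5)
)
train_y <- factor(c("0", "0", "0", "1", "1"))
new_x <- data.frame(Income = 84, CCAvg = 2)

# Center and scale with the training means and standard deviations only,
# then apply the same transformation to the new record
mu  <- colMeans(train_x)
sdv <- apply(train_x, 2, sd)
train_z <- scale(train_x, center = mu, scale = sdv)
new_z   <- scale(new_x, center = mu, scale = sdv)

# Euclidean distance from the new record to every training record
dists <- sqrt(rowSums(sweep(train_z, 2, as.numeric(new_z))^2))

# k = 1: the prediction is the label of the single closest training record
nearest <- which.min(dists)
predicted <- train_y[nearest]
print(nearest)
print(predicted)
```

Because only one neighbor votes when k = 1, the class probability for a record is normally either 0 or 1, which is worth keeping in mind when reading the output of predict(..., type = "prob").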
Explanation:
The task uses k-NN classification to predict whether a customer will accept a personal loan offer, based on demographic and banking information. The steps are:
Load the UniversalBank dataset and split it into training (60%) and holdout (40%) sets.
Convert the categorical predictors to factors.
Define the new customer's data.
Fit k-NN with k = 1 on the training data, standardizing the predictors and using 5-fold cross-validation to estimate accuracy.
Classify the new customer with that model. Because the tuning grid contains only k = 1, cross-validation does not select k; choosing a best k requires a range of values in tuneGrid.
Display the predicted class and probability for the new customer.
Calculate the confusion matrix for the holdout data.
This analysis helps determine if the new customer is likely to accept a personal loan offer based on their attributes and previous campaign
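Choosing k rather than fixing it at 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses class::knn instead of caret so that it runs on a standard R install, the data are synthetic (not UniversalBank), and the names k_grid, results and conf are hypothetical. With caret, the comparable change would be a tuneGrid with several k values.

```r
library(class)  # recommended package that ships with standard R installs

set.seed(123)
# Synthetic two-class data (illustrative only, not the UniversalBank data)
n <- 200
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- factor(ifelse(x$x1 + x$x2 + rnorm(n, sd = 0.5) > 0, "1", "0"))

# 60/40 split, mirroring the training/holdout proportions used above
n_train <- round(0.6 * n)
idx <- sample(seq_len(n), size = n_train)

# Standardize both sets with the training means and standard deviations
mu  <- colMeans(x[idx, ])
sdv <- apply(x[idx, ], 2, sd)
train_z <- scale(x[idx, ],  center = mu, scale = sdv)
valid_z <- scale(x[-idx, ], center = mu, scale = sdv)

# Accuracy on the 40% set for each candidate k
k_grid <- seq(1, 15, by = 2)
accuracy <- sapply(k_grid, function(k) {
  pred <- knn(train_z, valid_z, cl = y[idx], k = k)
  mean(pred == y[-idx])
})
results <- data.frame(k = k_grid, accuracy = accuracy)
best_k <- results$k[which.max(results$accuracy)]
print(results)

# Confusion matrix at the chosen k (rows: predicted, columns: actual)
pred_best <- knn(train_z, valid_z, cl = y[idx], k = best_k)
conf <- table(Predicted = pred_best, Actual = y[-idx])
print(conf)
```

Note that picking k and reporting accuracy on the same 40% set gives an optimistic estimate; cross-validation on the training set or a separate validation split would be the more careful choice.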
