Question:

Due 20 May. Please give correct code with a detailed explanation. I want to understand the solution, not just see the code.
The goal of this homework assignment is to explore the KMeans algorithm using the given dataset airlines.csv. Throughout this assignment, you will perform various tasks including data description, data preprocessing, exploratory data analysis, and determining the optimal number of clusters using KMeans.
Tasks
1. Data Description
The provided raw data is in the airlines.csv file.
The description of the raw data is as follows:
id: Unique ID
balance: Number of miles eligible for award travel
qual_miles: Number of miles counted as qualifying for Topflight status.
cc1_miles: Number of miles earned with the frequent flyer credit card in the past 12 months.
cc2_miles: Number of miles earned with the Rewards credit card in the past 12 months.
cc3_miles: Number of miles earned with the Small Business credit card in the past 12 months.
The cc1_miles, cc2_miles, and cc3_miles values are coded in bins:
1: under 5,000
2: 5,000-10,000
3: 10,001-25,000
4: 25,001-50,000
5: over 50,000
bonus_miles: Number of miles earned from non-flight bonus transactions in the past 12 months.
bonus_trans: Number of non-flight bonus transactions in the past 12 months.
flight_miles_12mo: Number of flight miles in the past 12 months.
flight_trans_12: Number of flight transactions in the past 12 months.
days_since_enroll: Number of days since the customer enrolled in the flier program.
award: Whether the customer had an award flight (free flight) or not.
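A minimal sketch for loading and describing the data, assuming pandas is installed and airlines.csv sits in the working directory. The helper names `load_airlines` and `describe_data` are made up for illustration and are not part of the assignment.

```python
import pandas as pd


def load_airlines(path: str = "airlines.csv") -> pd.DataFrame:
    """Read the CSV and strip stray whitespace from the column names."""
    df = pd.read_csv(path)
    df.columns = df.columns.str.strip()
    return df


def describe_data(df: pd.DataFrame) -> pd.DataFrame:
    """Summary statistics (count, mean, std, min, quartiles, max), one row per column."""
    return df.describe().T


# Usage:
# df = load_airlines()
# print(df.shape)        # number of rows and columns
# print(df.dtypes)       # data type of each column
# print(describe_data(df))
```

Transposing the `describe()` output puts one feature per row, which is easier to read with twelve columns.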
2. Check for Missing Values
Perform data preprocessing to check for any missing values in the dataset.
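One possible way to run this check, sketched with pandas. The function name `missing_value_report` is hypothetical; `isna()` marks each missing cell as True, and summing the result counts missing cells per column.

```python
import pandas as pd


def missing_value_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values for every column."""
    counts = df.isna().sum()
    percent = 100 * counts / len(df)
    return pd.DataFrame({"missing": counts, "percent": percent})


# Usage, assuming airlines.csv is in the working directory:
# df = pd.read_csv("airlines.csv")
# print(missing_value_report(df))
```

If every count is zero, no imputation or row removal is needed before clustering.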
3. Analyze Features
Create histograms to understand the distribution of different features in the dataset.
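A sketch of the histogram step, assuming matplotlib is available. The function name `plot_feature_histograms` and the choice to skip the id column (it is an identifier, not a feature) are assumptions made here, not requirements from the assignment.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_feature_histograms(df: pd.DataFrame, bins: int = 30):
    """Draw one histogram per numeric column, skipping the id column if present."""
    features = df.drop(columns=["id"], errors="ignore").select_dtypes("number")
    axes = features.hist(bins=bins, figsize=(14, 10))
    plt.tight_layout()
    return axes


# Usage:
# df = pd.read_csv("airlines.csv")
# plot_feature_histograms(df)
# plt.show()
```

Features whose histograms pile up near zero with a long right tail are strongly skewed, which is worth noting before scaling.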
4. Calculate Percentage of Customers with/without Award
Find the percentage of customers who do not have an award flight and those who do have an award flight.
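A minimal sketch of the percentage calculation, assuming award is coded 0 (no award flight) and 1 (award flight), as the file excerpt suggests. The function name `award_percentages` is hypothetical.

```python
import pandas as pd


def award_percentages(df: pd.DataFrame) -> pd.Series:
    """Percentage of customers per award value, relabelled for readability."""
    # normalize=True returns fractions that sum to 1; multiply by 100 for percent.
    shares = df["award"].value_counts(normalize=True) * 100
    return shares.rename(index={0: "no award", 1: "award"})


# Usage:
# df = pd.read_csv("airlines.csv")
# print(award_percentages(df).round(2))
```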
5. Correlation Analysis
- Find which feature is correlated with the balance feature.
- Draw a correlation heatmap to visualize the correlations among different features.
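The two correlation tasks could be sketched as follows. This uses Pearson correlation (the pandas default) and draws the heatmap with plain matplotlib, so seaborn is not needed. The function names and the decision to drop the id column are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def balance_correlations(df: pd.DataFrame) -> pd.Series:
    """Correlation of every feature with balance, strongest (by absolute value) first."""
    corr = df.drop(columns=["id"], errors="ignore").corr()
    return corr["balance"].drop("balance").sort_values(
        key=lambda s: s.abs(), ascending=False
    )


def plot_correlation_heatmap(df: pd.DataFrame):
    """Draw the full correlation matrix as an annotated heatmap."""
    corr = df.drop(columns=["id"], errors="ignore").corr()
    fig, ax = plt.subplots(figsize=(10, 8))
    image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_xticks(range(len(corr.columns)))
    ax.set_xticklabels(corr.columns, rotation=90)
    ax.set_yticks(range(len(corr.index)))
    ax.set_yticklabels(corr.index)
    # Write the coefficient inside each cell.
    for i in range(corr.shape[0]):
        for j in range(corr.shape[1]):
            ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center", fontsize=7)
    fig.colorbar(image, ax=ax, label="Pearson correlation")
    fig.tight_layout()
    return ax


# Usage:
# df = pd.read_csv("airlines.csv")
# print(balance_correlations(df))
# plot_correlation_heatmap(df)
# plt.show()
```

The first entry returned by `balance_correlations` is the feature most strongly correlated with balance.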
6. Plotting
Plot the relationship between frequent flying bonuses and non-flight bonus transactions.
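The task wording does not name the exact columns. The sketch below assumes it means bonus_miles plotted against bonus_trans; both column names are parameters, so another pair can be passed if the assignment intends something different. The function name `plot_bonus_relationship` is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_bonus_relationship(df: pd.DataFrame, x: str = "bonus_trans", y: str = "bonus_miles"):
    """Scatter plot of column y against column x."""
    fig, ax = plt.subplots(figsize=(8, 6))
    # Low alpha keeps dense regions readable when thousands of points overlap.
    ax.scatter(df[x], df[y], alpha=0.4, s=12)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    ax.set_title(f"{y} vs {x}")
    return ax


# Usage:
# df = pd.read_csv("airlines.csv")
# plot_bonus_relationship(df)
# plt.show()
```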
7. Determining Optimal Number of Clusters
- Apply MinMaxScaler to normalize the data.
- Use the Elbow Method and Silhouette Score to find the ideal number of clusters for KMeans algorithm.
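A sketch of the scaling and cluster-count search, assuming scikit-learn is installed. MinMaxScaler rescales each feature to the range [0, 1] so that large-valued columns such as balance do not dominate the distance calculation. The function name `evaluate_cluster_counts`, the candidate range of 2 to 10 clusters, and dropping only the id column are assumptions, not choices prescribed by the assignment.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import MinMaxScaler


def evaluate_cluster_counts(df: pd.DataFrame, k_values=range(2, 11), random_state: int = 42) -> pd.DataFrame:
    """Fit KMeans for each k and record inertia (elbow method) and silhouette score."""
    features = df.drop(columns=["id"], errors="ignore")
    scaled = MinMaxScaler().fit_transform(features)
    rows = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(scaled)
        rows.append({
            "k": k,
            "inertia": model.inertia_,  # within-cluster sum of squared distances
            "silhouette": silhouette_score(scaled, labels),
        })
    return pd.DataFrame(rows).set_index("k")


# Usage:
# df = pd.read_csv("airlines.csv")
# results = evaluate_cluster_counts(df)
# print(results)
# results["inertia"].plot(marker="o")      # look for the "elbow" where the curve flattens
# print(results["silhouette"].idxmax())    # k with the highest silhouette score
```

Inertia always tends to fall as k grows, so the elbow method looks for the point where adding clusters stops helping much. The silhouette score ranges from -1 to 1, and higher values mean better-separated clusters. The two criteria do not always agree, so it is worth reporting both.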
This is a short excerpt from the airlines.csv file:
id,balance,qual_miles,cc1_miles,cc2_miles,cc3_miles,bonus_miles,bonus_trans,flight_miles_12mo,flight_trans_12,days_since_enroll,award
1,28143,0,1,1,1,174,1,0,0,7000,0
2,19244,0,1,1,1,215,2,0,0,6968,0
3,41354,0,1,1,1,4123,4,0,0,7034,0
4,14776,0,1,1,1,500,1,0,0,6952,0
5,97752,0,4,1,1,43300,26,2077,4,6935,1
6,16420,0,1,1,1,0,0,0,0,6942,0
7,84914,0,3,1,1,27482,25,0,0,6994,0
8,20856,0,1,1,1,5250,4,250,1,6938,1
9,443003,0,3,2,1,1753,43,3850,12,6948,1
10,104860,0,3,1,1,28426,28,1150,3,6931,1
The file has 4000 lines in total with similar information. I also attached a photo.