Question: I need help with the last 3 parts. I do not know how to use Label Encoding to convert all categorical features into numerical features.

Lab 3: Data Preprocessing

In this assignment, we will learn how to explore raw data and preprocess it. The dataset we are going to explore is insurance data. It provides the following features for each user:

age: age of the user
sex: gender of the user
bmi: body mass index, providing an understanding of body
children: number of children covered by health insurance / number of dependents
smoker: smoker or not
region: the user's residential area in the US (northeast, southeast, southwest, northwest)

Additionally, the medical cost of each user is also provided:

charges: the medical cost

Please follow Lecture 5_data_understanding and Lecture 6_data_preprocessing to complete the following questions.

Q1. Load data with Pandas and output the basic information of this dataset, such as the features and their data types. Which features are numerical features and which are categorical features?
In [20]: # your code

Q2. Check whether there are missing values in this dataset.
In [21]: # your code

Q3. Visualize all numerical features with histogram plots to see the distribution of each numerical feature.
In [22]: # your code

Q4. Use the corr() function of Pandas to show the correlation between different numerical features.
In [23]: # your code

Q5. For all categorical features, use bar plots to visualize the number of users within each category.
In [24]: # your code

Q6. Convert all categorical features into numerical features with Label Encoding or One-Hot Encoding.
In [25]: # your code

Q7. Normalize all numerical features.
In [26]: # your code

Q8. Save your preprocessed data into a csv file. Submit your code and the preprocessed data.

My code so far:

In [ ]:
import pandas as pd
import matplotlib.pyplot as plt

# Q1. Load data with Pandas and output the basic information of this dataset,
# such as the features and their data types.
data = pd.read_csv("insurance.csv")
print("Basic Information of this dataset:")
data.info()  # info() prints its summary itself and returns None

# Columns stored as strings (dtype "object") are treated as categorical.
categorical_features = [x for x in data.columns if data[x].dtype == "object"]
numerical_features = [x for x in data.columns if data[x].dtype != "object"]
print("Categorical features:")
print(categorical_features)
print("Numerical features:")
print(numerical_features)

# Q2. Check whether there are missing values in this dataset.
print(data.isnull().any())

# Q3. Visualize all numerical features with histogram plots to see the
# distribution of each numerical feature.
data[numerical_features].hist()
plt.show()

# Q4. Use the corr() function of Pandas to show the correlation between
# different numerical features.
print(data[numerical_features].corr())

# Q5. For all categorical features, use bar plots to visualize the number of
# users within each category.
for x in categorical_features:
    data[x].value_counts().plot(kind="bar")
    plt.show()

# Q6. Convert all categorical features into numerical features with Label
# Encoding or One-Hot Encoding.

# Q7. Normalize all numerical features.

# Q8. Save your preprocessed data into a csv file. Submit your code and the
# preprocessed data.
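For the three open parts (Q6 to Q8), one possible approach is sketched below. It is not the official lab solution. It uses only Pandas: label encoding is done through the category codes (which number the categories in sorted order), and normalization is done with the min-max formula (x - min) / (max - min). The helper names `label_encode` and `min_max_normalize`, the four sample rows, and the output filename `insurance_preprocessed.csv` are all assumptions made for illustration; in the lab you would pass the `data` frame loaded from insurance.csv instead. Which columns to normalize (for example, whether to include `charges`) depends on what the lectures ask for.

```python
import pandas as pd


def label_encode(df, columns):
    """Return a copy of df with each listed column replaced by integer codes,
    plus a dict mapping each column to its {code: original label} table."""
    out = df.copy()
    mappings = {}
    for col in columns:
        cat = out[col].astype("category")  # categories are sorted alphabetically
        mappings[col] = dict(enumerate(cat.cat.categories))
        out[col] = cat.cat.codes
    return out, mappings


def min_max_normalize(df, columns):
    """Return a copy of df with each listed column rescaled to the range [0, 1]."""
    out = df.copy()
    for col in columns:
        lo, hi = out[col].min(), out[col].max()
        # A constant column has no spread, so map it to 0.0 to avoid dividing by zero.
        out[col] = (out[col] - lo) / (hi - lo) if hi > lo else 0.0
    return out


# Hypothetical sample rows, made up for illustration (NOT taken from insurance.csv).
data = pd.DataFrame({
    "age": [19, 33, 45, 60],
    "sex": ["female", "male", "male", "female"],
    "bmi": [27.9, 22.7, 30.1, 25.8],
    "children": [0, 1, 3, 0],
    "smoker": ["yes", "no", "no", "yes"],
    "region": ["southwest", "southeast", "northwest", "northeast"],
    "charges": [16884.92, 4449.46, 8240.59, 28923.14],
})
categorical_features = ["sex", "smoker", "region"]
numerical_features = ["age", "bmi", "children", "charges"]

# Q6. Label-encode the categorical features.
encoded, mappings = label_encode(data, categorical_features)
print(mappings)

# Q7. Normalize the original numerical features (the encoded columns are left as codes).
preprocessed = min_max_normalize(encoded, numerical_features)
print(preprocessed)

# Q8. Save the preprocessed data; index=False keeps the row index out of the file.
preprocessed.to_csv("insurance_preprocessed.csv", index=False)
```

If you prefer One-Hot Encoding for Q6, `pd.get_dummies(data, columns=categorical_features)` is the usual Pandas route. It avoids implying an order among the regions, at the cost of more columns. scikit-learn's `LabelEncoder` and `MinMaxScaler` are common alternatives to the two helpers above if your lectures use scikit-learn.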
