Question: Use the given data HW2_Data_B for the following tasks:
B1. (4 points) Read and display the dataset provided. Determine the number of rows and columns present. Additionally, identify the columns containing missing data and list their names, if any.
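As a rough sketch of how this first task might be approached: the snippet below reads a small inline CSV, which is a hypothetical stand-in because the actual homework file is not shown here. The sample rows and values are assumptions; only column names mentioned in the tasks are used.

```python
import io
import pandas as pd

# Hypothetical stand-in for the homework file; values are made up for illustration
csv_text = """Id,age,bmi,children,smoker,region,charges
1,19,27.9,0,yes,southwest,16884.92
2,18,,1,no,southeast,1725.55
3,28,33.0,3,no,,4449.46
"""
df = pd.read_csv(io.StringIO(csv_text))

# Display the data and its dimensions
print(df)
n_rows, n_cols = df.shape
print(f"rows={n_rows}, columns={n_cols}")

# Names of columns that contain at least one missing value
missing_cols = df.columns[df.isna().any()].tolist()
print("columns with missing data:", missing_cols)
```

With the real file, the `io.StringIO(csv_text)` argument would be replaced by the path to the dataset.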
B (points): Using a Python script, identify the columns of the given dataset that hold categorical data. Furthermore, identify the type of every column. Indicate the type of consistency in the given dataset, if any. Convert the Id column from numerical to object type so that numeric operations such as normalization are easier to apply to the remaining numeric columns.
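One possible way to inspect column types and cast Id is sketched below. The small DataFrame is hypothetical, and treating every non-numeric column as categorical is an assumption, not something the task states.

```python
import pandas as pd

# Hypothetical sample; the real dataset is not shown in the question
df = pd.DataFrame({
    "Id": [1, 2, 3],
    "age": [19, 18, 28],
    "bmi": [27.9, 33.77, 33.0],
    "smoker": ["yes", "no", "no"],
    "region": ["southwest", "southeast", "southeast"],
})

# Type of every column
print(df.dtypes)

# Assumption: non-numeric columns are the categorical ones
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()
print("categorical columns:", categorical_cols)

# Cast Id to object so that numeric-only selections skip it
df["Id"] = df["Id"].astype("object")
numeric_cols = df.select_dtypes(include="number").columns.tolist()
print("numeric columns after the cast:", numeric_cols)
```

After the cast, a later normalization step that selects numeric columns would no longer touch the identifier.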
B (points): Look at the given dataset. Using Python commands, filter out negative values in the following two columns: bmi and children. Furthermore, some values in the age column are decimals by mistake; correct them using an appropriate method. Also, find the unique categorical values and remove unknown values, if any (note that NaN is not considered unknown).
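The cleaning steps could look roughly like the following. The sample data is hypothetical, rounding is only one of several defensible ways to correct decimal ages, and the literal label "unknown" is an assumed spelling.

```python
import numpy as np
import pandas as pd

# Hypothetical sample containing each kind of problem named in the task
df = pd.DataFrame({
    "age": [19.0, 18.7, 28.0, 33.0, 45.0, 52.0],
    "bmi": [27.9, 33.77, -33.0, 22.7, 30.1, 25.3],
    "children": [0, 1, 3, -2, 1, 2],
    "region": ["southwest", "southeast", "southeast", "northwest", "unknown", np.nan],
})

# Keep rows whose bmi and children are not negative (rows with NaN are kept)
negative = (df["bmi"] < 0) | (df["children"] < 0)
df = df[~negative].copy()

# One possible correction for decimal ages: round to the nearest whole year
df["age"] = df["age"].round().astype("Int64")

# Inspect the categories, then drop the assumed "unknown" label but keep NaN
print(df["region"].unique())
keep = df["region"].ne("unknown") | df["region"].isna()
df = df[keep]
print(df)
```

Truncating with `np.floor` instead of rounding would be an equally reasonable choice if ages are meant as completed years.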
B (points): Drop all columns containing or more missing values. Then impute the remaining columns that have missing values, using the median if the column is numerical and the mode if the column is categorical.
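A minimal sketch of the drop-then-impute step follows. The threshold `max_missing_fraction` is a hypothetical placeholder because the actual cutoff is missing from the question text, and the sample data (including the column `mostly_empty`) is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one sparsely filled column
df = pd.DataFrame({
    "age": [19.0, np.nan, 28.0, 33.0],
    "smoker": ["yes", "no", np.nan, "no"],
    "mostly_empty": [np.nan, np.nan, np.nan, 1.0],
})

max_missing_fraction = 0.5  # hypothetical threshold; the source omits the value

# Drop columns whose share of missing values reaches the threshold
missing_share = df.isna().mean()
df = df.loc[:, missing_share < max_missing_fraction].copy()

# Impute what remains: median for numeric columns, mode for the others
for col in df.columns:
    if df[col].isna().any():
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].median())
        else:
            df[col] = df[col].fillna(df[col].mode().iloc[0])

print(df)
```

When a column has several equally frequent values, `mode()` returns them sorted, so `.iloc[0]` picks the first in sort order.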
B (points): Transform the charges column such that the minimum value is and maximum value is . Furthermore, transform the age column to have a mean of zero and a standard deviation of one. Print only the transformed columns.
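The two transformations described here are commonly called min-max scaling and z-score standardization. The sketch below uses plain Pandas arithmetic; the target bounds `new_min` and `new_max` are hypothetical because the actual values are missing from the question, and the sample data is made up.

```python
import pandas as pd

# Hypothetical sample values
df = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "charges": [1000.0, 2000.0, 3000.0, 5000.0],
})

new_min, new_max = 0.0, 1.0  # hypothetical bounds; the source omits them

# Min-max scaling of charges onto [new_min, new_max]
c = df["charges"]
df["charges"] = (c - c.min()) / (c.max() - c.min()) * (new_max - new_min) + new_min

# Z-score standardization of age, using the population standard deviation (ddof=0)
a = df["age"]
df["age"] = (a - a.mean()) / a.std(ddof=0)

# Print only the transformed columns
print(df[["charges", "age"]])
```

Note that `Series.std()` defaults to `ddof=1`; whichever convention is used for scaling is the one under which the result has a standard deviation of exactly one.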
B (points): Discretize the "bmi" column into the following four bins using only Pandas. Save the result in another column named bmistatus.
bmi | Bin
Below | Underweight
 | Healthy Weight
 | Overweight
and Above | Obesity
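Binning of this kind can be done with `pd.cut`. In the sketch below the cut points in `edges` are hypothetical placeholders, since the numeric thresholds are missing from the table above; they would need to be replaced with the values from the assignment. The sample bmi values are also made up.

```python
import pandas as pd

# Hypothetical sample values
df = pd.DataFrame({"bmi": [17.0, 18.5, 24.9, 25.0, 29.9, 30.0, 41.2]})

# Hypothetical cut points; replace with the thresholds from the assignment table
edges = [float("-inf"), 18.5, 25.0, 30.0, float("inf")]
labels = ["Underweight", "Healthy Weight", "Overweight", "Obesity"]

# right=False makes each bin closed on the left, e.g. [18.5, 25.0)
df["bmistatus"] = pd.cut(df["bmi"], bins=edges, labels=labels, right=False)

print(df)
```

The choice of `right=False` matters for values that sit exactly on a boundary, so it should match how the table words its ranges ("Below" versus "and Above").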
B (points): Convert region using a one-hot encoder. The new column names should start with reg (for example, regnortheast). Remove the original column.
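One way to one-hot encode the column with Pandas alone is `pd.get_dummies`, used here in place of a dedicated encoder class. The prefix "reg" with no separator is an assumption based on the example name in the task, and the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample values
df = pd.DataFrame({
    "age": [19, 18, 28],
    "region": ["southwest", "northeast", "southwest"],
})

# prefix and prefix_sep control the new names; "reg" with no separator is assumed
dummies = pd.get_dummies(df["region"], prefix="reg", prefix_sep="", dtype=int)

# Attach the indicator columns and remove the original region column
df = pd.concat([df.drop(columns="region"), dummies], axis=1)

print(df)
```

If the intended names contain a separator, changing `prefix_sep` is the only adjustment needed.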
B (points): For the column smoker, convert yes to and no to . The column name should remain unchanged.
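A binary recoding like this can be written with `Series.map`. The target codes in `smoker_codes` are hypothetical, because the actual values are missing from the question, and the sample data is made up.

```python
import pandas as pd

# Hypothetical sample values
df = pd.DataFrame({"smoker": ["yes", "no", "no", "yes"]})

# Hypothetical target codes; the source omits the actual values
smoker_codes = {"yes": 1, "no": 0}

# Overwrite the column in place so its name stays unchanged;
# map() yields NaN for any label that is not in the dictionary
df["smoker"] = df["smoker"].map(smoker_codes)

print(df)
```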
