Question:

Problem B [40 Marks]: Consider the data given in the "HW2_DataB" Microsoft Excel (.csv) file and described in Table 1. Note: Solve all the following questions using Python. Use the Pandas & Sklearn libraries for all the following analyses. Using the given data, do the following:

B-1. [3 marks]: Read and display the data. Identify the number of rows and columns. Does any column have missing data? If yes, provide the column names.

B-2. [2 marks]: Type Consistency: For each column, identify the field type and verify that Python identifies each column's type correctly. If there is any discrepancy, indicate it.

B-3. [5 marks]: Filter noise: Looking at the data, some values in the numeric column ("age") were entered as less than 1 (by mistake). Fix the inconsistencies. Furthermore, find the unique categorical values and remove unknowns (if any).

B-4. [7 marks]: Handling NaN values: Drop all columns containing 30% or more missing values. Then impute the remaining columns that have missing values.

B-5. [5 marks]: Normalization/Transformation: Normalize all numeric columns to a mean of zero and a standard deviation of one, and print only the normalized columns.

B-6. [5 marks]: Encoding: Convert "work_type" using a label encoder.

B-7. [5 marks]: Encoding: Convert "ever_married" using binary values (0 and 1). Do not drop any new column(s).

B-8. [8 marks]: General questions (write your answers in a Jupyter notebook):
(i) When is it best to use a label encoder rather than one-hot encoding?
(ii) What are data cube aggregation and discretization?
(iii) Give a real-world example of direct and indirect data acquisition approaches.
(iv) Give a real-world example of structured data and unstructured data.
(v) Why is there a need to convert numerical data to Min-Max scaler
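Parts B-1 through B-7 could be approached along the following lines. This is only a sketch under stated assumptions, not the graded solution. The real file is not available here, so a tiny made-up DataFrame stands in for it; only "age", "work_type", and "ever_married" come from the question, while "bmi" and "notes" are hypothetical columns added to exercise the steps. The sketch also assumes the file is named "HW2_DataB.csv", that "age" values below 1 should be treated as missing and imputed (one of several possible fixes), and that unknown categories are spelled "Unknown".

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Made-up stand-in for the real data. With the actual file one would
# instead use: raw = pd.read_csv("HW2_DataB.csv")  (file name assumed)
raw = pd.DataFrame({
    "age": [34.0, 0.5, 61.0, 45.0, 29.0, 52.0],
    "bmi": [22.1, 30.4, np.nan, 27.8, 25.0, 31.2],           # hypothetical
    "notes": [np.nan, np.nan, np.nan, "x", np.nan, np.nan],  # hypothetical
    "work_type": ["Private", "Unknown", "Govt_job",
                  "Private", "Self-employed", "Private"],
    "ever_married": ["Yes", "No", "Yes", "Yes", "No", np.nan],
})

# B-1: display the data, its shape, and the columns with missing values
print(raw.head())
print("rows, columns:", raw.shape)
print("columns with missing data:", raw.columns[raw.isna().any()].tolist())

# B-2: compare the inferred dtypes against the expected field types
print(raw.dtypes)


def preprocess(df):
    """Apply steps B-3 to B-7; return the frame and the scaled column names."""
    df = df.copy()

    # B-3: treat impossible ages (< 1) as missing (assumed interpretation)
    df.loc[df["age"] < 1, "age"] = np.nan
    cat_cols = df.select_dtypes(exclude="number").columns
    for col in cat_cols:
        print(col, "->", df[col].dropna().unique())
    # Turn the assumed "Unknown" marker into NaN so it is imputed later
    df[cat_cols] = df[cat_cols].mask(df[cat_cols] == "Unknown")

    # B-4: drop columns with >= 30% missing values, then impute the rest
    df = df.drop(columns=df.columns[df.isna().mean() >= 0.30])
    num_cols = df.select_dtypes(include="number").columns.tolist()
    cat_cols = df.select_dtypes(exclude="number").columns.tolist()
    df[num_cols] = SimpleImputer(strategy="median").fit_transform(df[num_cols])
    for col in cat_cols:
        df[col] = df[col].fillna(df[col].mode().iloc[0])

    # B-5: standardize numeric columns to mean 0 and standard deviation 1
    df[num_cols] = StandardScaler().fit_transform(df[num_cols])

    # B-6: label-encode work_type into a new integer column
    df["work_type_label"] = LabelEncoder().fit_transform(df["work_type"])

    # B-7: binary (0/1) indicator columns for ever_married, none dropped
    df = pd.get_dummies(df, columns=["ever_married"], dtype=int)
    return df, num_cols


clean, scaled_cols = preprocess(raw)
print(clean[scaled_cols])  # B-5 asks to print only the normalized columns
print(clean)
```

Note that `StandardScaler` divides by the population standard deviation (`ddof=0`), so checking the result with the pandas default `std()` (`ddof=1`) gives a value slightly above one. Median and mode imputation are just one reasonable choice for B-4; the question does not prescribe a strategy.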
