Question: Consider the Census Income data, with Income (>50k or <=50k) as the response variable.

  1. Import the CSV dataset from https://www.kaggle.com/uciml/adult-census-income
  2. Identify any missing values; fill them with the mean for numerical attributes and the mode for categorical attributes.
  3. Visualize the dataset.
  4. Extract X as all columns except the Income column and Y as Income column.
  5. Split the data into training set and testing set.
  6. Build and train classifiers using GaussianNB and BernoulliNB.
  7. Compute the accuracy and confusion matrix for each classifier.
  8. Plot the decision boundary and visualize the training and test results.
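
Steps 2 and 4 to 7 above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference solution. It runs on a small synthetic frame so that it is self-contained. For the real data you would replace the synthetic frame with pd.read_csv("adult.csv"), where the file name is an assumption. The column names (age, workclass, hours.per.week, income), the use of "?" as the missing-value marker, and the labels "<=50K" and ">50K" are also assumptions about the Kaggle file and should be checked against the downloaded CSV. The helper name impute is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB, GaussianNB


def impute(df):
    """Fill numeric columns with the mean and categorical columns with the mode."""
    df = df.replace("?", np.nan)  # assumption: "?" marks a missing value
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].mean())
        else:
            df[col] = df[col].fillna(df[col].mode()[0])
    return df


# Synthetic stand-in for the census data (assumed column names and labels).
rng = np.random.default_rng(0)
n = 200
age = rng.integers(18, 70, n).astype(float)
hours = rng.integers(20, 60, n).astype(float)
workclass = rng.choice(["Private", "Self-emp", "?"], n, p=[0.7, 0.2, 0.1])
income = np.where(age + hours + rng.normal(0, 10, n) > 85, ">50K", "<=50K")
age[::25] = np.nan  # inject some missing numeric values
df = pd.DataFrame(
    {"age": age, "workclass": workclass, "hours.per.week": hours, "income": income}
)

# Step 2: impute. Step 4: separate features and response.
clean = impute(df)
X = pd.get_dummies(clean.drop(columns="income"), dtype=float)  # one-hot encode categoricals
y = clean["income"]

# Step 5: hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Steps 6 and 7: fit both classifiers, then score them on the test set.
labels = ["<=50K", ">50K"]
results = {}
for model in (GaussianNB(), BernoulliNB()):
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[type(model).__name__] = (
        accuracy_score(y_test, pred),
        confusion_matrix(y_test, pred, labels=labels),
    )

for name, (acc, cm) in results.items():
    print(name, round(acc, 3))
    print(cm)
```

Note that BernoulliNB binarizes every feature at 0.0 by default, so strictly positive numeric columns such as age all become 1 and carry no information unless they are binned or a different binarize threshold is chosen. For step 8, a decision boundary can only be drawn in two dimensions, so one option is to refit each classifier on two selected numeric features before plotting.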

Instructions

  1. Follow the instructions in each question carefully.
  2. Python code from Jupyter notebook along with output for each cell is expected.
  3. Assignments submitted using other Python IDEs will not be considered for grading.
  4. Use appropriate labels for all visualizations.
  5. Upload the output.csv file along with the notebook when required.
