Exercise 2: Before diving deeper into the data, we should stop and create a training and a test set.
Since we are trying to predict whether an individual earns over $50K, save the income column as a Series named income_label.
Drop the income column from the census_income DataFrame and save the remaining columns as a DataFrame named income_features.
Use Scikit-learn's train_test_split function with the income_features and income_label variables to split the data into a training set and a test set. Allocate 80% of the instances to training and 20% to testing, and set random_state to 42 so that the results are reproducible. Name the outputs X_train, X_test, y_train, and y_test (X_train and X_test are DataFrames; y_train and y_test are Series).
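
The steps in the exercise can be sketched as follows. This is a minimal sketch, not the official solution: the small census_income DataFrame built here is a hypothetical stand-in (its age and hours_per_week columns and its income values are assumptions for illustration), since the real data is loaded earlier in the exercise set. Only the income column name, the variable names, the 80/20 split, and random_state=42 come from the exercise text.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for census_income (assumed columns and values);
# in the actual exercise this DataFrame already exists.
census_income = pd.DataFrame({
    "age": [39, 50, 38, 53, 28, 37, 49, 52, 31, 42],
    "hours_per_week": [40, 13, 40, 40, 40, 40, 16, 45, 50, 40],
    "income": ["<=50K", "<=50K", "<=50K", "<=50K", "<=50K",
               "<=50K", "<=50K", ">50K", ">50K", ">50K"],
})

# Save the target column as a Series.
income_label = census_income["income"]

# Drop the target column; the remaining columns are the features.
# drop() returns a new DataFrame and leaves census_income unchanged.
income_features = census_income.drop(columns="income")

# Hold out 20% of the rows for testing; random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    income_features, income_label, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```

Passing test_size=0.2 leaves the remaining 80% of the rows for training, and the row index of each y output matches the index of the corresponding X output, so features and labels stay aligned after the shuffle.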
