Question: 4. With the given dataset in a csv file, named titanic.csv: The data file contains data for 892 of the real Titanic passengers. Each row

4. With the given dataset in a csv file, named titanic.csv: The data file contains data for 892 of the real Titanic passengers. Each row represents one person. The columns describe different attributes about the person including whether they survived (S), their age (A), their passenger-class (C), their sex (G) and the fare they paid (X). Please finish the following tasks: 1. Load the data, covert your data to pandas DataFrame. 2. Clean your data (Please only consider the missing values). 3. Encode your data (hint: male -> 0, female -> 1) 4. Split the data into training and testing sets. 5. Use Logistic() to train your model. 6. Use your trained model to classify for test dataset. 7. Evaluate the performance of the model. 8. Print the equation of your regression 9. Interpret your evaluation.

5. Continue with the Question 4 (copy and create a new file), finish the following task: 1. Chose DecisionTree() to train your model, and then repeat the question 4 step 6-9. (Step 8: print the textual tree) 2. Extra credit: Plot the tree. 3. Compare the performance of the result vs. the performance of Question 4. Interpret it. 4. Extra credit: Prune the tree, find the tree that balance between complexity to the performance.

6. Continue with the Question 5 (copy and create a new file), finish the following task: 1. Choose RandomForest() to train your model, and then repeat the question 4 step 6-9. (Skip step 8) 2. Compare the performance of the result vs. the performance of Question 5. Interpret it.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!