Question: Check image 1 for question and image 2 for dataset structure: write Python code. (a) Using the NumPy or Pandas package, load the dataset.


(a) Using the NumPy or Pandas package, load the dataset. (The dataset "breast_cancer_wisconsin.csv" is uploaded for this assignment.) Then split the dataset into train and test sets with a test ratio of 0.3.
(b) Using the scikit-learn package, define a DT classifier with custom hyperparameters and fit it to your train set. Measure the precision, recall, F-score, and accuracy on both train and test sets. Also, plot the confusion matrices of the model on train and test sets.
(c) Study how the following cost functions and maximum tree depths can influence the efficiency of the Decision Tree on the delivered dataset. Describe your findings.
    i. three different cost functions: ['gini', 'entropy', 'log_loss']
    ii. six different maximum tree depths: [2, 4, 6, 8, 10, 12]
(d) Depict a plot of the decision boundary for the two mentioned hyperparameters. Comment on the fundamental features in short.
