Question: 4.5 LAB: Using the DecisionTreeClassifier() on the iris data
Write a program that splits a dataset into training and test sets, builds a classification tree, and outputs a confusion matrix. The program should do the following:
- load the iris.csv dataset
- create a dataframe, x, using the petal_length and sepal_length as features
- create a dataframe, y, using species
- split the data into training and test sets with 0.25 test size and random_state = 0
- standardize x_train and x_test
- initialize the decision tree with criterion = "gini", random_state = 100, max_depth=3, min_samples_leaf=5
- fit the decision tree on the training data, then use it to make predictions on x_test
- generate the confusion matrix
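The criterion = "gini" setting in the list above refers to Gini impurity, which for a node with class proportions p_k is 1 - sum(p_k^2). The tree prefers splits that lower this value. A minimal pure-Python sketch of the measure follows; the function name gini_impurity is made up for illustration and is not part of scikit-learn.

```python
from collections import Counter


def gini_impurity(labels):
    """Return the Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# A node holding a single class is perfectly pure.
pure = gini_impurity(["setosa"] * 5)

# A node with the three species in equal shares is as impure as three classes allow.
mixed = gini_impurity(["setosa", "versicolor", "virginica"])

print(pure)   # 0.0
print(mixed)  # about 0.667, i.e. 1 - 3 * (1/3)**2
```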
The output should be:
[[14  0  0]
 [ 0 13  1]
 [ 0  1  9]]
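To read the expected matrix: scikit-learn's confusion_matrix puts true classes in rows and predicted classes in columns, so the diagonal counts correct predictions. The short sketch below derives accuracy and per-class recall from the numbers above. It assumes the class order is setosa, versicolor, virginica, which is the sorted label order.

```python
# Expected confusion matrix from the lab statement:
# rows are true classes, columns are predicted classes.
conf = [
    [14, 0, 0],
    [0, 13, 1],
    [0, 1, 9],
]

total = sum(sum(row) for row in conf)                # 38 test samples
correct = sum(conf[i][i] for i in range(len(conf)))  # 36 on the diagonal
accuracy = correct / total

# Recall for class i = correct predictions for i / all true samples of i.
recall = [conf[i][i] / sum(conf[i]) for i in range(len(conf))]

print(f"accuracy = {accuracy:.3f}")                          # accuracy = 0.947
print("per-class recall =", [round(r, 3) for r in recall])   # [1.0, 0.929, 0.9]
```

The total of 38 matches a 0.25 test split of the 150 iris samples.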
Coding Structure Hint
- # loads the necessary libraries
- import pandas as pd
- from sklearn.model_selection import train_test_split
- from sklearn.preprocessing import StandardScaler
- from sklearn.tree import DecisionTreeClassifier
- from sklearn import datasets
- from sklearn import metrics
- # load the iris dataset
- iris = datasets.load_iris()
- x = # subset the data containing petal length and sepal length
- y = # subset the data containing the labels
- x_train, x_test, y_train, y_test = # splits the data into training and test sets for both x and y, with random_state = 0
- # standardize x_train and x_test
- cart = # initialize and run the decision tree using the following:
- # criterion = "gini", random_state = 100, max_depth=3, min_samples_leaf=5
- # fit the x_train and y_train data
- y_pred = # use the cart model to make predictions using x_test
- conf = # give the confusion matrix using y_test and y_pred
- print(conf)
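One way the skeleton above could be filled in is sketched below. It is not an official solution. It assumes the data come from datasets.load_iris() as in the hint (not from iris.csv), that the column names are assigned by hand, and that the scaler is fitted on the training set only. Whether it reproduces the expected matrix exactly may depend on the scikit-learn version and on how the grader standardizes x_test.

```python
import pandas as pd
from sklearn import datasets, metrics
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# load_iris() returns columns in this order:
# sepal length, sepal width, petal length, petal width.
# The column names below are assigned here to match the lab's wording.
iris = datasets.load_iris()
df = pd.DataFrame(
    iris.data,
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
)

x = df[["petal_length", "sepal_length"]]     # feature dataframe
y = pd.DataFrame({"species": iris.target})   # label dataframe (0, 1, 2)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

# Assumption: learn the scaling from the training data only,
# then apply the same transformation to the test data.
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

cart = DecisionTreeClassifier(
    criterion="gini", random_state=100, max_depth=3, min_samples_leaf=5
)
cart.fit(x_train, y_train.values.ravel())

y_pred = cart.predict(x_test)
conf = metrics.confusion_matrix(y_test.values.ravel(), y_pred)
print(conf)
```

Fitting the scaler on x_train alone keeps test-set statistics out of training. Calling fit_transform separately on x_test is another reading of "standardize x_train and x_test" and can give slightly different predictions.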
