4 5 LAB Using the DecisionTreeClassifier() on the iris data Write a program that splits a dataset into training and test set, builds a classification tree, and outputs a confusion matrix The program should do the following load the iris csv dataset create a dataframe, x, using the petal length and sepal length as features create a dataframe, y, using species split the data into training and test sets with 0 25 test size and random state 0 standardize x train and x test initialize the decision tree with criterion gini , random state 100, max depth 3, min samples leaf 5 run the decision tree on x test generate the confusion matrix The output should be 14 0 0 0 13 1 0 1 9 Coding Structure Hint loads the necessary libraries import pandas as pd from sklearn model selection import train test split from sklearn preprocessing import StandardScaler from sklearn tree import DecisionTreeClassifier from sklearn import datasets from sklearn import metrics load the iris dataset iris datasets load iris() x subset the data containing petal length and sepal length y subset the data containing the labels x train, x test, y train, y test splits the data into training and test sets for both x and y, with random state 0 standardize x train and x test cart initialize and run the decision tree using the following criterion gini , random state 100, max depth 3, min samples leaf 5 fit the x train and y train data y pred use the cart model to make predictions using x test conf give the confusion matrix using y test and y pred print(conf)

The Answer is in the image, click to view ...

Question: 4.5 LAB: Using the DecisionTreeClassifier() on the iris data Write a program that splits a dataset into training and test set, builds a classification tree,

4.5 LAB: Using the DecisionTreeClassifier() on the iris data

Write a program that splits a dataset into training and test set, builds a classification tree, and outputs a confusion matrix. The program should do the following:

load the iris.csv dataset
create a dataframe, x, using the petal_length and sepal_length as features
create a dataframe, y, using species
split the data into training and test sets with 0.25 test size and random_state = 0
standardize x_train and x_test
initialize the decision tree with criterion = "gini", random_state = 100, max_depth=3, min_samples_leaf=5
run the decision tree on x_test
generate the confusion matrix

The output should be:

[[14 0 0]

[ 0 13 1]

[ 0 1 9]]

Coding Structure Hint

# loads the necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn import datasets
from sklearn import metrics
# load the iris dataset
iris = datasets.load_iris()
x = # subset the data containing petal length and sepal length
y = # subset the data containing the labels
x_train, x_test, y_train, y_test = # splits the data into training and test sets for both x and y, with random_state = 0
# standardize x_train and x_test
cart = # initialize and run the decision tree using the following:
# criterion = "gini", random_state = 100, max_depth=3, min_samples_leaf=5
# fit the x_train and y_train data
y_pred = # use the cart model to make predictions using x_test
conf = # give the confusion matrix using y_test and y_pred
print(conf)

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

4.5 LAB: Using the Decision TreeClassifier() on the iris data Write a program that splits a dataset into training and test set, builds a classification tree, and outputs a confusion matrix. The...

THE OUTPUT MUST BE EXACTLY THE SAME! Write a program that splits a dataset into training and test set, builds a classification tree, and outputs a confusion matrix. The program should do the...

Write a program that splits a dataset into training and test set, builds a classification tree, and outputs a confusion matrix. The program should do the following: load the iris.csv dataset create a...

R Project 0 2 : Data Frame / Data Preparation / Data Mining Please download the R file R Project 0 2 . R . Please download the data files: Pokeman Data.csv , Pokemon Info.csv , and Diabetes Data.csv...

use the code r Script below to Answer the questions from number 3 to 7 Questions : 3. Model #1 - First Logistic Regression Model Reporting Results Report the results of the regression model. Address...

PA5 Decision Trees (100 pts) Overview and Requirements----------------------- For this programming assignment, we are going to investigate the accuracy of our ID3 decision tree implementation...

Let's break it down step by step.Classification 1 . Conceptual Questions Difference between supervised and unsupervised learning: - Supervised learning involves training a model on a labeled dataset,...

Case 2 - LAB - Smart Party Ware BUAD425 -2 units DATA ANALYSIS FOR DECISION MAKING Information & Operations Management Marshall School of Business University of Southern California Scenario Applichem...

MATHEMATICS FOR MACHINE LEARNING Marc Peter Deisenroth A. Aldo Faisal Cheng Soon Ong Contents Foreword 1 Part I Mathematical Foundations 9 1 Introduction and Motivation 11 1.1 Finding Words for...

You can use any software to plot and/or to calculate values/data, but if you do, provide (copy/paste) here the code. Data sets relevant for this HW can be found at the UCI Machine Learning...

For the beam and loading shown, (a) Draw the shear and bending moment diagrams, (b) Determine the maximum absolute values of the shear and bending moment. 6 kips 12 kips 4.5 kips of B. 2 ft 2 ft 2 ft...

Use the accompanying phase diagram for carbon to answer the following questions. a. How many triple points are in the phase diagram? b. What phases can coexist at each triple point? c. What happens...

An advantage of using standard costs is reduction of clerical costs. usefulness in setting prices for raw materials. useful for finding fault with performance. helpful to make employees utilize every...

5. Create an O(n) algorithm (comparisons) for the following problem. Given a list of 2n+1(n=1,2,3.)distinctpositiveintegers,partitionthelistinto two sublists, each of size n, such that the difference...

1. Are these goals consistent with your story/vision from the Practice segment in Developing and Communicating a Vision? \Why or why not? If not, what changes might you make to create better...

I What is it about you that makes you think you would be a good fit for us and that we would be a good fit for youi>

List at least one goal (but do not restrict yourself to one) that you have in each of the following four categories. You will refer to these goals throughout the discussions and activities in this...