Question: Decision Tree on the pima-indians-diabetes dataset

This activity aims to use the Decision Tree algorithm for the pima-indians-diabetes dataset.

Instructions:

Create a Decision Tree, train it to fit 75% of the data, and test your model with the remaining 25%.
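A minimal sketch of the 75/25 split, assuming scikit-learn is the toolkit (the question does not name one). The data here is a synthetic stand-in with the same shape as the Pima data (768 rows, 8 features), not the real dataset; replace `X` and `y` with the loaded features and outcome column.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in (assumption): 768 rows and 8 features, like the Pima data.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)

# Hold out 25% for testing; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)
```

Fixing `random_state` makes the split reproducible, which helps when comparing hyperparameters later.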

The following tasks can be performed:

Import necessary tools

Data exploration and preparation

Creating and training the model

The hyperparameter splitting criteria for categorical outputs are gini and entropy. Entropy corresponds to the ID3 algorithm. Gini is the one used in the CART algorithm. Nevertheless, it is interesting to experiment with both splitting criteria.
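To make the two criteria concrete, here is a small sketch that computes both impurity measures for the labels in a single node. The helper names `gini` and `entropy` are illustrative and not part of any library.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1 / p) over classes."""
    n = len(labels)
    return sum((c / n) * log2(n / c) for c in Counter(labels).values())

balanced = [0, 0, 1, 1]   # evenly mixed node: both measures are at their maximum
pure = [1, 1, 1, 1]       # single-class node: both measures are zero

print(gini(balanced), entropy(balanced))
print(gini(pure), entropy(pure))
```

If scikit-learn is used, the criterion is selected with `DecisionTreeClassifier(criterion="gini")` or `DecisionTreeClassifier(criterion="entropy")`.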

The hyperparameter max_depth can be varied to control overfitting. It sets the maximum depth the tree is allowed to reach.
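One way to see the effect of max_depth is to fit several trees and compare training and test accuracy. This is a sketch assuming scikit-learn and a synthetic stand-in for the data, so the printed scores say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption), not the real Pima data.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

results = {}
for depth in (2, 4, 8, None):  # None places no limit on the depth
    clf = DecisionTreeClassifier(criterion="gini", max_depth=depth, random_state=0)
    clf.fit(X_train, y_train)
    # Store actual depth, training accuracy, and test accuracy.
    results[depth] = (
        clf.get_depth(),
        clf.score(X_train, y_train),
        clf.score(X_test, y_test),
    )
    print(depth, results[depth])
```

A large gap between training and test accuracy at high depths is the usual sign of overfitting.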

The hyperparameter min_samples_split can also be varied; it is the minimum number of samples a node must contain before it can be split.
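The following sketch, again assuming scikit-learn and synthetic stand-in data, fits trees with different min_samples_split values and inspects the fitted tree to confirm that no split node holds fewer samples than the threshold.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption), not the real Pima data.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)

smallest_split_node = {}
for mss in (2, 20, 100):
    clf = DecisionTreeClassifier(min_samples_split=mss, random_state=0).fit(X, y)
    tree = clf.tree_
    internal = tree.children_left != -1  # True for nodes that were split (non-leaves)
    # Size of the smallest node that was still allowed to split.
    smallest_split_node[mss] = int(tree.n_node_samples[internal].min())
    print(mss, tree.node_count, smallest_split_node[mss])
```

Larger values stop the tree from splitting small nodes, which typically gives a smaller tree.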

Tree visualization (you can visualize the tree for different hyperparameters)
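A text rendering is a quick way to inspect a fitted tree without a plotting backend. This sketch assumes scikit-learn; the data is a synthetic stand-in and the feature names are placeholders, to be replaced with the real column names.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in (assumption), not the real Pima data.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
clf.fit(X, y)

# One line per node: the split rule for internal nodes, the class for leaves.
report = export_text(clf, feature_names=feature_names)
print(report)
```

For a graphical version, `sklearn.tree.plot_tree` draws the same tree with matplotlib. Refitting with other values of `criterion`, `max_depth`, or `min_samples_split` shows how the structure changes.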

Testing and measuring performance
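The testing step could look like the sketch below, assuming scikit-learn and a synthetic stand-in for the data. Accuracy and the confusion matrix are one reasonable choice of metrics; the question does not prescribe any.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption), not the real Pima data.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)  # predictions on the held-out 25%

acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class
print("accuracy:", acc)
print(cm)
```

The diagonal of the confusion matrix counts correct predictions, so accuracy equals the diagonal sum divided by the number of test samples.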
