Question: A-3. [10 marks: 2.5 each]: a. Split the dataset into training and testing sets using

A-3. [10 marks: 2.5 each]:

a. Split the dataset into training and testing sets using the train_test_split function, with 75% for training and 25% for testing, using random state = 10.

b. Build a decision tree classifier for predicting the class label. Fit the classifier using the training dataset. Set random state to 100, criterion to entropy, and splitter to best.

c. Draw the decision tree using scikit-learn (sklearn).

d. Test the classifier on the testing dataset, and print the confusion matrix and classification metrics (accuracy, sensitivity (recall), precision) of the decision tree classifier.
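A minimal sketch of the four A-3 steps follows. It assumes a synthetic dataset from make_classification as a stand-in for the preprocessed HW4_DataA table, which is not available here, and it averages recall and precision over classes with average="macro" (an assumption, since the question does not say how to aggregate multi-class metrics).

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: synthetic stand-in for the preprocessed HW4_DataA features/labels.
X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

# a. 75% training / 25% testing with random_state=10.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, test_size=0.25, random_state=10)

# b. Decision tree with the settings named in the question.
tree = DecisionTreeClassifier(criterion="entropy", splitter="best",
                              random_state=100)
tree.fit(X_train, y_train)

# c. Text rendering of the tree; sklearn.tree.plot_tree(tree) gives a
#    graphical version when matplotlib is available.
print(export_text(tree))

# d. Confusion matrix and metrics on the held-out test set.
y_pred = tree.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
rec = recall_score(y_test, y_pred, average="macro", zero_division=0)
prec = precision_score(y_test, y_pred, average="macro", zero_division=0)
print(cm)
print(f"accuracy={acc:.3f} recall={rec:.3f} precision={prec:.3f}")
```

With the real data, X and y would be the encoded feature columns and the encoded class column from A-2.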

ISE-291: Homework 04

A-4. [10 marks: 2.5 each]: Using the same dataset split in A-3:

a. Build a random forest classifier for predicting the class label with 4 trees. Fit the classifier using the training set. Set criterion to entropy and random_state to 62.

b. Draw the trees using scikit-learn (sklearn).

c. Test the classifier on the testing dataset, and print the confusion matrix and classification metrics (accuracy, sensitivity (recall), precision) of the random forest classifier.

d. Repeat A-4 (a-c) using a random forest with 8 trees instead of 4.
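The random forest steps could be sketched as below, again on a synthetic stand-in for HW4_DataA (an assumption). The helper fit_and_score is a hypothetical name introduced only for this illustration; it is called once with 4 trees and once with 8.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.tree import export_text

# Assumption: synthetic stand-in for the preprocessed HW4_DataA table.
X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=10)  # same split as A-3


def fit_and_score(n_trees):
    """Fit a forest with n_trees trees and return it with its test metrics."""
    forest = RandomForestClassifier(n_estimators=n_trees,
                                    criterion="entropy", random_state=62)
    forest.fit(X_train, y_train)
    y_pred = forest.predict(X_test)
    scores = {
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "accuracy": accuracy_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred, average="macro",
                               zero_division=0),
        "precision": precision_score(y_test, y_pred, average="macro",
                                     zero_division=0),
    }
    return forest, scores


forest4, scores4 = fit_and_score(4)   # parts a-c
forest8, scores8 = fit_and_score(8)   # part d

# Each fitted tree is stored in forest.estimators_; print them as text.
# sklearn.tree.plot_tree(est) draws one graphically if matplotlib is installed.
for i, est in enumerate(forest4.estimators_):
    print(f"Tree {i}")
    print(export_text(est))

print(scores4["confusion_matrix"], scores4["accuracy"])
print(scores8["confusion_matrix"], scores8["accuracy"])
```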

A-5. [10 marks]: Calculate the Information Gain for the class variable "Drug" given the feature selected " " as a root node.
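Information gain for a candidate root feature is the entropy of the class labels minus the weighted entropy of the labels within each feature value. The sketch below implements that definition in plain Python; the feature values and drug labels in the example are made-up toy values, not taken from HW4_DataA.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of labels minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data (hypothetical): a feature that separates the classes perfectly.
feature = ["HIGH", "HIGH", "LOW", "LOW"]
drug = ["drugA", "drugA", "drugB", "drugB"]
ig_perfect = information_gain(feature, drug)

# Toy data (hypothetical): a feature that carries no information about the class.
ig_useless = information_gain(["a", "b", "a", "b"], ["x", "x", "y", "y"])
print(ig_perfect, ig_useless)
```

For the real task, the two lists would be the chosen root feature column and the "Drug" column.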

A-6. [10 marks]: From the decision tree built in A-3, write three classification rules using the normalized values first, then convert them back to the original values.
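Converting a rule threshold back to original units inverts the min-max formula: x = x_norm * (max - min) + min. A small sketch follows; the column range 15 to 74 and the threshold 0.5 are hypothetical values chosen only to illustrate the arithmetic.

```python
def normalize(x, col_min, col_max):
    """Min-max normalization to the range [0, 1]."""
    return (x - col_min) / (col_max - col_min)


def denormalize(x_norm, col_min, col_max):
    """Inverse of min-max normalization."""
    return x_norm * (col_max - col_min) + col_min


# Hypothetical numeric column with observed minimum 15 and maximum 74.
col_min, col_max = 15, 74

# A rule of the form "IF feature_norm <= 0.5 THEN class = ..." becomes
# "IF feature <= threshold THEN class = ..." in original units.
threshold = denormalize(0.5, col_min, col_max)
print(threshold)
```

The minimum and maximum must be the ones used when the column was normalized in A-2.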

A-7. [10 marks]: Write an association rule for " -> Cholestrol". Which rule has the highest accuracy? Write the corresponding support and accuracy.
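Support and accuracy for a rule "antecedent -> consequent" can be counted directly from the records. The sketch below assumes that "accuracy" means rule confidence (records matching both sides divided by records matching the antecedent). The antecedent attribute is missing from the question text, so Sex is used purely as a placeholder, and the attribute values are made up.

```python
def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    records: list of dicts mapping attribute name to value.
    antecedent, consequent: dicts of attribute -> required value.
    """
    def matches(record, condition):
        return all(record.get(k) == v for k, v in condition.items())

    n = len(records)
    n_ante = sum(matches(r, antecedent) for r in records)
    n_both = sum(matches(r, antecedent) and matches(r, consequent)
                 for r in records)
    support = n_both / n
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Toy records (hypothetical values).
records = [
    {"Sex": "F", "Cholestrol": "HIGH"},
    {"Sex": "F", "Cholestrol": "HIGH"},
    {"Sex": "F", "Cholestrol": "NORMAL"},
    {"Sex": "M", "Cholestrol": "NORMAL"},
]

support, confidence = rule_stats(records, {"Sex": "F"}, {"Cholestrol": "HIGH"})
print(support, confidence)
```

To find the best rule, the same function can be evaluated for every antecedent value and consequent value pair, keeping the pair with the highest confidence.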

A-8. [10 marks]: Repeat parts b, c, and d in A-3 using the Naïve Bayes GaussianNB classifier.
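A sketch of the fit-and-evaluate steps with GaussianNB is shown below, using the same 75/25 split with random_state=10. The data is a synthetic stand-in for HW4_DataA (an assumption), and macro averaging of recall and precision is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumption: synthetic stand-in for the preprocessed HW4_DataA table.
X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=10)  # same split as A-3

# Fit Gaussian Naive Bayes on the training set.
nb = GaussianNB()
nb.fit(X_train, y_train)

# Evaluate on the testing set.
y_pred = nb.predict(X_test)
nb_cm = confusion_matrix(y_test, y_pred)
nb_acc = accuracy_score(y_test, y_pred)
nb_rec = recall_score(y_test, y_pred, average="macro", zero_division=0)
nb_prec = precision_score(y_test, y_pred, average="macro", zero_division=0)
print(nb_cm)
print(f"accuracy={nb_acc:.3f} recall={nb_rec:.3f} precision={nb_prec:.3f}")
```

Printing this confusion matrix next to those of the decision tree and the random forests gives a like-for-like comparison, because all models are scored on the same test rows.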

A-9. Compare the performance of the Naïve Bayes against the built decision tree and random forest classifiers using the confusion matrix. Based on the comparison, which one is the best to use with the given dataset?

Problem A [100 Marks]: Solve all the questions using Python. Use Pandas, Seaborn, Sklearn, etc., libraries for all the analysis. Consider the data given in the Excel file HW4_DataA. Consider the following data description:

Table 1. Data description

Do the following tasks (in exact sequence) using the "HW4_DataA" data:

A-1. [5 marks]: Read and display the data given in HW4_DataA. Describe both the numeric and categorical attributes. Refer to Table 1 for the data description.

A-2. [10 marks: 2.5 each]: Do the necessary pre-processing. Specifically, do the following:

a. Normalize the numeric attributes using the min-max normalization scheme.

b. Perform ordinal (label) encoding for the ordinal attributes ( , and Cholestrol). Use a dictionary for the ordinal encoding.

c. Perform one-hot encoding for the categorical attribute (Sex).

d. Perform label encoding for the class (drug).
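The four pre-processing steps could look like the following pandas sketch. The toy DataFrame is invented for illustration: only the column names Sex, Cholestrol, and Drug come from the question, while the Age column, all cell values, and the ordering NORMAL < HIGH are assumptions.

```python
import pandas as pd

# Hypothetical toy frame standing in for HW4_DataA.
df = pd.DataFrame({
    "Age": [23, 47, 61, 35],  # hypothetical numeric attribute
    "Sex": ["F", "M", "M", "F"],
    "Cholestrol": ["HIGH", "NORMAL", "HIGH", "NORMAL"],
    "Drug": ["drugA", "drugB", "drugA", "drugC"],
})

# a. Min-max normalization: (x - min) / (max - min) per numeric column.
num_cols = ["Age"]
col_min = df[num_cols].min()
col_max = df[num_cols].max()
df[num_cols] = (df[num_cols] - col_min) / (col_max - col_min)

# b. Ordinal encoding with an explicit dictionary (assumed order).
chol_order = {"NORMAL": 0, "HIGH": 1}
df["Cholestrol"] = df["Cholestrol"].map(chol_order)

# c. One-hot encoding of the nominal attribute Sex.
df = pd.get_dummies(df, columns=["Sex"])

# d. Label encoding of the class: one integer per distinct label.
drug_codes = {label: code
              for code, label in enumerate(sorted(df["Drug"].unique()))}
df["Drug"] = df["Drug"].map(drug_codes)

print(df)
print(drug_codes)
```

Keeping col_min and col_max makes it possible to convert normalized values back to original units later, and keeping drug_codes allows predicted integers to be mapped back to drug names.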
