Question: PLEASE HELP IN PYTHON

Bagging and boosting are considered ensemble methods of machine learning. Ensemble learning is a concept in which multiple models (for example, decision trees) are trained using the same learning algorithm to produce better predictive performance than a single model. Ensembles belong to a larger family of methods, called multi-classifiers, in which a set of hundreds or thousands of learners with a common objective are combined to solve the problem.
For the given code, implement either a bagging or a boosting version to solve the problem of a pharmaceutical drug analysis. Fill out the missing code sections titled "Write Your Code Here". One section has an error stating: "DataFrame.drop() takes from 1 to 2 positional arguments but 3 were given". A data file is provided for this code. Also, try to explain how each section of the code works if you can. Thank you.
import pandas as pd
import numpy as np #needed for np.random.seed and np.array below
import matplotlib.pyplot as plt #for plotting
import seaborn as sns #for plotting
from sklearn.ensemble import RandomForestClassifier #for the model
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_graphviz #plot tree
from sklearn.model_selection import train_test_split #for data splitting
np.random.seed(123) #ensure reproducibility
#Let's load the data. This is clinical data to predict whether a patient is likely to suffer a heart attack (1) or not (0). For each patient (303 in total), 14 variables have been taken into account for the prediction. We will use all these data to train a random forest as a predictor for heart attacks:
df = pd.read_csv('heart2.csv')
print(df.shape)
df.head()
#Let's have a better look at the data (visualization):
X=np.array(df[df.columns[0:14]])
f,a = plt.subplots(4,4)
f.set_figheight(10)
f.set_figwidth(10)
a = a.ravel()
for idx, ax in enumerate(a):
    if idx < 14: #only the first 14 subplots correspond to columns
        ax.hist(list(X[:, idx]), color='r')
        ax.set_title(df.columns[idx])
plt.tight_layout()
#Now, let's train our random forest. Not all available data will be used for training, some will be used for testing (20%):
X_train, X_test, y_train, y_test = train_test_split(df.drop('target', axis=1), df['target'], test_size=.2, random_state=10)
model = RandomForestClassifier(max_depth=5)
model.fit(X_train, y_train)
#The original call df.drop('target', 1) passed the axis positionally, which raised: DataFrame.drop() takes from 1 to 2 positional arguments but 3 were given. Recent pandas versions require the keyword form df.drop('target', axis=1) (or, equivalently, df.drop(columns='target')).
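To illustrate the fix on a toy DataFrame (a minimal sketch, independent of the heart2.csv data):

```python
import pandas as pd

toy = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'target': [0, 1]})

# toy.drop('target', 1) raises TypeError in pandas >= 2.0,
# because the axis argument must now be passed by keyword.
features = toy.drop('target', axis=1)  # keyword form works
same = toy.drop(columns='target')      # equivalent alternative

print(list(features.columns))  # -> ['a', 'b']
```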
#Now, let's predict the class (likely to have an attack (1) or not (0)) for our testing data of patients, using our trained random forest:
y_predict = model.predict(X_test)
#These are the predicted labels by the random forest:
y_predict
#...and these are the actual labels of the testing data:
y_test=np.array(list(y_test))
y_test
#Let's see the difference between the predicted and the actual labels. See that, out of 61 predictions, the random forest mistook 12:
sum(abs(y_predict - y_test ))
#Now, let's train five random forests and combine them (the question allows either bagging or boosting; averaging independently trained forests is the bagging approach):
#Write your code here
#Let's see the predictions for each random forest:
#Write your code here
#Now, our final prediction will be an average of the predictions of the five random forests. If more than two random forests say that the label must be 1, then, accordingly, the final prediction will be 1 and 0 otherwise:
#Write your code here
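One possible way to fill the three "Write your code here" sections (a minimal sketch: the names models, preds, and final are my own choices, and since heart2.csv is not included here, I substitute a synthetic dataset from sklearn's make_classification purely as a stand-in for the real train/test split):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

np.random.seed(123)  # ensure reproducibility

# Stand-in data: in the original notebook, X_train/X_test/y_train/y_test
# come from heart2.csv; make_classification is only a placeholder.
X, y = make_classification(n_samples=303, n_features=13, random_state=10)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=10)

# --- Section 1: train five random forests, each on a bootstrap sample ---
models = []
for i in range(5):
    idx = np.random.choice(len(X_train), size=len(X_train), replace=True)
    m = RandomForestClassifier(max_depth=5, random_state=i)
    m.fit(X_train[idx], y_train[idx])
    models.append(m)

# --- Section 2: predictions of each random forest on the test set ---
preds = np.array([m.predict(X_test) for m in models])  # shape (5, n_test)

# --- Section 3: majority vote; label 1 if more than two of the five forests say 1 ---
final = (preds.sum(axis=0) > 2).astype(int)
print(sum(abs(final - np.array(y_test))))  # number of mistakes of the combined model
```

Each forest sees a slightly different bootstrap resample of the training data, so their errors are partly independent; the majority vote then cancels some of those errors out, which is why the combined prediction can beat the single model.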
#See that our final (averaged) predictions make only 3 mistakes; that is, the ensemble does indeed improve on the predictions of our initial single model (12 mistakes):
sum(abs(y_predict - np.array(final)))