

Question: #Data Visualization import seaborn as sns import matplotlib.pyplot as plt sns.barplot(x="class", y=data["class"]

#Data Visualization
import seaborn as sns
import matplotlib.pyplot as plt
sns.barplot(x="class", y=data["class"].index, palette='mako', data=mushroom_data)
#The number of poisonous mushrooms is almost twice the number of normal mushrooms, so the classes are imbalanced.
#We will be using Matplotlib pyplot and Seaborn to plot our data.
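The imbalance remark above is easiest to check from raw class counts rather than from a bar plot. The following is a minimal sketch using only the standard library; the label letters `"p"` and `"e"` and the toy counts are illustrative assumptions, not values taken from the dataset.

```python
from collections import Counter

def class_ratio(labels):
    """Return (counts, ratio of the largest class to the smallest class)."""
    counts = Counter(labels)
    largest = max(counts.values())
    smallest = min(counts.values())
    return counts, largest / smallest

# Hypothetical labels: "p" = poisonous, "e" = edible (illustrative only).
toy_labels = ["p"] * 8 + ["e"] * 4
counts, ratio = class_ratio(toy_labels)
print(counts)
print(ratio)
```

On the real DataFrame the same check would be `data["class"].value_counts()`, and `sns.countplot(x="class", data=data)` plots those counts directly.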

# %%
from sklearn import preprocessing
#Label encoding is used to convert categorical features to numerical values.
def label_encode_fit(mushroom_data, columns):
    result = mushroom_data.copy()
    encoders = {}
    for column in columns:
        encoder = preprocessing.LabelEncoder()
        result[column] = encoder.fit_transform(result[column])
        encoders[column] = encoder
    return result, encoders
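To make the label-encoding step concrete, here is a small pure-Python sketch of the mapping that `LabelEncoder` builds: the distinct values are sorted and numbered from 0. The helper names and the sample letters are hypothetical and only illustrate the idea.

```python
def fit_label_encoder(values):
    """Map each distinct value to an integer code, in sorted order."""
    classes = sorted(set(values))
    return {value: code for code, value in enumerate(classes)}

def transform(values, mapping):
    """Replace every value by its integer code."""
    return [mapping[v] for v in values]

def inverse_transform(codes, mapping):
    """Recover the original values from integer codes."""
    reverse = {code: value for value, code in mapping.items()}
    return [reverse[c] for c in codes]

odor = ["n", "a", "f", "n", "a"]  # hypothetical category letters
mapping = fit_label_encoder(odor)
codes = transform(odor, mapping)
print(mapping)  # {'a': 0, 'f': 1, 'n': 2}
print(codes)    # [2, 0, 1, 2, 0]
```

The integer order is alphabetical and carries no meaning of its own. Keeping the fitted encoders in a dictionary, as `label_encode_fit` does, is what allows the original letters to be recovered later with each encoder's `inverse_transform`.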

# %%
data1, encoders1 = label_encode_fit(data, data.columns)
data1.head(10)

# %%
def correlation_map(mushroom_data, method):
    corr = mushroom_data.corr(method)
    ix = corr.sort_values('class', ascending=False).index
    df_sorted_by_correlation = mushroom_data.loc[:, ix]
    corr = df_sorted_by_correlation.corr(method)
    plt.subplots(figsize=(18, 14))
    with sns.axes_style("white"):
        # display a correlation heatmap
        ax = sns.heatmap(corr, annot=True)
    plt.show()
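The heatmap is later drawn with `method="spearman"`. As a rough illustration of what that coefficient measures, the sketch below computes Spearman's rho as the Pearson correlation of average ranks. It is a standard-library approximation of the idea, not the pandas implementation, and the sample vectors are made up.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]          # monotone in x, but not linear
rho = spearman(x, y)
rho_rev = spearman(x, y[::-1])
print(rho, rho_rev)
```

Because only ranks matter, any monotone relationship scores as a perfect correlation, which is why the nonlinear pair above still reaches the extreme values.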

# %%
correlation_map(data1, method="spearman")
#gill-size has the highest correlation with class. It should be included in the model.
#There are some highly correlated variables, such as gill-color & ring-type, gill-color & bruises, bruises & stalk-surface-below-ring, etc. These highly correlated variables should be discarded from the model to obtain more accurate results.
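One possible reading of "discard the highly correlated variables" is to keep a single member of each strongly correlated pair, preferring the one more correlated with the target. The sketch below assumes that reading. The function name, the 0.8 threshold, and every correlation value are hypothetical and are not taken from the heatmap.

```python
def drop_redundant(corr, target, threshold=0.8):
    """Greedily keep features, skipping any that correlates strongly
    with a feature that has already been kept.

    corr maps frozenset({a, b}) to the correlation between columns a and b.
    Features are visited in order of decreasing |correlation with target|.
    """
    features = {name for pair in corr for name in pair if name != target}
    ranked = sorted(
        features,
        key=lambda f: abs(corr[frozenset({f, target})]),
        reverse=True,
    )
    kept = []
    for feature in ranked:
        if all(abs(corr.get(frozenset({feature, k}), 0.0)) < threshold for k in kept):
            kept.append(feature)
    return kept

# Made-up correlation values, for illustration only.
corr = {
    frozenset({"gill-size", "class"}): 0.54,
    frozenset({"gill-color", "class"}): -0.53,
    frozenset({"ring-type", "class"}): -0.41,
    frozenset({"gill-color", "ring-type"}): 0.85,
    frozenset({"gill-size", "gill-color"}): -0.30,
    frozenset({"gill-size", "ring-type"}): -0.20,
}
kept_features = drop_redundant(corr, target="class")
print(kept_features)  # ['gill-size', 'gill-color']
```

With these toy numbers `ring-type` is dropped because it is strongly correlated with `gill-color`, which ranks higher against the target.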

# %%
y = data1[['class']]  # contains only the target variable, "class".
X = data1.iloc[:, 1:]  # contains the independent variables.

# %%
from sklearn.feature_selection import SelectKBest
import numpy as np
def SelectKBestCustomized(mushroom_data, k, score_func, target="class"):
    X = mushroom_data.drop(columns=target)
    y = mushroom_data[target]
    np.random.seed(123)  # for mutual_info regression
    fs = SelectKBest(score_func=score_func, k=k)
    fs.fit(X, y)
    mask = fs.get_support()
    # keep the column names whose mask entry is True
    selected_features = [feature for keep, feature in zip(mask, X.columns) if keep]
    return selected_features
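The last lines of `SelectKBestCustomized` turn a boolean mask into column names. A minimal sketch of that selection logic, written without scikit-learn, may make it clearer. The function name, the feature names `cap-shape` and `veil-type`, and all scores are hypothetical.

```python
def select_k_best(scores, names, k):
    """Return (mask, selected names) for the k highest-scoring features."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    mask = [i in top for i in range(len(scores))]
    # Same idea as zipping get_support() with X.columns: original order is kept.
    selected = [name for keep, name in zip(mask, names) if keep]
    return mask, selected

names = ["odor", "cap-shape", "gill-size", "veil-type"]
scores = [0.62, 0.03, 0.16, 0.0]  # made-up scores, for illustration only
mask, selected = select_k_best(scores, names, k=2)
print(mask)      # [True, False, True, False]
print(selected)  # ['odor', 'gill-size']
```

The selected names come back in column order rather than score order, which matches how the list comprehension over the mask behaves.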

# %%
from sklearn.feature_selection import mutual_info_classif
mutual_info_classif(X, y, random_state=123)
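Mutual information measures how much knowing a feature reduces uncertainty about the class. For two discrete variables it can be computed directly from counts, as in the standard-library sketch below; the function name and the sample sequences are illustrative assumptions. Note that `mutual_info_classif` treats dense input as continuous by default (`discrete_features='auto'`) and then uses a nearest-neighbour estimator, so its numbers need not match this count-based formula unless `discrete_features=True` is passed.

```python
from collections import Counter
from math import log

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written in terms of raw counts
        mi += p_ab * log(p_ab * n * n / (px[a] * py[b]))
    return mi

feature = [0, 0, 1, 1]
mi_same = mutual_information(feature, [0, 0, 1, 1])   # feature determines the class
mi_indep = mutual_information(feature, [0, 1, 0, 1])  # feature says nothing about it
print(mi_same, mi_indep)
```

A feature that fully determines a balanced binary class scores log 2 (about 0.693 nats), while an independent feature scores 0.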

# %%
mutual_info_selection = SelectKBestCustomized(data1, 9, mutual_info_classif)
# %%
mutual_info_selection

# %%
X_new = X[['odor', 'gill-size', 'gill-color',
           'stalk-surface-above-ring', 'stalk-surface-below-ring',
           'stalk-color-above-ring', 'stalk-color-below-ring',
           'ring-type', 'spore-print-color']]

# %%
data_selected_features = data1[['odor', 'gill-size', 'gill-color',
                                'stalk-surface-above-ring', 'stalk-surface-below-ring',
                                'stalk-color-above-ring', 'stalk-color-below-ring',
                                'ring-type', 'spore-print-color', 'class']]

# %%
a = 5  # number of rows
b = 3  # number of columns
c = 1  # initialize plot counter
fig = plt.figure(figsize=(14, 22))
for i in data_selected_features:
    plt.subplot(a, b, c)
    #plt.title('{}, subplot: {}{}{}'.format(i, a, b, c))
    plt.xlabel(i)
    sns.barplot(x=i, y=data_selected_features[i].index, palette='Set3_r', hue="class", data=data_selected_features)
    c = c + 1
plt.show()
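Iterating over a DataFrame yields its column labels, so the loop above draws one panel for each of the 10 columns (9 selected features plus `class`) and leaves five of the 15 grid slots empty. A small sketch of how the grid size and the position of each counter value could be derived; both helper names are hypothetical.

```python
from math import ceil

def grid_shape(n_panels, n_cols):
    """Smallest (rows, cols) grid that holds n_panels panels."""
    return ceil(n_panels / n_cols), n_cols

def panel_positions(n_panels, n_cols):
    """Map a 1-based subplot counter to its 0-based (row, col), row by row."""
    return {c: ((c - 1) // n_cols, (c - 1) % n_cols) for c in range(1, n_panels + 1)}

rows, cols = grid_shape(10, 3)  # 9 selected features + "class" = 10 columns
positions = panel_positions(10, cols)
print(rows, cols)
print(positions[10])
```

Under these assumptions four rows are enough, so `a = 4` would remove the blank bottom row from the figure.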

THE PYTHON CODE GIVEN ABOVE IS RELATED TO RANDOM FOREST CLASSIFICATION IN THE DATA SCIENCE COURSE.

PLEASE INTERPRET THIS CODE AND PREPARE A REPORT ACCORDING TO THE SUBJECTS AND CODES.
