Question: Hyperparameter Tuning section of the code not working

import pandas as pd

import seaborn as sns

import matplotlib.pyplot as plt

# Import numpy and give it the alias np

import numpy as np

# Load dataset

data = pd.read_csv('/content/drive/MyDrive/Cancer_Data.csv')

# Display basic information

print(data.info())
print(data.describe())

# Visualize class distribution

sns.countplot(x='diagnosis', data=data)
plt.title('Distribution of Malignant and Benign Tumors')
plt.show()

# Handle missing values

# Exclude non-numeric columns from mean calculation
numeric_data = data.select_dtypes(include=np.number)
# Replace infinite values with NaN
numeric_data.replace([np.inf, -np.inf], np.nan, inplace=True)
# Calculate mean without infinite values
# Check if there are any columns with all values as NaN after replacing inf
for col in numeric_data.columns:
    if numeric_data[col].isnull().all():
        # Handle columns with all NaN values - here, we drop the column
        numeric_data.drop(col, axis=1, inplace=True)
        data.drop(col, axis=1, inplace=True)
    else:
        data[col] = numeric_data[col].fillna(numeric_data[col].mean())
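The loop above can also be written without per-column branching. The following is a minimal sketch on a hypothetical toy frame (the column names `radius`, `empty`, and `label` are made up for illustration and are not from Cancer_Data.csv): it replaces infinities with NaN, drops numeric columns that are entirely NaN, and mean-fills what remains.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: one column with an inf, one all-NaN column, one text column
df = pd.DataFrame({
    "radius": [1.0, np.inf, 3.0],
    "empty": [np.nan, np.nan, np.nan],
    "label": ["M", "B", "M"],
})

# Work only on numeric columns; text columns are left alone
numeric_cols = df.select_dtypes(include=np.number).columns
df[numeric_cols] = df[numeric_cols].replace([np.inf, -np.inf], np.nan)

# Drop numeric columns that are entirely NaN, then mean-fill the rest
all_nan = [c for c in numeric_cols if df[c].isnull().all()]
df = df.drop(columns=all_nan)
remaining = [c for c in numeric_cols if c not in all_nan]
df[remaining] = df[remaining].fillna(df[remaining].mean())

print(df)
```

Here the infinite `radius` value becomes NaN and is then filled with the mean of the two finite values, while the `empty` column is removed.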

# Encode categorical variables

data['diagnosis'] = data['diagnosis'].map({'M': 1, 'B': 0})  # M = malignant, B = benign

# Normalize numerical features so they contribute equally to distance calculations in SVC (Random Forest does not depend on feature scale, but scaling does not hurt it).
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
features = data.drop('diagnosis', axis=1)

#Splitting the Data

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(features, data['diagnosis'], test_size=0.2, random_state=42)

#Fit and transform the scaler on the training data only

X_train_scaled = scaler.fit_transform(X_train)
#Transform the test data using the scaler fit on the training data
X_test_scaled = scaler.transform(X_test)

#Hyperparameter Tuning

from sklearn.model_selection import GridSearchCV
# Define parameter grids
param_grid_svc = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
param_grid_rf = {'n_estimators': [50, 100], 'max_depth': [None, 10]}
# Create GridSearchCV objects
grid_svc = GridSearchCV(SVC(), param_grid_svc, cv=5)
grid_rf = GridSearchCV(RandomForestClassifier(), param_grid_rf, cv=5)
# Fit models with grid search
grid_svc.fit(X_train, y_train)
grid_rf.fit(X_train, y_train)
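One likely cause of the failure, assuming the code runs top to bottom as posted: `SVC` and `RandomForestClassifier` are used in the `GridSearchCV` calls before they are imported (the imports only appear later, under Model Evaluation), which would raise a `NameError`. The grid searches are also fit on the unscaled `X_train` rather than the scaled data. The sketch below moves the imports to the top and wraps the scaler and SVC in a `Pipeline` so the scaler is refit inside each cross-validation fold. It uses scikit-learn's bundled breast cancer dataset as a stand-in for Cancer_Data.csv, which is an assumption; the parameter grids are the ones from the question, and the names `svc_pipeline`, `scaler`, and `svc` are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): the bundled Wisconsin breast cancer dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale inside the pipeline so each CV fold fits its own scaler
svc_pipeline = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Pipeline parameters are addressed as <step name>__<parameter>
param_grid_svc = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}
param_grid_rf = {"n_estimators": [50, 100], "max_depth": [None, 10]}

grid_svc = GridSearchCV(svc_pipeline, param_grid_svc, cv=5)
grid_rf = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid_rf, cv=5
)

grid_svc.fit(X_train, y_train)
grid_rf.fit(X_train, y_train)

print("Best SVC parameters:", grid_svc.best_params_)
print("Best Random Forest parameters:", grid_rf.best_params_)
```

If the separate `StandardScaler` step from the question is kept instead of a pipeline, the equivalent change would be to import both classifiers before this section and pass `X_train_scaled` to `fit`.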

#Model Evaluation

#Import the SVC class

from sklearn.svm import SVC
#Create an SVC model
svc_model = SVC()
#Train the model
svc_model.fit(X_train_scaled, y_train)
# Import the RandomForestClassifier class
from sklearn.ensemble import RandomForestClassifier
# Create a RandomForestClassifier model
rf_model = RandomForestClassifier(random_state=42)  # Add a random state for reproducibility
# Train the model
rf_model.fit(X_train_scaled, y_train)

from sklearn.metrics import accuracy_score, classification_report
# Predictions from each model
svc_pred = svc_model.predict(X_test_scaled)  # Predict on the scaled test data
rf_pred = rf_model.predict(X_test_scaled)  # Predict on the scaled test data
# Evaluate models
print("SVC Classification Report:\n", classification_report(y_test, svc_pred))
print("Random Forest Classification Report:\n", classification_report(y_test, rf_pred))
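Note that the evaluation above scores freshly created default models, so the grid search results never reach the reports. A minimal sketch of scoring a tuned estimator instead, again using scikit-learn's bundled breast cancer dataset as a stand-in for Cancer_Data.csv (an assumption) and only the SVC grid to keep it short:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): the bundled Wisconsin breast cancer dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training data only, then reuse it for the test data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

param_grid_svc = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
grid_svc = GridSearchCV(SVC(), param_grid_svc, cv=5)
grid_svc.fit(X_train_scaled, y_train)

# With the default refit=True, best_estimator_ is retrained on the full
# training set using the best parameter combination found
svc_pred = grid_svc.best_estimator_.predict(X_test_scaled)
svc_accuracy = accuracy_score(y_test, svc_pred)

print("Tuned SVC accuracy:", svc_accuracy)
print("Tuned SVC Classification Report:\n", classification_report(y_test, svc_pred))
```

The same pattern would apply to the Random Forest search: predict with `grid_rf.best_estimator_` rather than a separately trained `rf_model`.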
