Question: could someone help me change all the names and function names and make this code very unique since i get high plagrism import pandas

could someone help me change all the names and function names and make this code very unique since i get high plagrism

import pandas as pd

import numpy as np

from sklearn.neighbors import KNeighborsClassifier

from sklearn.tree import DecisionTreeClassifier

from sklearn.ensemble import RandomForestClassifier

from sklearn.svm import SVC

from sklearn.preprocessing import StandardScaler

from sklearn.model_selection import train_test_split, GridSearchCV

from sklearn.metrics import confusion_matrix, recall_score, precision_recall_curve, auc

import matplotlib.pyplot as plt

import seaborn as sns

# Load the dataset

data_path = r'C:\Users\john3\Desktop\cyber security analytics sit 384\10.1HD\creditcard.csv'

data = pd.read_csv(data_path)

# Preprocess the dataset

scaler = StandardScaler()

data['scaled_amount'] = scaler.fit_transform(data['Amount'].values.reshape(-1, 1))

data['scaled_time'] = scaler.fit_transform(data['Time'].values.reshape(-1, 1))

data.drop(['Time', 'Amount'], axis=1, inplace=True)

# Split the dataset into train and test sets

X = data.drop('Class', axis=1)

y = data['Class']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Create the undersampled dataset

fraud_indices = np.array(data[data.Class == 1].index)

normal_indices = np.array(data[data.Class == 0].index)

undersample_size = len(fraud_indices)

random_normal_indices = np.random.choice(normal_indices, undersample_size, replace=False)

random_normal_indices = np.array(random_normal_indices)

undersampled_indices = np.concatenate([fraud_indices, random_normal_indices])

undersampled_data = data.iloc[undersampled_indices, :]

X_undersampled = undersampled_data.drop('Class', axis=1)

y_undersampled = undersampled_data['Class']

# Split the undersampled dataset into train and test sets

X_train_undersample, X_test_undersample, y_train_undersample, y_test_undersample = train_test_split(X_undersampled, y_undersampled, test_size=0.3, random_state=0)

def print_gridsearch_scores(clf, param, x_train, y_train):

grid_clf = GridSearchCV(clf, param, scoring='recall', cv=5)

grid_clf.fit(x_train, y_train)

print(f"Best parameters: {grid_clf.best_params_}")

print(f"Best score: {grid_clf.best_score_}")

return grid_clf.best_params_

def plot_confusion_matrix(cm, title):

sns.heatmap(cm, annot=True, cmap="YlGnBu", fmt='d', linewidths=.5)

plt.xlabel("Predicted")

plt.ylabel("True")

plt.title(title)

def predict_plot_test(clf, x_train, y_train, x_test, y_test):

clf.fit(x_train, y_train)

y_pred = clf.predict(x_test)

cm = confusion_matrix(y_test, y_pred)

plot_confusion_matrix(cm, f"Confusion Matrix for {clf.__class__.__name__}")

def plot_recall_for_threshold(clf, x_train, y_train, x_test, y_test, thresholds):

clf.fit(x_train, y_train)

y_pred_proba = clf.predict_proba(x_test)[:, 1]

recalls = []

for t in thresholds:

y_pred = (y_pred_proba >= t).astype(int)

recalls.append(recall_score(y_test, y_pred))

plt.plot(thresholds, recalls)

plt.xlabel("Threshold")

plt.ylabel("Recall")

plt.title(f"Recall for different thresholds for {clf.__class__.__name__}")

def plot_precision_recall(clf, x_train, y_train, x_test, y_test):

clf.fit(x_train, y_train)

y_pred_proba = clf.predict_proba(x_test)[:, 1]

precision, recall, _ = precision_recall_curve(y_test, y_pred_proba)

pr_auc = auc(recall, precision)

plt.plot(recall, precision)

plt.xlabel("Recall")

plt.ylabel("Precision")

plt.title(f"Precision-Recall curve for {clf.__class__.__name__} (AUC = {pr_auc:0.2f})")

# Parameters for classifiers

knn_params = {'n_neighbors': [1, 2, 3, 4, 5]}

dt_params = {'max_leaf_nodes': [10, 15, 20, 25, 30]}

rf_params = {'n_estimators': [5, 10, 20, 50]}

svc_params = {'gamma': [0.001, 0.01, 0.1, 1, 10], 'C': [0.01, 0.1, 1, 10, 100]}

# Initialize classifiers

knn = KNeighborsClassifier()

dt = DecisionTreeClassifier(random_state=0)

rf = RandomForestClassifier(random_state=0)

svc = SVC(random_state=0, probability=True)

# Perform the tasks for each classifier

classifiers = [

(knn, knn_params),

(dt, dt_params),

(rf, rf_params),

(svc, svc_params)

]

thresholds = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

for clf, params in classifiers:

best_params = print_gridsearch_scores(clf, params, X_train_undersample, y_train_undersample)

clf.set_params(**best_params)

plt.figure()

predict_plot_test(clf, X_train_undersample, y_train_undersample, X_test_undersample, y_test_undersample)

plt.figure()

plot_recall_for_threshold(clf, X_train_undersample, y_train_undersample, X_test_undersample, y_test_undersample, thresholds)

plt.figure()

plot_precision_recall(clf, X_train_undersample, y_train_undersample, X_test_undersample, y_test_undersample)

plt.show()

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!

Wagenlucht Ice Cream Company is always trying to create new flavors of ice cream. They are market testing three kinds to find out which one has the best chance of becoming popular. They give small...

network. Consider the following model for a growing simple We adopt the following notation: N and L indicate respectively the total number of nodes and links of the network, A,, indicates the generic...

What may be two specific effects of advertising on children? What may be the two cognitive problems that young children have with processing advertising messages?

I do not have access to Python or excel/office Covid sick cant access school computer last time getting error please help No matter how much I access and cleanData1 I always get an error on every...

2.1-2.4 (dataset is given below) 2.1) Observe the histogram of the global sales plotted below and try to reproduce it. Specs of the plot: 1. plot title: "Histogram of Global Sales 2. X-label: Global...

here is the dataset https://www.kaggle.com/rush4ratio/video-game-sales-with-ratings 1.c Data Sanity Checks 1.c.1) It is important to check if there are any internal inconsistencies within the...

import warnings warnings.filterwarnings("ignore") import os import pandas as pd import numpy as np import matplotlib.pyplot as plt np.random.seed(0) # Do not change this value: required to be...

#PLease Answer in Python Codeblock import pandas as pd import numpy as np import matplotlib.pyplot as plt import seaborn as sns Use pd.read_csv to read the cereals_data.csv file into a DataFrame...

Question 2 : generate _ descriptors ( ) Write a function called generate _ descriptors ( df ) which takes in the DataFrame you loaded from Question 1 and returns a dataframe of all unique ticket...

Solve this on Python Jupyter Notebook please! The Names.csv looks like this: And the Salaries.csv looks like this: In this HW, we will study the pay gap between men and women who have jobs in San...

Describe the photoelectric effect and four aspects of the experimental results that were puzzling to nineteenth century physicists. How does the photon model of light explain the experimental results...

UC parking services is looking at price discounts for plastic tag holders parking vouchers for guests. If they know that the annual demand is 36,000 and that the order costs for this with the UC logo...

List the five - steps that you can use when mediating a conflict.

Dunne, Inc., a U.S. corporation, earned $500,000 in total taxable income, including $50,000 in foreign-source taxable income from its branch manufacturing operations in Brazil and $20,000 in...

Brandy, a U.S. corporation, operates a manufacturing branch in Chad, which does not have an income tax treaty with the United States. Brandy's worldwide Federal taxable income is $30 million, so it...

What is the value of a preferred stock where the dividend rate is 14 percent on a $100 par value, and the markets required yield on similar shares is 12 percent?

Hart Enterprises, a domestic corporation, owns 100% of OK, Ltd., an Irish corporation. OK's gross income for the year is $10 million. Determine whether any of the following transactions produce...

Manson Industries incurs unit costs of $8 ($5 variable and $3 fixed) in making a sub-assembly part for its finished product. A supplier offers to make 10,000 of the assembly part at $6 per unit. If...

Which of the following is considered a non - frivolous agency cost? Buying first - class airline tickets for personal vacations Heing an external milice to certaby fromedal thatments Hosting lavish...

On June 16, 2016, the worlds biggest Disneyland opened in Shanghai with a great deal of fanfare. It features a supersized 200-foot-tall castle. In comparison, the height for similar castles in...

Given that most entrepreneurial start-ups fail, why do entrepreneurs found so many new firms? Why are (most) governments interested in promoting more start-ups?

Multinational enterprises from emerging economies (EMNEs) have recently emerged as a new breed of acquirers around the world. In comparison with acquirers from developed economies, two unique and...

2. How do we perceive middle-frequency sounds (100 to 4000 Hz)?

20. What is a feature detector?

22. When you wiggle your eyes back and forth, why dont you see a blur?