Question: Q4. Support Vector Machines (SVM) is a supervised learning algorithm that can be applied to both classification and regression. The data set provided contains normal and fraudulent transactions in the Excel file Week6-Fraud_data, in the sheets FraudTrain and FraudTest. Using a Support Vector Machine (SVM) model, predict whether the transactions are Normal or Fraudulent based on the features of the transactions in the dataset.

Use is_fraud as the target variable and the features as independent variables. Convert all textual categorical variables to numeric, and clean the data if necessary. For the SVM model, add the Vaimal Machine Learning Add-in. (Note: Install the Vaimal Add-in attached with the assignment, using the installation instructions below and the detailed instructions in the Manual.)
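Converting text categories to numbers can be sketched outside Excel as a simple label encoding. This is only an illustration in plain Python, not the Vaimal Data Manager; the helper names `fit_label_encoder` and `encode` are hypothetical, and the sample values other than `shopping_pos`, `M`, and `F` are made up for the example.

```python
def fit_label_encoder(values):
    """Map each distinct text value to an integer code, in sorted order."""
    return {value: code for code, value in enumerate(sorted(set(values)))}


def encode(values, mapping, unknown=-1):
    """Replace each text value by its code; unseen values get `unknown`."""
    return [mapping.get(value, unknown) for value in values]


# Illustrative sample columns (assumed values, not taken from the workbook).
gender = ["M", "F", "F", "M"]
category = ["shopping_pos", "grocery_pos", "shopping_pos", "misc_net"]

gender_map = fit_label_encoder(gender)
category_map = fit_label_encoder(category)

gender_codes = encode(gender, gender_map)
category_codes = encode(category, category_map)
print(gender_map, gender_codes)
print(category_map, category_codes)
```

Integer codes imply an ordering between categories that does not really exist, so for a column with many levels a one-hot (dummy) encoding may suit an SVM better. Fit the mapping on the training sheet only and reuse it for the test sheet so both share the same codes.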

Please follow the process below for model development:

1. Import or load data: Place the data in an Excel worksheet.

2. Perform data preprocessing: Deal with missing data, data normalization, and encoding categorical inputs. (Vaimal has several utilities for preprocessing data, such as Data Manager.)

3. Select a model: Select which model to use and its design parameters.

4. Train the model: Using training data with known outputs, train the model.

5. Test the model: Using different data than the training data, test the model's ability to predict versus known outputs.

6. Prediction: Use the model to make predictions on data with unknown output.
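The six steps above do not depend on a particular tool. As a rough illustration outside Excel, the sketch below trains a linear SVM by full-batch sub-gradient descent on the L2-regularized hinge loss, using only the Python standard library. It is not the Vaimal add-in's method; the function names, the hyperparameters (`lam`, `lr`, `epochs`), and the tiny standardized toy data set are all assumptions made for the example.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch sub-gradient descent on the L2-regularized hinge loss.

    X: list of feature rows (already numeric and standardized).
    y: list of labels, 1 = fraud, 0 = normal.
    """
    signs = [1 if label == 1 else -1 for label in y]  # SVMs use +/-1 labels
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of (lam / 2) * ||w||^2
        grad_b = 0.0
        for xi, si in zip(X, signs):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if si * score < 1:  # inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= si * xi[j] / n
                grad_b -= si / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return 1 (fraud) on the positive side of the hyperplane, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


# Steps 1-2: toy data, assumed already cleaned, encoded, and standardized.
X_train = [[1, 1], [2, 1], [1, 2], [2, 2],
           [-1, -1], [-2, -1], [-1, -2], [-2, -2]]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]

# Steps 3-4: choose the model and its parameters, then train.
w, b = train_linear_svm(X_train, y_train)

# Step 5: test on held-out rows with known labels.
X_test, y_test = [[1.5, 0.5], [-0.5, -1.0]], [1, 0]
hits = sum(predict(w, b, x) == t for x, t in zip(X_test, y_test))
test_accuracy = hits / len(y_test)

# Step 6: predict a row whose label is unknown.
new_prediction = predict(w, b, [1.8, 1.2])
print(test_accuracy, new_prediction)
```

Real fraud data is heavily imbalanced and rarely linearly separable, so a kernel SVM and class weighting would likely be needed in practice; the toy data here is chosen only to keep the example short.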

Input Variables:

trans_date_trans_time, cc_num, first, last, merchant, category, amt, gender, street, city, state, zip, lat, long, city_pop, job, unix_time, merch_lat, and merch_long. Please convert categorical variables, such as gender and category, to numbers. You can drop unwanted columns and use the columns below as INPUT columns:

Output Variables:

Please use the column is_fraud as the target variable.

Data Flag for Training or Testing:

The column Data_FLAG marks each row as Training or Testing. Please supply inputs to the model according to this flag.
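A minimal sketch of how the flag column could drive the split, assuming each row is held as a dictionary and that the flag values are the literal strings "Training" and "Testing" (the assignment text does not state the exact values, so check the workbook). The helper name `split_by_flag` and the sample rows are hypothetical.

```python
def split_by_flag(rows, flag_column="Data_FLAG"):
    """Separate rows into training and testing lists using the flag column."""
    train = [row for row in rows if row[flag_column] == "Training"]
    test = [row for row in rows if row[flag_column] == "Testing"]
    return train, test


# Made-up rows for illustration only.
rows = [
    {"amt": 12.50, "is_fraud": 0, "Data_FLAG": "Training"},
    {"amt": 980.10, "is_fraud": 1, "Data_FLAG": "Training"},
    {"amt": 45.00, "is_fraud": 0, "Data_FLAG": "Testing"},
]

train_rows, test_rows = split_by_flag(rows)
print(len(train_rows), len(test_rows))
```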

Provide descriptive measures of the column amt: count, min, max, mean, and standard deviation. Write the count of fraudulent and normal transactions.
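The descriptive measures can be cross-checked outside Excel with the Python standard library. The sketch below uses a few made-up `amt` values (only 1318.89 appears in the assignment) and assumes the sample standard deviation is wanted, which corresponds to Excel's STDEV.S.

```python
import statistics
from collections import Counter

# Made-up sample of transaction amounts and labels, for illustration only.
amt = [4.97, 107.23, 220.11, 45.00, 1318.89]
is_fraud = [0, 0, 0, 0, 1]

summary = {
    "count": len(amt),
    "min": min(amt),
    "max": max(amt),
    "mean": statistics.mean(amt),
    "std": statistics.stdev(amt),  # sample standard deviation (n - 1)
}

# Number of normal (0) and fraudulent (1) transactions.
class_counts = Counter(is_fraud)

print(summary)
print("normal:", class_counts[0], "fraudulent:", class_counts[1])
```

Use `statistics.pstdev` instead if the population standard deviation (Excel's STDEV.P) is required.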

Show a graph of normalized frequency for the columns category and is_fraud to show the frequency of normal and fraudulent transactions.
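One way to read "normalized frequency" is the share of each category within each class, so that the shares for normal transactions sum to 1 and likewise for fraudulent ones. The sketch below computes that table and prints a crude text bar chart; the normalization choice, the helper name `normalized_frequency`, and the sample rows are assumptions.

```python
from collections import Counter


def normalized_frequency(categories, labels):
    """Share of each category within each class (each class sums to 1)."""
    pair_counts = Counter(zip(labels, categories))
    class_totals = Counter(labels)
    return {
        (label, category): count / class_totals[label]
        for (label, category), count in pair_counts.items()
    }


# Made-up rows for illustration only.
categories = ["shopping_pos", "grocery_pos", "shopping_pos",
              "grocery_pos", "shopping_pos", "shopping_pos"]
labels = [0, 0, 0, 0, 1, 1]

freq = normalized_frequency(categories, labels)

# Text stand-in for the bar chart: one bar per (class, category) pair.
for (label, category), share in sorted(freq.items()):
    bar = "#" * round(share * 20)
    print(f"is_fraud={label} {category:<14} {bar} {share:.2f}")
```

Normalizing within each class matters here because fraudulent rows are rare; raw counts would make the fraud bars nearly invisible next to the normal ones.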

Perform error analysis using a confusion matrix, which is created with four categories: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).

Precision: $P = \frac{TP}{TP + FP}$

Recall: $R = \frac{TP}{TP + FN}$

Accuracy: $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$

F1 score: $F_1 = \frac{2 \cdot P \cdot R}{P + R}$
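As a worked example of these measures, the sketch below counts the four confusion-matrix cells from a list of true and predicted labels and then applies the standard definitions of precision, recall, accuracy, and F1. The function names and the ten sample labels are made up for illustration; fraud (1) is treated as the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), treating `positive` as the fraud class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, accuracy, and F1; zero denominators give 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Made-up labels: 3 fraudulent and 7 normal transactions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
metrics = classification_metrics(tp, fp, tn, fn)
print((tp, fp, tn, fn), metrics)
```

With imbalanced fraud data, accuracy alone can look high even when most fraud is missed, which is why precision, recall, and F1 are reported alongside it.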

Based on the features below, predict whether the transaction is fraudulent:

trans_date_trans_time: 12/2/2020 22:27

cc_num: 3.588E+15

merchant: fraud_Torphy-Goyette

category: shopping_pos

amt: 1318.89

first: Jason, last: Johnson, gender: M

street: 5942 Thomas Park, city: Craig

state: AK, zip: 99921

lat: 55.4732, long: -133.1171, city_pop: 1920

job: Commissioning editor, dob: 6/17/1997

trans_num: 2682 81 3 9 070 7 abc 721 4 5862

unix_time: 1386023256

merch_lat: 54.801713, merch_long: -133.669108

is_fraud: 1
