Question: PLEASE HELP ME WITH THIS WHOLE PYTHON PROGRAMMING PROJECT Activity 1: Create Dummy Dataset In this activity, you have to execute the code cell which

PLEASE HELP ME WITH THIS WHOLE PYTHON PROGRAMMING PROJECT

Activity 1: Create Dummy Dataset

In this activity, you have to execute the code cell which creates a dummy dataset for multiclass classification using make_blobs() function of the sklearn.datasets module.

Syntax: make_blobs(n_samples, centers, n_features, random_state, cluster_std)

[ ]

# Run this code cell to generate dummy data using 'make_blobs()' function from sklearn.datasets import make_blobs import pandas as pd features_array, target_array = make_blobs(n_samples = [200, 500, 700, 272, 333], n_features = 2, random_state = 42, centers=None,cluster_std=1) # Creating Pandas DataFrame containing the items from the 'features_array' and 'target_array' arrays. # A dummy dictionary dummy_dict = {'col 1': [features_array[i][0] for i in range(features_array.shape[0])], 'col 2': [features_array[i][1] for i in range(features_array.shape[0])], 'target': target_array} # Converting the dictionary into DataFrame dummy_df = pd.DataFrame.from_dict(dummy_dict) # Printing first five rows of the dummy DataFrame dummy_df.head()

In the above code cell,

A dummy dataset is created having two columns representing two independent variables and a third column representing the target.

The number of records are divided into 5 random groups like [200, 500, 700, 272, 333] such that the target columns has 5 different labels [0, 1, 2, 3, 4].

A dummy DataFrame is created from the two arrays using a Python dictionary. (Learnt in "Logistic Regression - Decision Boundary" lesson)

After this activity, the DataFrame should be created with two independent features columns and one dependent target column.

Activity 2: Dataset Inspection

In this activity, you have look into the distribution of the labels in the target column of the DataFrame.

1. Print the number of occurences of each label in target column.

[ ]

# Display the number of occurrences of each label in the 'target' column.

2. Print the percentage of the samples for each label in target column.

[ ]

# Get the percentage of count of each label samples in the dataset.

Q: How many unique labels are present in the DataFrame? What are they?

After this activity, the labels to be predicted i.e the target variables and their distribution should be known.

Activity 3: Train-Test Split

We need to predict the value of the target variable, using other variables. Thus, target is the dependent variable and other columns are the independent variables.

1. Split the dataset into the training set and test set such that the training set contains 70% of the instances and the remaining instances will become the test set.

2. Set random_state = 42.

[ ]

# Import 'train_test_split' module # Create the features data frame holding all the columns except the last column # and print first five rows of this dataframe # Create the target series that holds last column 'target' # and print first five rows of this series # Split the train and test sets using the 'train_test_split()' function.

3. Print the number of rows and columns in the training and testing set.

[ ]

# Print the shape of all the four variables i.e. 'X_train', 'X_test', 'y_train' and 'y_test'

After this activity, the features and target data should be splitted into training and testing data.

Activity 4: Model Training and Prediction

Implement SVM classification using sklearn module in the following way:

1. Deploy the model by importing the SVC class.

2. Create an object of the SVC class and pass kernel = "linear" as input to its constructor.

3. Call the fit() function of the SVC class on the object created and pass X_train and y_train as inputs to the function.

4. Call the score() function with X_train and y_train as inputs to check the accuracy score of the model.

[ ]

# Build a logistic regression model using the 'sklearn' module. # 1. Create the SVC model and pass 'kernel=linear' as input. # 2. Call the 'fit()' function with 'X_train' and 'y_train' as inputs. # 3. Call the 'score()' function with 'X_train' and 'y_train' as inputs to check the accuracy score of the model.

5. Make the predictions on the train set using predict() function.

[ ]

# Make predictions on the train dataset by using the 'predict()' function. # Print the occurrence of each label computed in the predictions.

Q: Does the model classify all the labels in the training set?

6. Make predictions on the test dataset by using the predict() function.

[ ]

# Make predictions on the test dataset. # Print the occurrence of each label computed in the predictions.

Q: Does the model classify all the labels in the test set?

After this activity, an SVM model should be trained and values of the labels should be predicted for the target columns for multiclass classification.

Activity 5: Model Evaluation

1. Create a confusion matrix to calculate True Positives, False Positives, True Negatives and False Negatives for the test set to evaluate the SVC linear model.

[ ]

# Create a confusion matrix for the test set. # Import the libraries # Print the confusion matrix

Q: Does the confusion matrix indicate any misclassification?

2. Print the classification report to observe the recall, precision and f1-scores for linear SVC model.

[ ]

# Print the classification report for the actual and predicted data of the testing set

Q: What are the f1-scores for all the labels?

After this activity, the model should be evaluated for the target columns using the test features set.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

PLEASE HELP ME COMPLETE THIS WHOLE PYTHON PROGRAMMING PROJECT Activity 1: Create Dummy Dataset In this activity, you have to create a dummy dataset for multiclass classification. The steps to be...

PLEASE HELP ME WITH THIS WHOLE PYTHON PROGRAMMING PROJECT Activity 1: Read the Image file In this activity, you need to execute a code cell which will read the image file from the link given using...

PLEASE HELP ME COMPLETE THIS ENTIRE PYTHON PROGRAMMING PROJECT Activity 1: Analysing the Dataset Create a Pandas DataFrame for Spambase dataset using the below link. This dataset consists of the...

PLEASE HELP ME COMPLETE THIS WHOLE PYTHON PROGRAMMING PROJECT List of Activities Activity 1: Loading and Analysing the Dataset Activity 2: Data Visualization Activity 3: Support Vector Classifier -...

PLEASE HELP ME COMPLETE THESE PYTHON PROGRAMMING ACTIVITIES Activity 1: Create the Dummy Dataset In this activity, you have to create a dummy dataset for multiclass classification. The steps to be...

CHA P TER 9 Understanding Software: A Primer for Managers 1. INTRODUCTION L E A R N I N G O B J E C T I V E S 1. Recognize the importance of software and its implications for the rm and strategic...

*PYTHON COURSE* Can someone please help me answer this question for my class. My professor mentioned it using pandas and I honestly dont know what that is. There are a lot of pictures that I believe...

Hi, Can you please help me with assignment, I am failing to create the train_nn function. Please advise how I can get data to you, my previous efforts have failed. Tensorflow_NeuralNetworkspdf May 1,...

pLEASE po pa help! Reflection po about dto sa mga Framework for Project Management 4. Framework for Project Management Many different professions contribute to the theory and practice of project...

Assume the following information for a company that produced 10,000 units and sold 9,000 units during its first year of operations: Per Unit Per Year Selling price $200 Direct materials $69 Direct...

Nikhil plans to sell toys at a trade fair.He buys thease toys at 55 paise each.The booth rent is $300.The toys will be sold at $ 95 paise each.Find out: (1).how many toys must be sold to break-even?...

Bonds that pay no coupons but are offered at a substantial discount below their par values to provide capital appreciation for investors.

Compared with half a century ago, adoption has become _ _ _ _ _ _ _ _ _ common, but it is more open and acceptabl e , so we probably discuss it _ _ _ _ _ _ _ . fill in the blanks more or much less or...

2. Who are the stakeholders, and how would they define the problem? What are the values of greatest importance to these stakeholders?

1. What is the values dilemma presented in this short case study? What values seem at issue as you define the problem?

2. Hire a part-time horse groomer to care for the horses. There currently were only three horses on the force. The horse groomer could be hired for 4 hours a day to come in and groom the three...