PYTHON LANGUAGE ONLY. In this assignment, you will continue working on the Microsoft Malware Prediction problem. Here is the link to download data from Kaggle. Please do the following:
1) Load the data set into a pandas DataFrame and see how many variables are in the data set and what their data types are. Since the dataset is too big to fit in memory, you can read a small sample, such as 1000 records, using the following code:
import pandas as pd
Panda_dataframe = pd.read_csv("train.csv", nrows=1000)  # read only the first 1000 rows
2) Examine data types of the variables
3) Show the top 5 rows of the data frame
4) Encode string values (if any) to integers
5) Once again, examine data types of the variables
6) Produce some histograms of the variables
7) Provide an analysis of the missing value percentage in each variable. You can use the following code:
Panda_dataframe.isnull().sum()
8) Show the total number of missing values in all variables using the following code:
# The sum of the missing values in each variable
dataset.isnull().sum()
Panda_dataframe.isnull().sum().sum()
9) Perform missing value imputation as explained in class, and verify again that you don't have any missing values in the dataset using this code:
# The sum of the missing values in each variable
dataset.isnull().sum()
Panda_dataframe.isnull().sum().sum()
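Steps 1 to 9 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the expected solution: the tiny `sample` frame stands in for the result of `pd.read_csv("train.csv", nrows=1000)`, its values are invented, and the helper names (`encode_strings`, `missing_percentage`, `impute`, `plot_histograms`) are hypothetical. The imputation strategy taught in class is not stated in the question, so mode for encoded string columns and median for numeric columns is assumed here.

```python
import numpy as np
import pandas as pd


def encode_strings(df):
    """Step 4: replace every non-numeric column with integer category codes."""
    out = df.copy()
    encoded = []
    for col in out.columns:
        if not pd.api.types.is_numeric_dtype(out[col]):
            codes, _ = pd.factorize(out[col])  # missing values get code -1
            # Keep missing values missing so they can be imputed later.
            out[col] = np.where(codes == -1, np.nan, codes)
            encoded.append(col)
    return out, encoded


def missing_percentage(df):
    """Step 7: percentage of missing values per column, largest first."""
    return (df.isnull().mean() * 100).sort_values(ascending=False)


def impute(df, categorical_cols):
    """Step 9 (assumed strategy): mode for encoded columns, median otherwise."""
    out = df.dropna(axis=1, how="all").copy()  # nothing to learn from empty columns
    for col in out.columns:
        if out[col].isnull().any():
            if col in categorical_cols:
                fill = out[col].mode().iloc[0]
            else:
                fill = out[col].median()
            out[col] = out[col].fillna(fill)
    return out


def plot_histograms(df):
    """Step 6: one histogram per numeric column (requires matplotlib)."""
    import matplotlib.pyplot as plt
    df.hist(figsize=(12, 8), bins=20)
    plt.tight_layout()
    plt.show()


# Invented stand-in for pd.read_csv("train.csv", nrows=1000).
sample = pd.DataFrame({
    "ProductName": ["win8defender", "mse", None, "win8defender"],
    "AVProductsInstalled": [1.0, 2.0, np.nan, 1.0],
    "HasDetections": [0, 1, 1, 0],
})

print(sample.shape)    # steps 1-2: number of rows and variables
print(sample.dtypes)   # data type of each variable
print(sample.head())   # step 3: top 5 rows

encoded, cat_cols = encode_strings(sample)        # step 4
print(encoded.dtypes)                             # step 5

missing_pct = missing_percentage(encoded)         # step 7
total_before = int(encoded.isnull().sum().sum())  # step 8

clean = impute(encoded, cat_cols)                 # step 9
total_after = int(clean.isnull().sum().sum())
print(missing_pct, total_before, total_after)
```

Calling `plot_histograms(clean)` would cover step 6; it is left out of the demo run so the sketch does not depend on a display being available.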
Second, modeling:
10) Split the data into training and testing sets (80-20)
11) Build different machine learning models such as Decision Tree, Support Vector Machine, and Naive Bayes. Show the performance of each model (F1, accuracy, precision, and recall). Which model is the best, and why?
12) Experiment with different train-test split ratios and observe how they affect the model's performance.
Make sure to download the train.csv file.
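For steps 10 to 12, one possible approach with scikit-learn is sketched below. It is a hedged illustration, not the graded answer: synthetic data from `make_classification` stands in for the cleaned malware sample so the snippet runs without `train.csv`, the helper name `evaluate_models` is hypothetical, and the scaling step in front of the SVM is a design choice assumed here rather than something the assignment asks for.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def evaluate_models(X, y, test_size=0.2, seed=42):
    """Fit three classifiers on one split and collect their test metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    models = {
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        # SVMs are sensitive to feature scale, so scale first (assumed choice).
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "Naive Bayes": GaussianNB(),
    }
    rows = []
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        rows.append({
            "model": name,
            "test_size": test_size,
            "accuracy": accuracy_score(y_test, pred),
            "precision": precision_score(y_test, pred, zero_division=0),
            "recall": recall_score(y_test, pred, zero_division=0),
            "f1": f1_score(y_test, pred, zero_division=0),
        })
    return pd.DataFrame(rows)


# Synthetic stand-in for the imputed features and the binary target.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Step 10 uses the 80-20 split; step 12 repeats the run with other ratios.
results = pd.concat(
    [evaluate_models(X, y, test_size=ts) for ts in (0.2, 0.3, 0.4)],
    ignore_index=True,
)
print(results.round(3))
```

With the real data, `X` would be the imputed frame without the target column and `y` the target column itself. Comparing the rows of `results` per model and per `test_size` gives the material for answering which model performs best and how the split ratio affects it.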
