Question:
Supervised Machine Learning using ScikitLearn Classifiers
Data:
There are 3 CSV files: "hw3.train.csv", "hw3.test.csv", and "hw3.new.csv".
The CSV file "hw3.train.csv" contains 50,000 rows and 51 columns. The first column 'y' is the output variable with 3 classes: 0, 1, 2. The remaining 50 columns contain input features: x1, x2, ... , x50.
The CSV file "hw3.test.csv" contains 10,000 rows and 51 columns. The first column 'y' is the output variable with 3 classes: 0, 1, 2. The remaining 50 columns contain input features: x1, x2, ... , x50.
The CSV file "hw3.new.csv" contains 100 rows and 51 columns. The first column 'ID' is an identifier for the 100 unlabeled samples. The remaining 50 columns contain input features: x1, x2, ... , x50.
Links to the CSV files:
Dropbox does not show a preview of these files. To open one, select Download and then Direct download.
https://www.dropbox.com/s/wsw0xbhezth00qr/hw3.new.csv?dl=0 (New CSV)
https://www.dropbox.com/s/bf1z3y34bozitly/hw3.train.csv?dl=0 (Train CSV)
https://www.dropbox.com/s/xz2vlxwjeog3hjq/hw3.test.csv?dl=0 (Test CSV)
Task 0.
Read the data from the CSV files "hw3.train.csv", "hw3.test.csv", and "hw3.new.csv" into pandas DataFrames train, test, and new, respectively. Confirm that each DataFrame contains the expected number of rows and columns.
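One way to approach the loading step is sketched below. It assumes the three files have been downloaded into the working directory under their original names. The helper load_and_check and the EXPECTED_SHAPES table are hypothetical names introduced here for illustration; the expected shapes come from the data description above.

```python
import pandas as pd

# Expected (rows, columns) for each file, taken from the data description.
EXPECTED_SHAPES = {
    "hw3.train.csv": (50_000, 51),
    "hw3.test.csv": (10_000, 51),
    "hw3.new.csv": (100, 51),
}


def load_and_check(path, expected_shape):
    """Read one CSV into a DataFrame and verify its (rows, columns) shape.

    Hypothetical helper: raises ValueError if the shape does not match.
    """
    df = pd.read_csv(path)
    if df.shape != expected_shape:
        raise ValueError(
            f"{path}: expected shape {expected_shape}, got {df.shape}"
        )
    return df
```

With the files in place, train = load_and_check("hw3.train.csv", EXPECTED_SHAPES["hw3.train.csv"]) would load the training set, and the same call with the other two file names would produce test and new. Raising an error on a mismatch, rather than only printing the shape, makes a truncated or wrong download hard to miss.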
Document the class distribution of 'y' in train and test by specifying the proportion of examples in each class. Round the proportions to 4 decimal places.
Class distribution of y:
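The class proportions could be computed along the following lines. This is a minimal sketch: class_distribution is a hypothetical helper name, and the small frame at the bottom is made-up data used only to show the call, not the homework data.

```python
import pandas as pd


def class_distribution(df, target="y", decimals=4):
    """Return the proportion of rows in each class of `target`.

    Proportions are sorted by class label and rounded to `decimals` places.
    """
    return (
        df[target]
        .value_counts(normalize=True)  # fraction of rows per class
        .sort_index()                  # order by class label: 0, 1, 2
        .round(decimals)
    )


# Illustrative toy data only (NOT the homework data).
toy = pd.DataFrame({"y": [0, 0, 0, 1, 1, 2]})
toy_dist = class_distribution(toy)
print(toy_dist)
```

Calling class_distribution(train) and class_distribution(test) on the loaded DataFrames would give the two distributions the task asks for.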




