Question: Use JupyterLab. The provided code is below, and the questions are at the end.
import numpy as np
import pandas as pd
import seaborn as sns
import math
from sklearn import preprocessing
from sklearn import datasets
from sklearn.tree import plot_tree
from sklearn.tree import export_text
from sklearn.tree import DecisionTreeClassifier
from sklearn import metrics  # Import scikit-learn metrics module for accuracy calculation
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
import sklearn
from scipy import stats
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
matplotlib.style.use('ggplot')
np.random.seed(...)
Loading the digits dataset (classification)
The digits dataset has 1797 records.
Each record is an 8 x 8 image (64 dimensions), and there are 10 class labels for this dataset.
Each image record is labeled by the number it represents.
The intensities of the original pixels are binned to values ranging from 0 to 16.
from sklearn.datasets import load_digits
digits = load_digits()
X = digits.data
y = digits.target
# Learning about how the data is stored
print("X.shape", X.shape,
      "y.shape", y.shape)
print(X[0:2])  # Check the values for the first two images
print(y[0:2])  # Print the class labels for the first two images
# Show the first image
plt.gray()
plt.matshow(X[0].reshape(8, 8))  # show the first image; first reshape the values vector into an 8 x 8 matrix
plt.show()
# plotting the first images
fig, axes = plt.subplots(nrows=..., ncols=..., figsize=...)
for id, ax in enumerate(axes.flatten()):
    image = X[id].reshape(8, 8)
    ax.set_axis_off()
    #ax.imshow(image, cmap=plt.cm.gray_r)  # You can try this and comment the line below
    ax.imshow(image, cmap='gray')
    ax.set_title("Label: %i" % y[id], fontsize=...)
plt.tight_layout()
plt.show()
# Split data into train and test subsets
#X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=..., shuffle=True, random_state=...)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=..., shuffle=False)
print("Training Data", X_train.shape)
print("Testing Data", X_test.shape)
counts, bins = np.histogram(y_test)
print("Number of records in each class", counts)
plt.stairs(counts, bins)
QA Train a decision tree on the training data and report the training and testing accuracy of the decision tree.
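One way to approach QA, as a minimal sketch: fit a `DecisionTreeClassifier` on the training split and score both splits with `metrics.accuracy_score`. The `test_size=0.5` and `random_state=0` values are assumptions, because the split size in the provided code is not shown; `shuffle=False` follows the provided code.

```python
from sklearn import metrics
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

# Assumption: test_size=0.5 is a placeholder; shuffle=False follows the provided code.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, shuffle=False)

# Assumption: random_state=0 only makes this sketch repeatable.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Accuracy on the data the tree was fit on, and on the held-out data.
train_acc = metrics.accuracy_score(y_train, clf.predict(X_train))
test_acc = metrics.accuracy_score(y_test, clf.predict(X_test))
print("Training accuracy: %0.3f" % train_acc)
print("Testing accuracy: %0.3f" % test_acc)
```

An unpruned tree usually fits its training data almost perfectly, so the gap between the two numbers is the part worth reporting.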
QB Plot the first images in the testing dataset.
The title of each subfigure should be "True: label Predicted: label".
QC Plot the first images in the testing dataset that were misclassified.
The title of each subfigure should be "True: label Predicted: label".
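QB and QC can share one plotting helper, sketched below. The helper name `plot_digits`, the count of 10 images, the 5-column grid, and the `test_size=0.5` split are all assumptions, since the question does not show how many images to plot.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
# Assumption: test_size=0.5 is a placeholder for the value in the provided code.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, shuffle=False)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)


def plot_digits(images, true_labels, pred_labels, n_images=10, ncols=5):
    """Plot up to n_images 8 x 8 digits titled with true and predicted labels."""
    n_images = min(n_images, len(images))
    nrows = int(np.ceil(n_images / ncols))
    fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(2 * ncols, 2.2 * nrows))
    for i, ax in enumerate(np.atleast_1d(axes).flatten()):
        ax.set_axis_off()
        if i < n_images:
            ax.imshow(images[i].reshape(8, 8), cmap="gray")
            ax.set_title("True: %i Predicted: %i" % (true_labels[i], pred_labels[i]), fontsize=9)
    fig.tight_layout()
    return fig


# QB: the first test images, whether or not they were classified correctly.
fig_first = plot_digits(X_test, y_test, y_pred)

# QC: the first test images whose predicted label differs from the true label.
wrong = np.flatnonzero(y_pred != y_test)
fig_wrong = plot_digits(X_test[wrong], y_test[wrong], y_pred[wrong])
```

With `%matplotlib inline`, both figures render when the cell finishes. Indexing the test arrays with `wrong` keeps the images, true labels, and predictions aligned.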
QD Print the classification report using classification_report from metrics in sklearn
QE Plot the confusion matrix using ConfusionMatrixDisplay
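A possible sketch for QD and QE, assuming a decision tree fitted as in QA (the `test_size=0.5` split and `random_state=0` are placeholders, and the `cmap="Blues"` colour map is only a styling choice):

```python
import matplotlib.pyplot as plt
from sklearn import metrics
from sklearn.datasets import load_digits
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
# Assumption: test_size=0.5 is a placeholder for the value in the provided code.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, shuffle=False)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# QD: per-class precision, recall, and F1 score as a text table.
report = metrics.classification_report(y_test, y_pred)
print(report)

# QE: rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_test, y_pred, labels=clf.classes_)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=clf.classes_)
disp.plot(cmap="Blues")
```

The off-diagonal cells of the matrix show which digits the tree confuses with each other, which the per-class numbers in the report cannot show on their own.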
QF points Plot the decision tree using plot_tree
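A minimal sketch for QF follows. A tree fitted on all 64 pixel features is usually too deep to read when drawn in full, so this sketch draws only the top levels; `max_depth=2`, the figure size, and the font size are assumptions chosen for legibility.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text, plot_tree

X, y = load_digits(return_X_y=True)
# Assumption: test_size=0.5 is a placeholder for the value in the provided code.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, shuffle=False)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

fig, ax = plt.subplots(figsize=(16, 8))
# max_depth here only limits what is drawn; the fitted tree is unchanged.
annotations = plot_tree(clf, max_depth=2, filled=True, fontsize=8, ax=ax)

# A text rendering of the same top levels.
print(export_text(clf, max_depth=2))
```

Dropping the `max_depth` argument draws the whole tree, in which case a much larger `figsize` is likely needed.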
QG Cross Validation
Report the accuracies for the k-fold cross-validation (use cv=k).
The cross-validation method takes the decision tree model, the entire dataset, and the class labels.
For this line:
print("%0.2f accuracy with a standard deviation of %0.2f" % (scores.mean(), scores.std()))
this is a sample output
accuracy with a standard deviation of
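A sketch of QG using `cross_val_score` from `sklearn.model_selection`, which is not among the imports in the provided code. The fold count `cv=5` is an assumption, because the number of folds is not shown in the question.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

# The model is passed unfitted; cross_val_score fits a fresh copy on each fold.
clf = DecisionTreeClassifier(random_state=0)

# Assumption: cv=5 is a placeholder for the fold count asked for in the question.
scores = cross_val_score(clf, X, y, cv=5)
print(scores)
print("%0.2f accuracy with a standard deviation of %0.2f" % (scores.mean(), scores.std()))
```

`scores` holds one accuracy per fold, so its length equals the value passed as `cv`.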
QH Random Forest Classifier
Train a random forest on X_train and report the accuracy on X_test.
Use ... trees in the random forest classifier. Recall the number of records in X_train.
Fine-tune max_samples (try different numbers) for RandomForestClassifier
to achieve an accuracy higher than ... (a big improvement from the ...)
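One way to explore QH is a small loop over candidate `max_samples` values, sketched below. The tree count `n_estimators=100`, the candidate list, `random_state=0`, and the `test_size=0.5` split are all assumptions, since the numbers in the question are not shown.

```python
from sklearn import metrics
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
# Assumption: test_size=0.5 is a placeholder for the value in the provided code.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, shuffle=False)

results = {}
# A float is the fraction of X_train drawn (with replacement) for each tree;
# None draws as many records as X_train contains.
for max_samples in (0.25, 0.5, 0.75, None):
    forest = RandomForestClassifier(n_estimators=100, max_samples=max_samples, random_state=0)
    forest.fit(X_train, y_train)
    results[max_samples] = metrics.accuracy_score(y_test, forest.predict(X_test))
    print("max_samples=%s accuracy=%0.3f" % (max_samples, results[max_samples]))

best = max(results, key=results.get)
print("Best max_samples:", best)
```

`max_samples` also accepts an integer record count, which is why the question points back to the number of records in `X_train`.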
Q Finding the best split using gini index
data = np.array(...)
print("Values")
print(data[:, :-1])
print("Class Label", data[:, -1])
n = data.shape[0]
d = data.shape[1] - 1  # number of columns, ignore the last column (class label)
QA points Write a function that computes the gini_index of a dataset D
Use math.pow(P_positive, n) to calculate P_positive^n
If the data has zero records, the gini_index is zero. The last column of the dataset is the class label.
# Write a function that computes the gini_index for a dataset
# 1 - P_positive^2 - P_negative^2
# use math.pow(P_positive, n) to calculate P_positive^n
# If the data has zero records, the gini_index is zero
# The last column of the dataset is the class label
def get_gini_index(D):
    n = D.shape[0]
    gini_index = ...  # calculate it
    # Write your code here
    return gini_index
print(get_gini_index(data))  # You should get ...
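The skeleton above could be filled in along these lines. This is a sketch that assumes binary class labels, with 1 as the positive class and any other value as negative. The `toy` array is hypothetical and stands in for the original `data`, whose values are not shown.

```python
import math

import numpy as np


def get_gini_index(D):
    """Gini index 1 - P_positive^2 - P_negative^2 of D; the last column is the class label."""
    n = D.shape[0]
    if n == 0:
        # By the stated convention, an empty dataset has a Gini index of zero.
        return 0.0
    labels = D[:, -1]
    # Assumption: label 1 marks the positive class; everything else is negative.
    p_positive = np.count_nonzero(labels == 1) / n
    p_negative = 1.0 - p_positive
    return 1.0 - math.pow(p_positive, 2) - math.pow(p_negative, 2)


# Hypothetical toy data: two feature columns followed by a 0/1 class label.
toy = np.array([[1, 5, 1],
                [2, 6, 1],
                [3, 7, 0],
                [4, 8, 0]])
print(get_gini_index(toy))      # 0.5, an even split between the two classes
print(get_gini_index(toy[:2]))  # 0.0, a pure subset
print(get_gini_index(toy[:0]))  # 0.0, no records
```

For two classes the index peaks at 0.5 for an even split and falls to 0 for a pure subset, which is what makes it usable for comparing candidate splits.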
