Question:

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 275 95 Good (0.3454545 0.6545455)
   2) CheckingAccountStatus.none< 0.5 161 77 Good (0.4782609 0.5217391)
     4) Amount>=8760.5 17 0 Bad (1.0000000 0.0000000) *
     5) Amount< 8760.5 144 60 Good (0.4166667 0.5833333)
      10) CheckingAccountStatus.gt.200< 0.5 120 57 Good (0.4750000 0.5250000)
        20) Duration>=22.5 48 18 Bad (0.6250000 0.3750000)
          40) NumberPeopleMaintenance< 1.5 37 10 Bad (0.7297297 0.2702703) *
          41) NumberPeopleMaintenance>=1.5 11 3 Good (0.2727273 0.7272727) *
        21) Duration< 22.5 72 27 Good (0.3750000 0.6250000)
          42) Amount< 1282 25 11 Bad (0.5600000 0.4400000)
            84) Property.RealEstate< 0.5 17 5 Bad (0.7058824 0.2941176) *
            85) Property.RealEstate>=0.5 8 2 Good (0.2500000 0.7500000) *
          43) Amount>=1282 47 13 Good (0.2765957 0.7234043) *
      11) CheckingAccountStatus.gt.200>=0.5 24 3 Good (0.1250000 0.8750000) *
   3) CheckingAccountStatus.none>=0.5 114 18 Good (0.1578947 0.8421053) *
Is the sensitivity measure of the classification tree on these data equal to
0.8088? If not, what is the sensitivity measure? Justify your answer.
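
One way to check the sensitivity figure is to tally the terminal nodes (those marked with *) of the rpart output. The sketch below is only an illustration under stated assumptions: the (n, loss) pairs are read from the garbled printout, "loss" is taken to be the number of observations in a node that do not belong to the predicted class, and "Good" is treated as the positive class because the question says the main intent is to detect "Good" applicants. The counts describe the sample used to fit the tree, not new data.

```python
# Terminal nodes as (predicted class, n, loss), read from the rpart excerpt.
# Assumption: loss = observations in the node NOT in the predicted class.
leaves = [
    ("Bad", 17, 0),     # Amount >= 8760.5
    ("Bad", 37, 10),    # NumberPeopleMaintenance < 1.5
    ("Good", 11, 3),    # NumberPeopleMaintenance >= 1.5
    ("Bad", 17, 5),     # Property.RealEstate < 0.5
    ("Good", 8, 2),     # Property.RealEstate >= 0.5
    ("Good", 47, 13),   # Amount >= 1282
    ("Good", 24, 3),    # CheckingAccountStatus.gt.200 >= 0.5
    ("Good", 114, 18),  # CheckingAccountStatus.none >= 0.5
]

# Assumption: "Good" is the positive class.
tp = sum(n - loss for pred, n, loss in leaves if pred == "Good")  # Good predicted Good
fp = sum(loss for pred, n, loss in leaves if pred == "Good")      # Bad predicted Good
tn = sum(n - loss for pred, n, loss in leaves if pred == "Bad")   # Bad predicted Bad
fn = sum(loss for pred, n, loss in leaves if pred == "Bad")       # Good predicted Bad

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fp + tn + fn)

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"sensitivity={sensitivity:.4f} specificity={specificity:.4f} accuracy={accuracy:.4f}")
```

Under these assumptions the tree labels 165 of the 180 "Good" applicants correctly, so the sensitivity is 165/180, about 0.9167, rather than 0.8088. If "Bad" were taken as the positive class instead, the same tally would give 56/95, about 0.5895.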
(d) Given a collection of competing classifiers and some data, does the use of
a validation set to select the best one in the collection guarantee that the
selected one will always provide the best predictive performance on future
unseen observations? Remember to justify your answer.
(e) Two hundred labeled samples are used to train two binary classifiers M1 and
M2. For classifier M1, the dataset is divided into training and validation
sets of 100 samples each and the classifier is trained on the training set. The
performance of M1 on this validation set provides an 80% accuracy. For classifier
M2, the dataset is divided into a training set of 150 samples and a validation set
of 50 samples, and the classifier is trained on the training set. The performance
of M2 on the corresponding validation set provides an accuracy of 90%. Is
classifier M2 to be preferred to classifier M1? Justify your answer.

Provide your answer and a concise explanation for each of the following questions.
(a) A k-means algorithm with K=2 is employed to cluster the following 5×2
data matrix:
X = [ 1   3
      2   4
      2  -1
      2   3
      1  -2 ]
The algorithm is initialized from the set of centroids μ1 = (1, 1) and μ2 =
(3, 2). After one iteration of the algorithm, is the k-means objective function
value equal to 25? Remember to justify your answer.
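
The single iteration asked about in (a) can be traced numerically. The sketch below is an illustration, not a model answer. It assumes the ten entries of the data matrix are read row by row as five two-dimensional points, and that the objective is the sum of squared Euclidean distances from each point to its assigned centroid. The function names are hypothetical helpers introduced here.

```python
# Assumption: the matrix entries are read row by row as five 2-D points.
points = [(1, 3), (2, 4), (2, -1), (2, 3), (1, -2)]
centroids = [(1.0, 1.0), (3.0, 2.0)]  # initial centroids given in the question


def sq_dist(p, c):
    """Squared Euclidean distance between point p and centroid c."""
    return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2


def objective(pts, labels, cents):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, cents[k]) for p, k in zip(pts, labels))


# Assignment step: each point goes to its nearest centroid.
labels = [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
          for p in points]
after_assignment = objective(points, labels, centroids)

# Update step: each centroid becomes the mean of its assigned points.
new_centroids = []
for k in range(len(centroids)):
    members = [p for p, lab in zip(points, labels) if lab == k]
    new_centroids.append((sum(p[0] for p in members) / len(members),
                          sum(p[1] for p in members) / len(members)))
after_update = objective(points, labels, new_centroids)

print("labels:", labels)
print("objective after assignment step:", after_assignment)
print("new centroids:", new_centroids)
print("objective after update step:", round(after_update, 4))
```

Under this reading, the objective equals 25 once the points are assigned to the initial centroids, and drops to 91/6, about 15.17, once the centroids are recomputed as (4/3, 0) and (2, 3.5). Whether 25 is the right answer therefore depends on whether "one iteration" is taken to include the centroid update.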
(b) A marketing researcher uses the k-means algorithm to cluster data concerning
a sample of customers of a large apparel retailer. The researcher does not
have any prior information about the number of clusters K. The clustering
allocation for different values of K is compared to an external classification
of the subjects into different types of purchasing behaviors. The table below
reports the adjusted Rand index (ARI) values for each value of K considered:
Do you think that K=4 corresponds to the optimal number of clusters for
these data? Justify your answer.
(c) A financial institution implements a classification tree to classify loan
applicants according to creditworthiness, "Good" or "Bad", with the main intent
of detecting applicants with a "Good" credit rating outlook. An excerpt of the
output of a classification tree implemented using the rpart function on a
sample of data is displayed below.
