Question: Given the following FIT and TEST data sets:
FIT data set:
=============
Program    Actual no. of    Fault-Prone (FP) /        LOC
Module     Faults (Y)       Not Fault-Prone (NFP)     Indep. Variable (X)
---------------------------------------------------------------
A          0                NFP                       19
B          1                NFP                       25
C          0                NFP                       20
D          2                NFP                       28
E          1                NFP                       21
F          3                NFP                       30
G          2                NFP                       36
H          5                NFP                       38
I          7                FP                        41
J          11               FP                        45
K          16               FP                        50
L          21               FP                        60
TEST data set:
=============
M          ?                ?                         42
N          ?                ?                         37
Where (?) means that we do NOT know its value and have to predict/estimate.
(a) Use CBR with Euclidean Distance as similarity function and un-weighted
average of THREE most similar cases to predict the number of faults in modules M
and N. SHOW ALL YOUR WORK!
(b) Use CBR with Euclidean Distance as similarity function and Majority Voting
method with THREE most similar cases to classify modules M and N as FP or NFP
when C = 2.25. SHOW ALL YOUR WORK!
(c) Repeat part (b) with Data Clustering method. SHOW ALL YOUR WORK!
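
As a starting point for parts (a) and (b), the nearest-case lookup can be sketched in Python. This is a minimal sketch under stated assumptions, not a verified solution: the names `FIT`, `nearest`, `predict_faults`, and `classify_majority` are hypothetical, LOC is the only independent variable (so Euclidean distance reduces to the absolute difference), and the vote is a plain majority. The question does not define how C = 2.25 enters the classification rule, so C is not used here, and part (c) is not covered.

```python
import math
from collections import Counter

# FIT data copied from the question: module -> (faults Y, class, LOC X)
FIT = {
    "A": (0, "NFP", 19), "B": (1, "NFP", 25), "C": (0, "NFP", 20),
    "D": (2, "NFP", 28), "E": (1, "NFP", 21), "F": (3, "NFP", 30),
    "G": (2, "NFP", 36), "H": (5, "NFP", 38), "I": (7, "FP", 41),
    "J": (11, "FP", 45), "K": (16, "FP", 50), "L": (21, "FP", 60),
}

# TEST data: module -> LOC X (faults and class are unknown)
TEST = {"M": 42, "N": 37}


def nearest(x, k=3):
    """Return the k FIT cases closest to x by Euclidean distance on LOC."""
    def distance(item):
        loc = item[1][2]
        return math.sqrt((x - loc) ** 2)  # one variable, so this equals |x - loc|
    return sorted(FIT.items(), key=distance)[:k]


def predict_faults(x, k=3):
    """Part (a): unweighted average of Y over the k most similar cases."""
    return sum(row[0] for _, row in nearest(x, k)) / k


def classify_majority(x, k=3):
    """Part (b), simplified: plain majority vote over the k most similar cases.

    Assumption: the cost parameter C from the question is NOT applied here.
    """
    votes = Counter(row[1] for _, row in nearest(x, k))
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    for name, loc in TEST.items():
        cases = [module for module, _ in nearest(loc)]
        print(name, cases, round(predict_faults(loc), 2), classify_majority(loc))
```

Under these assumptions, the three nearest cases for M (X = 42) are I, J, and H at distances 1, 3, and 4, giving a predicted (7 + 11 + 5) / 3 ≈ 7.67 faults and a majority vote of FP. For N (X = 37) they are G, H, and I at distances 1, 1, and 4, giving (2 + 5 + 7) / 3 ≈ 4.67 faults and a majority vote of NFP. If the course's majority-voting rule weights the votes by C, the classification step would need to be adjusted accordingly.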
