Question: Computer Science, Data Mining, Machine Learning


Given the following FIT and TEST data sets:

FIT DATA SET:

Program Module   Actual no. of Faults (Y)   LOC, Indep. Variable (X)   Fault-Prone (FP) / Not Fault-Prone (NFP)
A                0                          23                         NFP
B                1                          25                         NFP
C                0                          21                         NFP
D                2                          28                         NFP
E                1                          22                         NFP
F                3                          30                         NFP
G                2                          29                         NFP
H                5                          35                         FP
I                7                          40                         FP
J                10                         55                         FP

TEST DATA SET:

Program Module   Actual no. of Faults (Y)   LOC, Indep. Variable (X)   Fault-Prone (FP) / Not Fault-Prone (NFP)
K                ?                          20                         ?
L                ?                          50                         ?
M                ?                          38                         ?

Where (?) means that we do NOT know its value and have to predict/estimate it.

A) Use CBR with Euclidean Distance as the similarity function and the unweighted average of the TWO most similar cases to predict the number of faults in modules K, L, and M. SHOW ALL YOUR WORK!

B) Use CBR with Euclidean Distance as the similarity function and the Data Clustering method with the TWO most similar cases (for each cluster) to classify modules K, L, and M as FP or NFP when c = 0.50. SHOW ALL YOUR WORK!
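Both parts can be checked with a short script. What follows is a minimal sketch, not the expert's solution, and it rests on two assumptions. First, it reads the flattened table as modules A to J with faults 0, 1, 0, 2, 1, 3, 2, 5, 7, 10, LOC 23, 25, 21, 28, 22, 30, 29, 35, 40, 55, and classes NFP for A to G and FP for H to J, with test modules K, L, M at LOC 20, 50, 38. Second, the question does not spell out the Data Clustering decision rule, so the sketch assumes a commonly used form: take the average distance to the two nearest NFP cases (d_nfp) and to the two nearest FP cases (d_fp), and classify a module as NFP when d_fp / d_nfp > c, otherwise FP. The function names are illustrative only.

```python
import math

# Fit cases as (module, LOC, faults, class).
# ASSUMPTION: this pairing of values to modules is a reading of the flattened table.
FIT = [
    ("A", 23, 0, "NFP"), ("B", 25, 1, "NFP"), ("C", 21, 0, "NFP"),
    ("D", 28, 2, "NFP"), ("E", 22, 1, "NFP"), ("F", 30, 3, "NFP"),
    ("G", 29, 2, "NFP"), ("H", 35, 5, "FP"), ("I", 40, 7, "FP"),
    ("J", 55, 10, "FP"),
]
TEST = {"K": 20, "L": 50, "M": 38}


def distance(x1, x2):
    """Euclidean distance; with a single feature (LOC) this equals |x1 - x2|."""
    return math.sqrt((x1 - x2) ** 2)


def nearest(x, cases, k=2):
    """Return the k fit cases whose LOC is closest to x."""
    return sorted(cases, key=lambda case: distance(x, case[1]))[:k]


def predict_faults(x, k=2):
    """Part A: unweighted average of the faults of the k most similar cases."""
    neighbours = nearest(x, FIT, k)
    return sum(case[2] for case in neighbours) / k


def classify(x, c=0.5, k=2):
    """Part B: Data Clustering rule (ASSUMED form): NFP if d_fp / d_nfp > c, else FP."""
    nfp_cases = [case for case in FIT if case[3] == "NFP"]
    fp_cases = [case for case in FIT if case[3] == "FP"]
    d_nfp = sum(distance(x, case[1]) for case in nearest(x, nfp_cases, k)) / k
    d_fp = sum(distance(x, case[1]) for case in nearest(x, fp_cases, k)) / k
    # Note: d_nfp would be zero if x matched the k nearest NFP cases exactly;
    # that does not occur for this data, so no guard is added here.
    return "NFP" if d_fp / d_nfp > c else "FP"


for name, loc in TEST.items():
    print(f"{name}: predicted faults = {predict_faults(loc):.1f}, class = {classify(loc)}")
```

Under these assumptions, K's two nearest cases are C and E (distances 1 and 2), giving (0 + 1) / 2 = 0.5 faults; L's are J and I (distances 5 and 10), giving (10 + 7) / 2 = 8.5; M's are I and H (distances 2 and 3), giving (7 + 5) / 2 = 6.0. For part B the ratios d_fp / d_nfp are 17.5 / 1.5 ≈ 11.67 for K (NFP), 7.5 / 20.5 ≈ 0.37 for L (FP), and 2.5 / 8.5 ≈ 0.29 for M (FP). If your course defines the rule with the ratio inverted, adjust the comparison in `classify` accordingly.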

