Question: I need a step-by-step answer in WEKA. Part I: Algorithm Comparison

Part I: Algorithm Comparison

1. In this lab you will investigate the difference in model performance using statistical significance testing. We will compare three models (decision tree J48, 3-Nearest-Neighbor and SVM) on two different data sets (diabetes.arff and breast-cancer.arff), and perform a pairwise comparison of the models on each data set. (You can run a total of six paired experiments separately or run everything at the same time.)
2. Choose 10-fold cross-validation as your experiment type and repeat it 5 times on each pair.
3. For both data sets, compare the performance of all three algorithms using a paired t-test. For each model, describe the parameter settings/design decisions you make in acquiring your data (so that your experiments are replicable).
4. You can collect accuracy estimates using the Experimenter in WEKA, dumping the results to a CSV file and using the appropriate column of the file. You will need to implement the paired permutation test yourself.
5. Does any one of the algorithms perform significantly differently from another algorithm on either of the two data sets? Report your findings. You should use screenshots, calculations and analysis to support your conclusions.
6. For each pair of algorithms that you find to perform significantly differently, calculate the p-value of the paired t-test to support your finding.
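The two tests named in the task can be sketched in plain Python once the per-fold accuracies for a pair of algorithms have been read from the Experimenter's CSV output. This is a minimal sketch, not the lab's reference solution: the function names are hypothetical, the inputs are assumed to be two equal-length lists of accuracies matched by run and fold, and the permutation test is implemented as the common sign-flip variant on the paired differences.

```python
import math
import random
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for the standard paired t-test.

    a and b are assumed to be accuracies of two algorithms on the
    same run/fold pairs, in the same order.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


def paired_permutation_p_value(a, b, n_permutations=10000, seed=0):
    """Two-sided p-value from a paired (sign-flip) permutation test.

    Under the null hypothesis the sign of each paired difference is
    arbitrary, so signs are flipped at random and the resulting mean
    differences are compared with the observed one.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    rng = random.Random(seed)  # fixed seed so the experiment is replicable
    hits = 0
    for _ in range(n_permutations):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        # small tolerance guards against floating-point round-off
        if abs(mean(flipped)) >= observed - 1e-12:
            hits += 1
    # add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_permutations + 1)
```

With 10-fold cross-validation repeated 5 times, each list would hold 50 values. Turning the t statistic into a p-value needs the t distribution, which the standard library does not provide; if SciPy is available, `scipy.stats.ttest_rel(a, b)` returns both the statistic and the p-value. Note that fold-level results from repeated cross-validation are not independent, which is why the WEKA Experimenter defaults to a corrected paired t-test; the uncorrected statistic above may therefore differ from what WEKA reports.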
