

PLEASE DO T1

Task Description: This data task is about creating and evaluating predictive models for heart failure prediction. The dataset to be used for this task consists of 13 columns (i.e. features) related to cardiovascular diseases from 299 patients. The data file is in .csv format and more detailed information can be found here: https://www.kaggle.com/andrewmvd/heart-failure-clinical-data.

This assignment must be completed using WEKA. If you would like to use additional scripts or programming to support the completion of some of the tasks you are free to do so, but the programming language is limited to Java. Besides, we only have resources for technical support on WEKA in this module.

Read the data description carefully, and perform the following 7 tasks (T1 to T7). Please note that T5 is an individual task that needs to be conducted by each member of the group. The rest of the tasks are group work.

Please also note that the WEKA filter NumericToBinary may not work on the output column in this dataset. If you encounter this issue, please use the filter NumericToNominal instead.

T1. Reformat the downloaded data file into an .arff file that can be read by WEKA. Perform an initial analysis of the dataset's size, attribute types and distributions. Explain your observations about the dataset. (group, 5%)
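As one possible way to approach the reformatting step in T1 with a small Java script, the sketch below turns CSV lines into ARFF text. It assumes the first line is a header row and that every column can be declared numeric; the class name CsvToArff, the relation name and the sample rows are hypothetical and only for illustration. WEKA can also open a .csv file and save it as .arff, so this is an optional alternative rather than the required method.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvToArff {

    /**
     * Converts CSV lines (header row first) into ARFF text.
     * Assumption: every column is declared numeric; blank rows are skipped.
     */
    public static String toArff(String relation, List<String> csvLines) {
        String[] header = csvLines.get(0).split(",");
        StringBuilder out = new StringBuilder();
        out.append("@relation ").append(relation).append("\n\n");
        for (String name : header) {
            out.append("@attribute ").append(name.trim()).append(" numeric\n");
        }
        out.append("\n@data\n");
        for (int i = 1; i < csvLines.size(); i++) {
            String row = csvLines.get(i).trim();
            if (!row.isEmpty()) {
                out.append(row).append("\n");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up rows with illustrative column names, not real patient data.
        List<String> sample = new ArrayList<>();
        sample.add("age,sex,DEATH_EVENT");
        sample.add("75,1,1");
        sample.add("55,1,0");
        System.out.print(toArff("heart_failure", sample));
    }
}
```

To convert the real file, the lines could be read with java.nio.file.Files.readAllLines and the result written back out with Files.write. Because the output column is declared numeric here, it would still need to be converted to a nominal attribute in WEKA before classification.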

T2. Formulate one learning task that you would like to solve with this dataset by explaining what the input attributes and output to be learnt are. (group, 5%)

T3. After the learning task is clearly defined, check if the dataset is ready to be learnt. For example, do all of the input attributes contribute to the output that you are learning? Some simple preprocessing techniques may improve data quality and lead to better learning performance of the models later on. Explain any preprocessing that you perform on the dataset. (group, 10%)

T4. Set up the experiment to solve your problem. This includes:

Provide the list of supervised learning algorithms that your group chooses to solve the problem. Each student should be responsible for running the experiments with a different supervised learning algorithm. For instance, if your group has 6 students, you should list 6 different algorithms here. See Task 5 for details. Algorithm selection is not limited to what you have learnt in the lectures. If any of you decide to use an algorithm that is not covered in the lectures, please provide justification on why you believe it is suitable to the problem.
Choose the model evaluation technique that is appropriate to evaluate the chosen learning algorithms for this problem, including the model evaluation strategy that will be used for tuning hyper-parameters. Justify your choice.

(group, 20%)
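One evaluation technique a group might consider for a dataset of this size is stratified k-fold cross-validation. The sketch below only illustrates the idea of assigning instances to folds so that each fold keeps roughly the same class proportions. It is not WEKA's implementation, and the class name StratifiedFolds, the method name assign and the seed value are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

public class StratifiedFolds {

    /**
     * Returns a fold index (0..k-1) for every instance.
     * Instances of each class are shuffled and then dealt out round-robin,
     * so class counts per fold differ by at most one.
     */
    public static int[] assign(int[] labels, int k, long seed) {
        // Group instance indices by class label (TreeMap gives a fixed order).
        Map<Integer, List<Integer>> byClass = new TreeMap<>();
        for (int i = 0; i < labels.length; i++) {
            byClass.computeIfAbsent(labels[i], c -> new ArrayList<>()).add(i);
        }
        int[] fold = new int[labels.length];
        Random rng = new Random(seed);
        int next = 0; // running counter keeps overall fold sizes balanced
        for (List<Integer> indices : byClass.values()) {
            Collections.shuffle(indices, rng);
            for (int idx : indices) {
                fold[idx] = next % k;
                next++;
            }
        }
        return fold;
    }

    public static void main(String[] args) {
        int[] labels = {0, 0, 0, 0, 0, 0, 1, 1, 1, 1}; // made-up class labels
        int[] fold = assign(labels, 2, 42L);
        for (int i = 0; i < labels.length; i++) {
            System.out.println("instance " + i + " (class " + labels[i] + ") -> fold " + fold[i]);
        }
    }
}
```

If hyper-parameters are tuned with cross-validation as well, the same assignment could be reused inside each training portion, so that the data used for tuning stays separate from the data used for the final performance estimate.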

T5. Each group member needs to build a predictive model using WEKA by using one of the chosen algorithms listed in T4. Save the best performing model that you find after hyper-parameter tuning, and record the performance of the model. Explain what hyper-parameters were tuned, which hyper-parameter values were considered during the tuning process, and which values led to the best performing model. Explain why you believe you obtain such results, by building on your knowledge of how the algorithm works. (individual, 30%)

T6. Compare and discuss the results obtained by the different models in terms of the models' performance. Explain potential reasons why some algorithms perform better than others. Present the comparison results in an easily understandable way. (group, 20%)
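For the comparison in T6, it can help to report the same set of metrics for every model. The sketch below computes accuracy, precision, recall and F1 from the four cells of a binary confusion matrix, using the standard definitions. The class name BinaryMetrics and the counts in main are hypothetical and are not results from this dataset.

```java
public class BinaryMetrics {

    /** Fraction of all predictions that are correct. */
    public static double accuracy(int tp, int fp, int fn, int tn) {
        return (double) (tp + tn) / (tp + fp + fn + tn);
    }

    /** Fraction of predicted positives that are truly positive (0 if none predicted). */
    public static double precision(int tp, int fp) {
        return tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    /** Fraction of true positives that the model finds (0 if there are none). */
    public static double recall(int tp, int fn) {
        return tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    /** Harmonic mean of precision and recall. */
    public static double f1(int tp, int fp, int fn) {
        double p = precision(tp, fp);
        double r = recall(tp, fn);
        return p + r == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Made-up confusion-matrix counts for illustration only.
        int tp = 8, fp = 2, fn = 2, tn = 8;
        System.out.println("accuracy  = " + accuracy(tp, fp, fn, tn));
        System.out.println("precision = " + precision(tp, fp));
        System.out.println("recall    = " + recall(tp, fn));
        System.out.println("f1        = " + f1(tp, fp, fn));
    }
}
```

Placing these values for each model side by side in one table is one way to present the comparison in an easily understandable form.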

T7. Produce the group project report, explaining T1 to T6. For all tasks that involve WEKA, please add the WEKA screenshots of performing those tasks in the Appendix of the report. The report will be evaluated based on clarity and completeness. (group, 10%)

Mandatory Report Structure:

Dataset description and reading in Weka (T1)
The learning task (T2)
Data preprocessing (T3)
Experimental Design (T4)
Machine learning model A (T5): model training, hyper-parameter tuning and testing
Machine learning model B (T5): model training, hyper-parameter tuning and testing
Machine learning model C (T5): model training, hyper-parameter tuning and testing
...
Result comparison and discussion (T6)
Conclusion and recommendations (T6)
Appendix

Important Note:

Your group for this assignment will remain the same as the group you formed for the Week 5 formative assignment.
The final submission should include two items per group: one report in pdf format and one zip file. The zip file should include the "arff" data file, the individual models and programming scripts (if there are any). Only 1 group member needs to submit it on Canvas (in the "Assignments" section of the module).
The report is limited to two A4 pages (excluding the cover page and the appendix) using Arial font 11pt, single spaced, and with 2cm margins. The appendix should include the screenshots of WEKA. Please annotate the screenshots with the corresponding task number. Any tables, plots or other visual aids produced to support your discussions can also be placed in the appendix.
The assignment will be marked based on how well each of the tasks above has been addressed and on the existence of supporting evidence provided in the form of individual models, programming scripts (if any) and appendix.
Note that WEKA uses different names for some of the algorithms learned in this course. In particular, k-Nearest Neighbours is called IBk, decision trees are J48, and logistic regression is Logistic. At times, these algorithms will present additional hyperparameters to the ones learned in class, as some of these algorithms are extended versions of the algorithms learned in class. Please feel free to explore these algorithms using different values for any of the hyperparameters that are available in WEKA as part of this assignment.
In the unlikely event that the majority of your group feels that the contribution of one or more of the other members does not reach the level of what you expect, you have the option to fill in the following form:

Firm Deadline: 9am (UK time), 15th of March (Monday), 2021. Extensions will not be allowed except for cases approved by welfare.
