PLEASE DO T1
Task Description: This data task is about creating and evaluating predictive models for heart failure prediction. The dataset to be used for this task consists of 13 columns (i.e. features) related to cardiovascular diseases from 299 patients. The data file is in .csv format and more detailed information can be found here: https://www.kaggle.com/andrewmvd/heart-failure-clinical-data.
This assignment must be completed using WEKA. You are free to use additional scripts or programs to support some of the tasks, but the programming language is limited to Java. Note that technical support in this module is available for WEKA only.
Read the data description carefully, and perform the following 7 tasks (T1 to T7). Please note that T5 is an individual task that must be carried out by each member of the group. The rest of the tasks are group work.
Please also note that the WEKA filter NumericToBinary may not work on the output column in this dataset. If you encounter this issue, please use the filter NumericToNominal instead.
T1. Reformat the downloaded data file into an .arff file that can be read by WEKA. Perform an initial analysis of the dataset's size, attribute types and distributions. Explain your observations about the dataset. (group, 5%)
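As one way to approach the reformatting in T1, the sketch below converts simple CSV text into ARFF text in plain Java. It is a minimal illustration, not a required method: the class name `CsvToArff`, the sample rows, and the column names in `main` are assumptions made for the example, and it assumes a comma-separated file with a header row and no quoted fields. Declaring the output column as nominal here has the same effect on the file as applying a numeric-to-nominal filter afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class CsvToArff {

    /**
     * Converts CSV text (header row first, comma-separated, no quoted fields)
     * into ARFF text. Columns named in nominalColumns are declared nominal,
     * with their label set collected from the data; all others are numeric.
     */
    public static String convert(String relation, String csv, Set<String> nominalColumns) {
        String[] lines = csv.strip().split("\\R");
        String[] header = lines[0].split(",");

        // Collect the distinct values of each nominal column, in order of first appearance.
        List<List<String>> labels = new ArrayList<>();
        for (int c = 0; c < header.length; c++) {
            labels.add(new ArrayList<>());
        }
        for (int r = 1; r < lines.length; r++) {
            String[] cells = lines[r].split(",");
            for (int c = 0; c < header.length; c++) {
                String value = cells[c].trim();
                if (nominalColumns.contains(header[c].trim()) && !labels.get(c).contains(value)) {
                    labels.get(c).add(value);
                }
            }
        }

        // Header section: one @attribute line per CSV column.
        StringBuilder arff = new StringBuilder("@relation " + relation + "\n\n");
        for (int c = 0; c < header.length; c++) {
            String name = header[c].trim();
            String type = nominalColumns.contains(name)
                    ? "{" + String.join(",", labels.get(c)) + "}"
                    : "numeric";
            arff.append("@attribute ").append(name).append(' ').append(type).append('\n');
        }

        // Data section: the CSV rows are copied unchanged.
        arff.append("\n@data\n");
        for (int r = 1; r < lines.length; r++) {
            arff.append(lines[r].trim()).append('\n');
        }
        return arff.toString();
    }

    public static void main(String[] args) {
        // Three made-up rows for illustration; not taken from the real dataset.
        String csv = "age,serum_creatinine,DEATH_EVENT\n60,1.1,0\n75,1.9,1\n50,0.9,0\n";
        System.out.print(convert("heart_failure", csv, Set.of("DEATH_EVENT")));
    }
}
```

The resulting text can be saved with an .arff extension and opened in the WEKA Explorer to inspect attribute types and distributions.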
T2. Formulate one learning task that you would like to solve with this dataset by explaining what the input attributes and output to be learnt are. (group, 5%)
T3. After the learning task is clearly defined, check if the dataset is ready to be learnt. For example, do all of the input attributes contribute to the output that you are learning? Some simple preprocessing techniques may improve data quality and lead to better learning performance of the models later on. Explain any preprocessing that you perform on the dataset. (group, 10%)
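One rough way to check whether an input attribute relates to the output is to compute the Pearson correlation between that attribute's column and the output column. The sketch below does this in plain Java; the class name `AttributeCorrelation` is an assumption for illustration, and correlation is only one possible indicator (it captures linear relationships only), so it should be read alongside WEKA's own attribute inspection rather than as a definitive test.

```java
public class AttributeCorrelation {

    /**
     * Pearson correlation between two equally long columns.
     * Returns NaN if either column is constant (zero variance).
     */
    public static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("Columns must have the same length (at least 2)");
        }
        int n = x.length;

        // Column means.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of cross-products and squared deviations.
        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        double[] attribute = {1.1, 1.9, 0.9, 2.7, 1.0};
        double[] output = {0, 1, 0, 1, 0};
        System.out.println(pearson(attribute, output));
    }
}
```

A value near zero suggests little linear relationship with the output, which may be one reason to examine that attribute more closely before deciding whether to keep it.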
T4. Set up the experiment to solve your problem. This includes:
- Provide the list of supervised learning algorithms that your group chooses to solve the problem. Each student should be responsible for running the experiments with a different supervised learning algorithm. For instance, if your group has 6 students, you should list 6 different algorithms here. See Task 5 for details. Algorithm selection is not limited to what you have learnt in the lectures. If any of you decide to use an algorithm that is not covered in the lectures, please provide justification on why you believe it is suitable to the problem.
- Choose the model evaluation technique that is appropriate to evaluate the chosen learning algorithms for this problem, including the model evaluation strategy that will be used for tuning hyper-parameters. Justify your choice.
(group, 20%)
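Stratified k-fold cross-validation is one evaluation technique a group might consider for a small dataset with a binary output. The sketch below shows the fold-assignment idea in plain Java: each class is shuffled separately and its instances are dealt out to the folds in turn, so every fold keeps roughly the same class proportions. The class name `StratifiedFolds` and the 0/1 label encoding are assumptions for illustration; this is not a statement about how the brief expects the evaluation to be done.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class StratifiedFolds {

    /**
     * Assigns each instance to one of k folds so that every fold keeps roughly
     * the same class proportions as the full dataset.
     * labels[i] is the class of instance i and is assumed to be 0 or 1.
     * Returns an array where entry i is the fold (0..k-1) of instance i.
     */
    public static int[] assign(int[] labels, int k, long seed) {
        if (k < 2) {
            throw new IllegalArgumentException("k must be at least 2");
        }
        Random random = new Random(seed);
        int[] fold = new int[labels.length];
        int next = 0; // running counter, shared across classes to keep fold sizes even

        for (int cls = 0; cls <= 1; cls++) {
            // Gather and shuffle the indices of this class.
            List<Integer> indices = new ArrayList<>();
            for (int i = 0; i < labels.length; i++) {
                if (labels[i] == cls) {
                    indices.add(i);
                }
            }
            Collections.shuffle(indices, random);

            // Deal the shuffled instances out to the folds in round-robin order.
            for (int index : indices) {
                fold[index] = next % k;
                next++;
            }
        }
        return fold;
    }
}
```

In use, fold f serves as the test set while the remaining folds form the training set, and the k results are averaged. If hyper-parameters are tuned, the tuning should only see the training portion of each split so that the reported performance is not optimistic.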
T5. Each group member needs to build a predictive model in WEKA using one of the chosen algorithms listed in T4. Save the best performing model that you find after hyper-parameter tuning, and record the performance of the model. Explain what hyper-parameters were tuned, which hyper-parameter values were considered during the tuning process, and which values led to the best performing model. Explain why you believe you obtain such results, by building on your knowledge of how the algorithm works. (individual, 30%)
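The tuning process described in T5 amounts to a search over candidate hyper-parameter values in which every value tried, and its score, is recorded. The sketch below shows that bookkeeping in plain Java for a single integer hyper-parameter (for example, the number of neighbours in k-Nearest Neighbours). The class name `GridSearch` and the scoring function are hypothetical: in practice the score would come from a validation run of the chosen algorithm, which is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

public class GridSearch {

    /**
     * Scores every candidate value and returns the results in the order tried,
     * so the full list of values considered can be reported.
     */
    public static Map<Integer, Double> evaluate(int[] candidates, IntToDoubleFunction validationScore) {
        Map<Integer, Double> results = new LinkedHashMap<>();
        for (int candidate : candidates) {
            results.put(candidate, validationScore.applyAsDouble(candidate));
        }
        return results;
    }

    /** Returns the candidate with the highest score; ties keep the earliest candidate. */
    public static int best(Map<Integer, Double> results) {
        if (results.isEmpty()) {
            throw new IllegalArgumentException("No candidates were evaluated");
        }
        int bestCandidate = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<Integer, Double> entry : results.entrySet()) {
            if (entry.getValue() > bestScore) {
                bestScore = entry.getValue();
                bestCandidate = entry.getKey();
            }
        }
        return bestCandidate;
    }

    public static void main(String[] args) {
        // Placeholder scoring function standing in for a real validation run.
        IntToDoubleFunction placeholderScore = k -> -Math.pow(k - 5, 2);
        Map<Integer, Double> results = evaluate(new int[]{1, 3, 5, 7, 9}, placeholderScore);
        System.out.println(results);
        System.out.println("Best value: " + best(results));
    }
}
```

Keeping the whole result map, rather than only the winner, makes it straightforward to report which values were considered and how performance changed across them.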
T6. Compare and discuss the results obtained by the different models in terms of the models' performance. Explain potential reasons why some algorithms perform better than others. Present the comparison results in an easily understandable way. (group, 20%)
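When comparing models, accuracy alone can hide differences that matter for a medical prediction task, so it may help to derive several measures from each model's confusion matrix. The sketch below computes accuracy, precision, recall and F1 in plain Java from the four confusion-matrix counts. The class name `BinaryMetrics` and the counts in `main` are made up for illustration; which measures to report is a choice for the group to justify.

```java
public class BinaryMetrics {

    /** Fraction of all instances that were classified correctly. */
    public static double accuracy(int tp, int fp, int tn, int fn) {
        return (double) (tp + tn) / (tp + fp + tn + fn);
    }

    /** Of the instances predicted positive, the fraction that are truly positive. */
    public static double precision(int tp, int fp) {
        return (double) tp / (tp + fp);
    }

    /** Of the truly positive instances, the fraction that were predicted positive. */
    public static double recall(int tp, int fn) {
        return (double) tp / (tp + fn);
    }

    /** Harmonic mean of precision and recall, written directly in terms of counts. */
    public static double f1(int tp, int fp, int fn) {
        return 2.0 * tp / (2.0 * tp + fp + fn);
    }

    public static void main(String[] args) {
        // Made-up confusion-matrix counts for illustration only.
        int tp = 40, fp = 10, tn = 45, fn = 5;
        System.out.printf("accuracy=%.3f precision=%.3f recall=%.3f f1=%.3f%n",
                accuracy(tp, fp, tn, fn), precision(tp, fp), recall(tp, fn), f1(tp, fp, fn));
    }
}
```

Each method returns NaN when its denominator is zero (for example, precision when no instance is predicted positive), which is worth noting in a results table rather than reporting as 0.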
T7. Produce the group project report, explaining T1 to T6. For all tasks that involve WEKA, please add the WEKA screenshots of performing those tasks in the Appendix of the report. The report will be evaluated based on clarity and completeness. (group, 10%)
Mandatory Report Structure:
- Dataset description and reading in Weka (T1)
- The learning task (T2)
- Data preprocessing (T3)
- Experimental Design (T4)
- Machine learning model A (T5): model training, hyper-parameter tuning and testing
- Machine learning model B (T5): model training, hyper-parameter tuning and testing
- Machine learning model C (T5): model training, hyper-parameter tuning and testing
- ...
- Result comparison and discussion (T6)
- Conclusion and recommendations (T6)
- Appendix
Important Note:
- Your group for this assignment will remain the same as the group you formed for the Week 5 formative assignment.
- The final submission should include two items per group: one report in pdf format and one zip file. The zip file should include the "arff" data file, the individual models and programming scripts (if any). Only 1 group member needs to submit it on canvas (in the "Assignments" section of the module).
- The report is limited to two A4 pages (excluding the cover page and the appendix) using Arial font 11pt, single spaced, and with 2cm margins. The appendix should include the screenshots of WEKA. Please annotate the screenshots with the corresponding task number. Any tables, plots or other visual aids produced to support your discussions can also be placed in the appendix.
- The assignment will be marked based on how well each of the tasks above has been addressed and on the existence of supporting evidence provided in the form of individual models, programming scripts (if any) and appendix.
- Note that WEKA uses different names for some of the algorithms learned in this course. In particular, k-Nearest Neighbours is called IBk, decision trees are J48, and logistic regression is Logistic. At times, these algorithms will present additional hyperparameters to the ones learned in class, as some of these algorithms are extended versions of the algorithms learned in class. Please feel free to explore these algorithms using different values for any of the hyperparameters that are available in WEKA as part of this assignment.
- In the unlikely event that the majority of your group feels that the contribution of one or more of the other members does not reach the level of what you expect, you have the option to fill in the following form:
Firm Deadline: 9am (UK time), 15th of March (Monday), 2021. Extensions will not be allowed except for cases approved by welfare.
