Question: Missing feature values need to be addressed prior to the model development phase of the CRISP-DM methodology to avoid training on incomplete data.
Missing feature values need to be addressed prior to the model development phase of the
CRISP-DM methodology to avoid training on incomplete data. This task assesses your ability to
navigate the complexities of constructing a data pipeline that transforms partially erroneous raw
data into knowledge, and to evaluate the effectiveness of the proposed solution. The task requires
you to identify a suitable publicly available dataset for which there is previous research that
addresses the missing feature value problem. Propose an approach that uses the Naive Bayes
classifier to address the missing feature value problem in the context of categorical feature
values. Implement this approach on a publicly available dataset and report its performance in
the context of a classification problem. Compare the effectiveness of this approach against a
baseline imputation approach that uses the mode value, on two different machine learning
models. Justify and discuss your findings using appropriate metrics and relate your findings to
previous research on data imputation using the same dataset. Present your findings in a
written report and a video presentation. State and motivate any assumptions or scope adopted
during the task.
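
The two imputation strategies named in the question can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed solution: it assumes pandas and scikit-learn are available, uses a small synthetic table in place of a real public dataset, and the helper names `mode_impute` and `nb_impute` are hypothetical. The Naive Bayes variant treats the column with missing values as the class label and the remaining columns as predictors; the downstream comparison on two machine learning models is not shown.

```python
import numpy as np
import pandas as pd
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder


def mode_impute(df):
    """Baseline: fill each column's missing entries with its most frequent value.
    Assumes every column has at least one observed value."""
    return df.apply(lambda col: col.fillna(col.mode().iloc[0]))


def nb_impute(df, target_col):
    """Fill missing entries of `target_col` with Naive Bayes predictions.

    Rows where `target_col` is observed train the classifier; the other
    columns (mode-filled first, an assumption of this sketch) are predictors.
    """
    out = df.copy()
    missing = out[target_col].isna().to_numpy()
    if not missing.any() or missing.all():
        return out  # nothing to impute, or nothing to learn from
    predictors = mode_impute(out.drop(columns=[target_col])).astype(str)
    encoder = OrdinalEncoder()
    X = encoder.fit_transform(predictors).astype(int)
    # min_categories guards against categories absent from the training rows.
    n_categories = [len(c) for c in encoder.categories_]
    model = CategoricalNB(min_categories=n_categories)
    model.fit(X[~missing], out.loc[~missing, target_col].astype(str))
    out.loc[missing, target_col] = model.predict(X[missing])
    return out


# Synthetic demo data (illustrative only): "b" depends on "a", "c" is noise.
rng = np.random.default_rng(0)
a = rng.choice(["x", "y", "z"], size=300)
c = rng.choice(["u", "v"], size=300)
b = pd.Series(a).map({"x": "p", "y": "q", "z": "r"})
df = pd.DataFrame({"a": a, "b": b, "c": c})

truth = df["b"].copy()
mask = rng.random(300) < 0.2          # hide roughly 20% of column "b"
df.loc[mask, "b"] = np.nan

nb_filled = nb_impute(df, "b")
mode_filled = mode_impute(df)

# Share of hidden values that each strategy recovers correctly.
nb_acc = (nb_filled.loc[mask, "b"] == truth[mask]).mean()
mode_acc = (mode_filled.loc[mask, "b"] == truth[mask]).mean()
print(f"Naive Bayes imputation accuracy: {nb_acc:.2f}")
print(f"Mode imputation accuracy:        {mode_acc:.2f}")
```

On a real dataset, the same comparison would typically be extended by training the two chosen classifiers on each imputed version and comparing metrics such as accuracy or F1 on a held-out test set whose labels were never used during imputation.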
