Question:
For the purposes of this assignment you will develop two different classification predictive models to classify
different types of dry beans. You have to write a report wherein you provide responses in clear narrative on the
aspects enumerated below, under appropriate section headings. Note that code will not be evaluated. Tables
and figures will also not be considered if these tables and figures are not accompanied by your own explanation
of what these tables and figures portray.
Complete the assignment in the following steps:
Download the DryBeanDataSet.xlsx dataset. The dataset contains instances, descriptive
features, and the class feature Class in column U.
Without changing anything in the provided dataset, provide an analytics base table wherein you characterize all of the features of the dataset.
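An analytics base table of the kind asked for above can be sketched with pandas. This is only an illustrative sketch: the helper name analytics_base_table, the chosen summary columns, and the toy values are assumptions, not part of the assignment. With the real file, the frame would typically come from pd.read_excel on the downloaded workbook. The function only reads from the frame, so the provided data stays unchanged.

```python
import pandas as pd


def analytics_base_table(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise every column of df; the input frame is not modified."""
    rows = []
    for name in df.columns:
        col = df[name]
        row = {
            "feature": name,
            "dtype": str(col.dtype),
            "count": int(col.count()),               # non-missing values
            "missing_pct": float(col.isna().mean() * 100),
            "cardinality": int(col.nunique()),       # distinct non-missing values
        }
        if pd.api.types.is_numeric_dtype(col):
            # Continuous features: location and spread.
            row.update(min=col.min(), median=col.median(), max=col.max(),
                       mean=col.mean(), std=col.std())
        else:
            # Categorical features: most frequent level and its frequency.
            counts = col.value_counts()
            if not counts.empty:
                row.update(mode=counts.index[0], mode_freq=int(counts.iloc[0]))
        rows.append(row)
    return pd.DataFrame(rows).set_index("feature")


# Toy stand-in for the real dataset (values are made up for illustration).
toy = pd.DataFrame({
    "Area": [28395.0, 28734.0, None, 30008.0],
    "Class": ["SEKER", "SEKER", "BOMBAY", "DERMASON"],
})
abt = analytics_base_table(toy)
print(abt)
```

Each row of the resulting table describes one feature, which makes it straightforward to discuss missing values and cardinality feature by feature in the report narrative.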
You now have to very carefully explore the dataset to identify data quality issues. For this part of your
report, only identify the data quality issues and provide justifications for these issues.
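As a rough illustration of the exploration step above, the following sketch counts missing values, exact duplicate rows, values outside the 1.5 x IQR fences, and class frequencies. The function name quality_report, the 1.5 multiplier (a common convention, not something the assignment prescribes), and the toy data are all assumptions. Flagged values are only candidates for data quality issues; each one still needs a justification in the report.

```python
import pandas as pd


def quality_report(df: pd.DataFrame, target: str = "Class") -> dict:
    """Collect simple data quality indicators without modifying df."""
    numeric = df.drop(columns=[target]).select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # A value is flagged when it falls outside the 1.5 * IQR fences.
    outside = (numeric < q1 - 1.5 * iqr) | (numeric > q3 + 1.5 * iqr)
    return {
        "missing_per_column": df.isna().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
        "iqr_outliers_per_column": outside.sum().to_dict(),
        # Inconsistent labels (e.g. differing case) and imbalance show up here.
        "class_counts": df[target].value_counts().to_dict(),
    }


# Toy data with one duplicate row, one extreme value, and one odd label.
toy = pd.DataFrame({
    "Area": [100.0, 102.0, 101.0, 99.0, 100.0, 5000.0],
    "Class": ["A", "A", "B", "B", "A", "a"],
})
report = quality_report(toy)
print(report)
```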
Based on your analysis above, decide on two different machine learning approaches that you will employ
to construct a predictive model for this problem. Give justifications for why you have selected these two
approaches for this problem.
For each of the machine learning approaches, discuss the data preprocessing steps that you have implemented to optimally transform the dataset for that specific machine learning approach and to correct
data quality issues. Note: do not do unnecessary data transformations. Carefully think about the data
transformations needed for your selected machine learning algorithms. Provide justifications for each of
these preprocessing steps. Should you decide not to address a data quality issue, justify this decision.
When you preprocess the dataset, make sure that you do not change the order of the instances in the
dataset.
Develop the two predictive models and evaluate the performance of the two models. Make sure to construct
optimal configurations of your chosen models both with respect to architecture and values for control
parameters. Describe the process that you have followed to produce an optimal configuration for each
model. For this purpose, carefully decide on the performance metrics that you will use. Conclude on which
one of the two approaches is best for this problem, and support your conclusion with justifications. For
the purposes of this assignment, make sure to report the performance based on a k-fold cross-validation.
Decide on the number of folds with a justification.
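The tuning and k-fold evaluation described above could look roughly like the following scikit-learn sketch. Everything specific here is an assumption made for illustration: the two models (k-nearest neighbours and a decision tree), the small parameter grids, the macro-averaged F1 metric, the choice of 5 folds, and the synthetic data from make_classification standing in for the bean dataset. Scaling is placed inside a Pipeline so that it is fitted on the training folds only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic multi-class data as a stand-in for the real features and Class.
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)

# Stratified folds keep the class proportions similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Hypothetical candidate models with small illustrative grids.
candidates = {
    "knn": (
        Pipeline([("scale", StandardScaler()),
                  ("model", KNeighborsClassifier())]),
        {"model__n_neighbors": [3, 5, 7]},
    ),
    "tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [3, 5, None]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Each grid point is scored by its mean macro F1 over the k folds.
    search = GridSearchCV(estimator, grid, scoring="f1_macro", cv=cv)
    search.fit(X, y)
    results[name] = (search.best_params_, search.best_score_)
    print(name, search.best_params_, round(search.best_score_, 3))
```

Note that a score used to pick the best configuration tends to be slightly optimistic when it is also reported as the final performance; a nested cross-validation is one way to address this if a less biased estimate is needed.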