Question: For the purposes of this assignment you will develop two different classification predictive models to classify different types of dry beans. You have to write

For the purposes of this assignment you will develop two different classification predictive models to classify

different types of dry beans. You have to write a report wherein you provide responses in clear narrative on the

aspects enumerated below, under appropriate section headings. Note that code will not be evaluated. Tables

and figures will also not be considered if these tables and figures are not accompanied by your own explanation

of what these tables and figures portray.

Complete the assignment in the following steps:

1 .

Download the DryBeanDataSet

774 .

xlsx dataset. The dataset contains

13611

instances,

20

descriptive

features, and the class feature Class in column U

.

2 .

Without changing anything in the provided dataset, provide an analytics base table wherein you characterize all of the features of the dataset.

(10)

3 .

You now have to very carefully explore the dataset to identify data quality issues. For this part of your

report, only identify the data quality issues and provide justifications for these issues.

(30)

4 .

Based on your analysis above, decide on two different machine learning approaches that you will employ

to construct a predictive model for this problem. Give justifications for why you have selected these two

approaches for this problem.

(20)

5 .

For each of the machine learning approaches, discuss the data

-

preprocessing steps that you have implemented to optimally transform the dataset for that specific machine learning approach and to correct

data quality issues. Note: do not do unnecessary data transformations. Carefully think about the data

transformations needed for your selected machine learning algorithms. Provide justifications for each of

these pre

-

processing steps. Should you decide not to address a data quality issue, justify this decision.

When you pre

-

process the dataset, make sure that you do not change the order of the instances in the

dataset.

(80)

6 .

Develop the two predictive models and evaluate the performance of the two models. Make sure to construct

optimal configurations of your chosen models both with respect to architecture and values for control

parameters. Describe the process that you have followed to produce an optimal configuration for each

model. For this purpose, carefully decide on the performance metrics that you will use. Conclude on which

one of the two approaches is best for this problem, and support your conclusion with justifications. For

the purposes of this assignment, make sure to report the performance based on a k

-

fold cross

-

validation.

Decide on the number of folds with a justification

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Classification of Dry Beans For the purposes of this assignment you will develop two different classification predictive models to classify different types of dry beans. You have to write a report...

For the purposes of this assignment you will develop two different classification predictive models to classify different types of dry beans. You have to write a report wherein you provide responses...

For the purposes of this assignment you will develop both a bagging and boosting ensemble learning model of your choice to produce two dry beans classication models. You will then compare the...

For the purposes of this assignment you will develop both a bagging and boosting ensemble learning model of your choice to produce two dry beans classication models using python. You will then...

Can you please answer this questions below QUESTION 1 The key to any successful report is clarity, ____, completeness, and correctness. a. actuality b. longevity c. brevity 2 points QUESTION 2 A is a...

Suppose that you are vice-president of branch services at the Credit Union of Kelowna. You notice that several branches have consistently low customer service ratings even though there are no...

The figure shows a fixed circle C1 with equation (x 1)2 + y2 = 1 and a shrinking circle C2 with radius and center the origin. P is the point (0, r), Q is the upper point of intersection of the two...

~ \ anaconda 3 \ lib \ site - packages \ tensorflow \ python \ training \ py _ checkpoint _ reader.py in 1 7 from tensorflow.python.framework import errors _ impl 1 8 from tensorflow.python.util...

CT Corp Comprehensive Question Canadian Tire Corporation, Limited ( Canadian Tire ) is a family of companies that includes a retail segment and a financial services division, among others. The retail...

3-11 In this exercise, you will use software at car Web sites to find product information about a car of your choice and use that information to make an important purchase decision. You will also...

3-12 In MyMISLab, you will find a Collaboration and Teamwork Project dealing with the concepts in this chapter. You will be able to use Google Drive, Google Docs, Google Sites, Google+, or other...

3-8 Marks & Spencer Group is a leading department store chain in the United Kingdom. Its retail stores sell a range of merchandise. Senior management has decided that Marks & Spencer should tailor...