Question: Please help to solve this problem. For the coding part, please use Java or Python.


In this problem, the Iris data set will be used to begin understanding how to apply the algorithms in the first four modules to a well-known data set. The Iris Plants Database contains 3 classes of 50 instances each, where each class refers to a type of Iris plant. Four attributes/features (in centimeters) were collected for each plant instance. A fifth attribute provides the class label (the plant type). The data can be downloaded as iris.arff from the Sample Weka Data Sets webpage (https://storm.cis.fordham.edu/~gweiss/data-mining/datasets.html).


4. Outlier Removal (25 points)

(a) Develop an algorithm (pseudocode) to remove, in sequential order, the observations that are furthest from the data class mean.
(b) Provide the running time and total running time of your algorithm in O-notation and T(n).
(c) Implement your algorithm in your language of choice.
(d) Determine whether the data contains an outlier by plotting each class individually. The key is to plot two features at a time in different combinations, e.g., feature 1 vs. feature 2, etc.
(e) Provide an explanation of the results:
    i. Was there any class that had obvious outliers? If so, how did you determine the outlier? If not, why not?
    ii. What was the metric used to determine separation? Explain why the metric was chosen.
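As a starting point for parts (a) and (c), here is a minimal Python sketch of one possible approach, not a definitive solution. It assumes Euclidean distance to the class mean as the separation metric (the problem leaves the metric open; Mahalanobis distance would be another reasonable choice) and recomputes the mean after every removal. The function name `remove_farthest`, the parameter `k` (how many observations to remove), and the small `demo` array are made up for illustration. To use it on the real data, the four feature columns of one Iris class would be passed in instead.

```python
import numpy as np

def remove_farthest(X, k):
    """Remove, one at a time, the k rows of X farthest from the current class mean.

    Assumption: Euclidean distance is the separation metric.
    Returns the remaining rows and the original indices removed, in order.
    """
    X = np.asarray(X, dtype=float)
    keep = np.arange(len(X))   # original row indices still in the class
    removed = []               # original row indices, in removal order
    for _ in range(min(k, len(X) - 1)):   # always leave at least one row
        mean = X[keep].mean(axis=0)                     # O(n*d)
        dist = np.linalg.norm(X[keep] - mean, axis=1)   # O(n*d)
        worst = int(np.argmax(dist))                    # O(n)
        removed.append(int(keep[worst]))
        keep = np.delete(keep, worst)
    return X[keep], removed

# Made-up two-feature class: four points near (1, 1) and one far away.
demo = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [5.0, 5.0]])
cleaned, removed = remove_farthest(demo, 1)
print(removed)        # original index of the removed row
print(cleaned.shape)  # shape of the remaining data
```

For part (b), each pass over a class of n observations with d features costs O(n·d), so removing k observations costs O(k·n·d). With d fixed at 4 and a constant k this is O(n) per class; removing nearly all observations one by one would be O(n^2).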
