Question:
Part 1: Generate a set S of 500 points (vectors) in 3-dimensional Euclidean space. Use the Euclidean distance to measure the distance between any two points. Write a program to find all the outliers in S and print them. If there are no outliers, the program should say so. Next, remove the outliers from S and call the resulting set S’.
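The question does not say how an outlier is defined, so the following is only a minimal sketch under one possible assumption: a point is flagged when its Euclidean distance to its k-th nearest neighbor is more than three standard deviations above the mean of that score. The function names, the choice k = 5, the 3-sigma cutoff, and the five deliberately planted far-away points are all illustrative assumptions, not part of the assignment.

```python
import math
import random
import statistics

def generate_points(n=500, seed=0, n_far=5):
    """Sample n points in 3-D: mostly standard Gaussian, plus n_far planted far points.
    The planted points are an assumption so that the demo has something to detect."""
    rng = random.Random(seed)
    pts = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(n - n_far)]
    pts += [tuple(rng.choice((-1, 1)) * rng.uniform(8.0, 12.0) for _ in range(3))
            for _ in range(n_far)]
    return pts

def knn_scores(points, k):
    """For each point, the Euclidean distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def find_outliers(points, k=5, n_sigma=3.0):
    """Indices of points whose k-NN distance exceeds mean + n_sigma * std (assumed rule)."""
    scores = knn_scores(points, k)
    cutoff = statistics.mean(scores) + n_sigma * statistics.pstdev(scores)
    return [i for i, s in enumerate(scores) if s > cutoff]

S = generate_points()
outlier_idx = find_outliers(S)

if outlier_idx:
    print(f"Found {len(outlier_idx)} outlier(s):")
    for i in outlier_idx:
        print(i, S[i])
else:
    print("No outliers found in S.")

# S' is S with the flagged points removed.
flagged = set(outlier_idx)
S_prime = [p for i, p in enumerate(S) if i not in flagged]
```

Other definitions (for example, a fixed distance threshold or a density-based score) would slot into `find_outliers` without changing the rest of the flow.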
Part 2: (1) Write a program that implements the K-means clustering algorithm and a program that implements the hierarchical agglomerative clustering algorithm taught in class, and use each to cluster the points in S’ into k clusters, where k is a user-specified parameter. Compare the clustering results of the two algorithms; indicate and explain which result/algorithm is better in terms of the Silhouette coefficient.
(2) Repeat Part 1 and Part 2 (1) on two additional, different datasets.
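For Part 2 (1), one way to structure the comparison is sketched below. It assumes Lloyd's algorithm for K-means and average-link merging for the agglomerative step; the linkage actually taught in the class is not stated in the question, so that choice is a guess. The small three-blob dataset stands in for S’ and, like all function names here, is hypothetical. The agglomerative routine is a naive implementation meant for small inputs, and `silhouette` assumes k is at least 2.

```python
import math
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def agglomerative(points, k):
    """Naive average-link agglomerative clustering; merges until k clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def link(a, b):
        return sum(math.dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        a, b = min(pairs, key=lambda ij: link(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters.pop(b)  # b > a, so index a is unaffected
        clusters[a].extend(merged)
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (requires at least 2 clusters)."""
    groups = {}
    for i, l in enumerate(labels):
        groups.setdefault(l, []).append(i)
    total = 0.0
    for i, l in enumerate(labels):
        own = groups[l]
        if len(own) == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(sum(math.dist(points[i], points[j]) for j in g) / len(g)
                for m, g in groups.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical stand-in for S': three Gaussian blobs of 30 points each.
rng = random.Random(1)
centers = [(0, 0, 0), (6, 6, 0), (0, 6, 6)]
S_prime = [tuple(rng.gauss(c, 0.5) for c in ctr) for ctr in centers for _ in range(30)]
k = 3  # would be read from the user in the actual assignment

km_labels = kmeans(S_prime, k)
hac_labels = agglomerative(S_prime, k)
s_km = silhouette(S_prime, km_labels)
s_hac = silhouette(S_prime, hac_labels)
print(f"K-means silhouette:       {s_km:.3f}")
print(f"Agglomerative silhouette: {s_hac:.3f}")
print("Higher silhouette:", "K-means" if s_km > s_hac else "agglomerative (or tie)")
```

The silhouette coefficient lies in [-1, 1], and a higher mean value indicates tighter, better-separated clusters, so the algorithm with the larger value is the one to prefer on that dataset. Because K-means depends on its random initialization, running it with several seeds before comparing is a reasonable precaution.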
Introduction to Data Mining
ISBN: 978-0321321367
1st edition
Authors: Pang-Ning Tan, Michael Steinbach, Vipin Kumar