Consider the (relative distance) K-means scheme for outlier detection described in Section 10.5 and the accompanying figure,

Question:

Consider the (relative distance) K-means scheme for outlier detection described in Section 10.5 and the accompanying figure, Figure 10.10.
(a) The points at the bottom of the compact cluster shown in Figure 10.10 have a somewhat higher outlier score than those points at the top of the compact cluster. Why?
(b) Suppose that we choose the number of clusters to be much larger, e.g., 10. Would the proposed technique still be effective in finding the most extreme outlier at the top of the figure? Why or why not?
(c) The use of relative distance adjusts for differences in density. Give an example of where such an approach might lead to the wrong conclusion.
Fantastic news! We've Found the answer you've been seeking!

Step by Step Answer:

Related Book For  book-img-for-question

Introduction to Data Mining

ISBN: 978-0321321367

1st edition

Authors: Pang Ning Tan, Michael Steinbach, Vipin Kumar

Question Posted: