Many statistical tests for outliers were developed in an environment in which a few hundred observations was

Question:

Many statistical tests for outliers were developed in an environment in which a few hundred observations was a large data set. We explore the limitations of such approaches.
(a) For a set of 1,000,000 values, how likely are we to have outliers according to the test that says a value is an outlier if it is more than three standard deviations from the average? (Assume a normal distribution.)
(b) Does the approach that states an outlier is an object of unusually low probability need to be adjusted when dealing with large data sets? If so, how?
Fantastic news! We've Found the answer you've been seeking!

Step by Step Answer:

Related Book For  book-img-for-question

Introduction to Data Mining

ISBN: 978-0321321367

1st edition

Authors: Pang Ning Tan, Michael Steinbach, Vipin Kumar

Question Posted: