Question:
Here are the instructions:
- Read in data from anomaly_detection.txt and assign the data to an array (x).
- Create a function anomaly_detection() that takes an array as input and prints the result in the format of the "sample_output" file.
Anomaly: Assume D is a dataset and X is a member of D. Mu is the mean of D without X, and STD is the standard deviation of D without X. If the absolute difference between X and Mu is larger than 3 x STD, we say X is an anomaly and remove X from D. We search D iteratively for anomalies until no more are found.
I was able to print the data set and sort it, but I am stuck on what to do next. Attached are the Jupyter notebook cells I have been able to run so far, the sample output, and the dataset. I need help with the rest.



Jupyter notebook (Untitled):

In [25]:
    #Python program to find anomaly numbers in an array
    import statistics as s
    #Initialize list to store values
    x = []
    #Open file to read data into array/list x
    fin = open("/home/osboxes/Downloads/anomaly_detection.txt","r")
    #Read the file into list
    for eachNum in fin:
        x.append(float(eachNum))
    print(x)

    [99.5697438, 94.47019021, 55.0, 106.86672855, 102.78730151, 131.85777845, 88.25376895, 96.94439838, 83.67782174, 115.57993209, 118.97651966, 94.40479467, 79.63342207, 77.88602065, 96.59145004, 99.50145353, 97.25980235, 87.72010069, 101.30597215, 87.3110369, 110.0687946, 104.71504012, 89.34719772, 160.0, 110.61519268, 112.94716398, 104.41867586]

In [38]:
    x.sort()
    print(x)

    [77.88602065, 79.63342207, 83.67782174, 87.3110369, 87.72010069, 88.25376895, 89.34719772, 94.40479467, 94.47019021, 96.59145004, 96.94439838, 97.25980235, 99.50145353, 99.5697438, 101.30597215, 102.78730151, 104.41867586, 104.71504012, 106.86672855, 110.0687946, 110.61519268, 112.94716398, 115.57993209, 118.97651966]

In [ ]:
    anomaly_detection(x)

    Remove 160.00 from the list because it's 4.19 times of standard deviation of the list without it.
    160.00 is removed from the list!
    Remove 55.00 from the list because it's 3.61 times of standard deviation of the list without it.
    55.00 is removed from the list!
    Remove 131.86 from the list because it's 3.10 times of standard deviation of the list without it.
    131.86 is removed from the list!
    No more anomaly is detected!
anomaly_detection.txt:

99.5697438
94.47019021
55.0
106.86672855
102.78730151
131.85777845
88.25376895
96.94439838
83.67782174
115.57993209
118.97651966
94.40479467
79.63342207
77.88602065
96.59145004
99.50145353
97.25980235
87.72010069
101.30597215
87.3110369
110.0687946
104.71504012
89.34719772
160.0
110.61519268
112.94716398
104.41867586
Step by Step Solution
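One possible way to finish the notebook, offered as a minimal sketch rather than the expected reference solution. It follows the anomaly definition in the question: for each X, compute the mean and standard deviation of the list without X and flag X when its distance from that mean exceeds 3 times that standard deviation. Several details are assumptions not stated in the assignment: the helper read_data() and the threshold parameter are hypothetical names, the standard deviation is the population form (statistics.pstdev), each pass removes only the single value with the largest multiple, and the function returns the cleaned list and the removed values in addition to printing.

```python
import statistics as s


def read_data(path):
    """Read one float per line from a text file, skipping blank lines.

    Hypothetical helper; the assignment only says to read the file into x.
    """
    with open(path, "r") as fin:
        return [float(line) for line in fin if line.strip()]


def anomaly_detection(x, threshold=3.0):
    """Iteratively remove leave-one-out outliers and print each removal.

    Assumptions: population standard deviation, and one removal per pass
    (the value with the largest multiple above the threshold).
    """
    data = list(x)  # work on a copy so the caller's list is untouched
    removed = []
    while len(data) > 2:
        worst_index, worst_ratio = None, threshold
        for i, value in enumerate(data):
            rest = data[:i] + data[i + 1:]  # D without X
            std = s.pstdev(rest)
            if std == 0:
                continue  # all other values identical; ratio is undefined
            ratio = abs(value - s.mean(rest)) / std
            if ratio > worst_ratio:
                worst_index, worst_ratio = i, ratio
        if worst_index is None:
            break  # nothing exceeds the threshold in this pass
        value = data.pop(worst_index)
        removed.append(value)
        print(f"Remove {value:.2f} from the list because it's "
              f"{worst_ratio:.2f} times of standard deviation of the list "
              f"without it.")
        print(f"{value:.2f} is removed from the list!")
    print("No more anomaly is detected!")
    return data, removed


# Data copied from the question so the sketch runs without the file;
# x = read_data("anomaly_detection.txt") would be the file-based equivalent.
x = [99.5697438, 94.47019021, 55.0, 106.86672855, 102.78730151,
     131.85777845, 88.25376895, 96.94439838, 83.67782174, 115.57993209,
     118.97651966, 94.40479467, 79.63342207, 77.88602065, 96.59145004,
     99.50145353, 97.25980235, 87.72010069, 101.30597215, 87.3110369,
     110.0687946, 104.71504012, 89.34719772, 160.0, 110.61519268,
     112.94716398, 104.41867586]

remaining, removed = anomaly_detection(x)
```

On the posted data this should remove 160.00, 55.00, and 131.86 in that order and leave 24 values. The printed multiples depend on which standard deviation convention is used (population here, versus statistics.stdev for the sample form), so they may differ slightly from the sample output.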
