Question:
import utils  # noqa: F401, do not remove if using a Mac
# add your imports BELOW this line


def ones_and_tens_digit_histogram(numbers):
    '''
    Input:
        a list of numbers.
    Returns:
        a list where the value at index i is the frequency in which digit i
        appeared in the ones place OR the tens place in the input list. This
        returned list will always have 10 numbers (representing the frequency
        of digits 0 - 9).

    For example, given the input list
        [127, 426, 28, 9, 90]
    This function will return
        [0.2, 0.0, 0.3, 0.0, 0.0, 0.0, 0.1, 0.1, 0.1, 0.2]

    That is, the digit 0 occurred in 20% of the one and tens places; 2 in 30%
    of them; 6, 7, and 8 each in 10% of the ones and tens, and 9 occurred in
    20% of the ones and tens.

    See fraud_detection_tests.py for additional cases.
    '''
    histogram = [0] * 10

    # first fill histogram with counts
    for i in numbers:
        # 1's place
        histogram[i % 10] += 1

        # 10's place
        histogram[i // 10 % 10] += 1

    # normalize over total counts
    for i in range(len(histogram)):
        histogram[i] /= len(numbers) * 2

    return histogram


# Your Set of Functions for this assignment goes in here

# The code in this function is executed when this
# file is run as a Python program
def main():
    # Code that calls functions you have written above
    # e.g. extract_election_vote_counts() etc.
    # This code should produce the output expected from your program.
    raise NotImplementedError("Delete this line and start writing code")


if __name__ == "__main__":
    main()

The file above is fraud_detection.py

from sys import platform as sys_pf
if sys_pf == 'darwin':
    import matplotlib
    matplotlib.use("TkAgg")

The file above is utils.py

import fraud_detection as fd
import math


def test_ones_and_tens_digit_histogram():
    # Easy to calculate case: 5 numbers, clean percentages.
    actual = fd.ones_and_tens_digit_histogram([127, 426, 28, 9, 90])
    expected = [0.2, 0.0, 0.3, 0.0, 0.0, 0.0, 0.1, 0.1, 0.1, 0.2]
    for i in range(len(actual)):
        assert math.isclose(actual[i], expected[i])

    # Obscure and hard (by hand) to calculate frequencies
    input = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610,
             987, 1597, 2584, 4181, 6765]
    actual = fd.ones_and_tens_digit_histogram(input)
    expected = [0.21428571428571427, 0.14285714285714285,
                0.047619047619047616, 0.11904761904761904,
                0.09523809523809523, 0.09523809523809523,
                0.023809523809523808, 0.09523809523809523,
                0.11904761904761904, 0.047619047619047616]
    for i in range(len(actual)):
        assert math.isclose(actual[i], expected[i])

# write other test functions here


def main():
    test_ones_and_tens_digit_histogram()
    # call other test functions here


if __name__ == "__main__":
    main()

The file above is fraud_detection_tests.py

Region,Ahmadinejad,% ,Rezai,%,Karrubi,%,Mousavi,%,Total votes,Invalid votes,Valid votes,Eligible voters,"Turnout, %"
East Azerbaijan,"1,131,111",56.75,"16,920",0.85,"7,246",0.36,"837,858",42.04,"2,010,340","17,205","1,993,135","2,461,553",80.97
West Azerbaijan,"623,946",47.48,"12,199",0.93,"21,609",1.64,"656,508",49.95,"1,334,356","20,094","1,314,262","1,883,144",69.79
Ardabil,"325,911",51.11,"6,578",1.03,"2,319",0.36,"302,825",47.49,"642,005","4,372","637,633","804,881",79.22
Isfahan,"1,799,255",68.88,"51,788",1.98,"14,579",0.56,"746,697",28.58,"2,637,482","25,163","2,612,319","2,987,946",87.43
Ilam,"199,654",64.58,"5,221",1.69,"7,471",2.42,"96,826",31.32,"312,667","3,495","309,172","357,687",86.44
Bushehr,"299,357",61.37,"7,608",1.56,"3,563",0.73,"177,268",36.34,"493,989","6,193","487,796","580,822",83.98
Tehran,"3,819,495",51.57,"147,487",1.99,"67,334",0.91,"3,371,523",45.53,"7,521,540","115,701","7,405,839","8,796,466",84.19
Chahar Mahaal and Bakhtiari,"359,578",73.01,"22,689",4.61,"4,127",0.84,"106,099",21.54,"495,446","2,953","492,493","562,238",87.60
South Khorasan,"285,984",75.01,"3,962",1.04,928,0.24,"90,363",23.70,"383,157","1,920","381,237",,
Khorasan Razavi,"2,214,801",70.14,"44,809",1.42,"13,561",0.43,"884,570",28.01,"3,181,990","24,249","3,157,741",,
North Khorasan,"341,104",74.00,"4,129",0.90,"2,478",0.54,"113,218",24.56,"464,001","3,072","460,929",,
Khuzestan,"1,303,129",64.81,"139,124",6.92,"15,934",0.79,"552,636",27.48,"2,038,845","28,022","2,010,823","2,801,644",71.77
Zanjan,"444,480",76.56,"7,276",1.25,"2,223",0.38,"126,561",21.80,"585,721","5,181","580,540","632,160",91.83
Semnan,"295,177",77.78,"4,440",1.17,"2,147",0.57,"77,754",20.49,"383,308","3,790","379,518","436,492",86.95
Sistan and Baluchestan,"450,269",46.07,"6,616",0.68,"12,504",1.28,"507,946",51.97,"982,920","5,585","977,335","1,306,624",74.80
Fars,"1,758,026",70.18,"23,871",0.95,"16,277",0.65,"706,764",28.21,"2,523,300","18,362","2,504,938","2,842,209",88.13
Qazvin,"498,061",72.57,"7,978",1.16,"2,690",0.39,"177,542",25.87,"692,355","6,084","686,271","749,205",91.60
Qom,"422,457",71.66,"16,297",2.76,"2,314",0.39,"148,467",25.18,"599,040","9,505","589,535","655,988",89.87
Kurdistan,"315,689",52.75,"7,140",1.19,"13,862",2.32,"261,772",43.74,"610,756","12,293","598,463","943,818",63.41
Kerman,"1,160,446",77.59,"12,016",0.80,"4,977",0.33,"318,250",21.28,"1,505,814","10,125","1,495,689","1,738,280",86.04
Kermanshah,"573,568",59.14,"11,258",1.16,"10,798",1.11,"374,188",38.58,"983,422","13,610","969,812","1,231,672",78.74
Kohgiluyeh & Boyer-Ahmad,"253,962",69.44,"8,542",2.34,"4,274",1.17,"98,937",27.05,"368,707","2,992","365,715","415,694",87.98
Golestan,"515,211",60.11,"5,987",0.70,"10,097",1.18,"325,806",38.01,"869,453","12,352","857,101","1,059,769",80.88
Gilan,"998,573",67.86,"12,022",0.82,"7,183",0.49,"453,806",30.84,"1,483,258","11,674","1,471,584","1,576,046",93.37
Lorestan,"677,829",70.91,"14,920",1.56,"44,036",4.61,"219,156",22.93,"964,270","8,329","955,941","1,124,940",84.98
Mazandaran,"1,289,257",67.70,"19,587",1.03,"10,050",0.53,"585,373",30.74,"1,919,838","15,571","1,904,267","1,915,240",99.43
Markazi,"572,988",73.64,"10,057",1.29,"4,675",0.60,"190,349",24.46,"785,961","7,892","778,069","885,557",87.86
Hormozgan,"482,990",65.50,"7,237",0.98,"5,126",0.70,"241,988",32.82,"743,024","5,683","737,341","919,908",80.15
Hamadan,"765,723",75.86,"13,117",1.30,"12,032",1.19,"218,481",21.65,"1,019,169","9,816","1,009,353","1,256,250",80.35
Yazd,"337,178",55.83,"8,406",1.39,"2,565",0.42,"255,799",42.35,"609,856","5,908","603,948","609,341",99.11

The file above is election-iran-2009.csv
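The vote counts in election-iran-2009.csv are quoted strings with thousands separators (for example "325,911"), so they need cleaning before they can go into ones_and_tens_digit_histogram. The following is a minimal sketch of one way to do that with the standard csv module. The helper name parse_vote_counts and its behavior (returning one flat list of integers and skipping blank cells) are assumptions made for illustration; the exact specification of extract_election_votes is not shown in this excerpt.

```python
import csv
import io

# Header and two rows copied from election-iran-2009.csv above.
SAMPLE = '''Region,Ahmadinejad,% ,Rezai,%,Karrubi,%,Mousavi,%,Total votes,Invalid votes,Valid votes,Eligible voters,"Turnout, %"
Ardabil,"325,911",51.11,"6,578",1.03,"2,319",0.36,"302,825",47.49,"642,005","4,372","637,633","804,881",79.22
South Khorasan,"285,984",75.01,"3,962",1.04,928,0.24,"90,363",23.70,"383,157","1,920","381,237",,
'''


def parse_vote_counts(csv_file, column_names):
    """Hypothetical helper (not part of the starter code): return the
    integer vote counts found in the named columns, row by row."""
    votes = []
    for row in csv.DictReader(csv_file):
        for name in column_names:
            cell = row[name]
            if cell:  # skip blank cells rather than crash on int("")
                # Strip thousands separators: "325,911" -> 325911
                votes.append(int(cell.replace(",", "")))
    return votes


counts = parse_vote_counts(
    io.StringIO(SAMPLE), ["Ahmadinejad", "Rezai", "Karrubi", "Mousavi"]
)
print(counts)
```

With a real file, the same helper could be called on an open file object instead of io.StringIO. csv.DictReader handles the quoted fields, so the commas inside "325,911" are not mistaken for column separators.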








3/5/23, 11:40 PM
Programming - CSE 160

Info: Watch this video by a former TA that gives a previous quarter's overview of the assignment. Note that Problems 3 through 7 in the video are actually Problems 2 through 6 below.

One way to detect fraud in election results is to examine the least significant digits of the vote totals: the ones place and the tens place. The ones place and the tens place don't affect who wins. They are essentially random noise, in the sense that in any real election, each value is equally likely. Another way to say this is that we expect the ones and tens digits to be uniformly distributed; that is, 10% of the digits should be "0", 10% should be "1", and so forth. If these digits are not uniformly distributed, then it is likely that the numbers were made up by a person rather than collected from ballot boxes (people tend to be poor at making truly random numbers).

It is important to note that a non-uniform distribution does not necessarily mean that the data is fraudulent. A non-uniform distribution is a strong signal of fraudulent data, but it is possible for a non-uniform distribution to appear naturally.

Warning: DO NOT modify the function names, parameters, or output of the functions. Doing so will result in your assignment being graded incorrectly.

You will complete this assignment by writing several functions, plus a main function that is responsible for calling the others:

- extract_election_votes(filename, column_names)
- plot_iran_least_digits_histogram(histogram)
- plot_dist_by_sample_size()
- mean_squared_error(numbers1, numbers2)
- calculate_mse_with_uniform(histogram)
- compare_iran_mse_to_samples(iran_mse, number_of_iran_datapoints)

Tip: Writing additional functions and tests will be helpful in completing this assignment. In fact, having additional tests is required.

Download the starter code. Then, extract (unzip) the contents anywhere on your computer. Take a look through the files.
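To make the "uniformly distributed" expectation concrete, here is a minimal sketch of how a digit histogram could be compared against the ideal uniform histogram, where every digit has frequency 0.1. The function names come from the assignment's list, but the bodies are assumptions: this excerpt does not include the assignment's specification, so the sketch uses the standard definition of mean squared error (the average of the squared differences).

```python
def mean_squared_error(numbers1, numbers2):
    """Average of the squared element-wise differences of two
    equal-length lists (standard MSE definition, assumed here)."""
    assert len(numbers1) == len(numbers2)
    total = 0
    for a, b in zip(numbers1, numbers2):
        total += (a - b) ** 2
    return total / len(numbers1)


def calculate_mse_with_uniform(histogram):
    """MSE between a 10-entry digit histogram and the uniform
    histogram in which each digit appears 10% of the time."""
    uniform = [0.1] * 10
    return mean_squared_error(histogram, uniform)


# Histogram from the starter code's docstring example [127, 426, 28, 9, 90].
hist = [0.2, 0.0, 0.3, 0.0, 0.0, 0.0, 0.1, 0.1, 0.1, 0.2]
mse = calculate_mse_with_uniform(hist)
print(mse)  # approximately 0.01
```

A larger value means the digits are further from uniform. As the text notes, that is a signal worth investigating, not proof of fraud; with only five numbers, a deviation this size is unsurprising.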
Here are some things to note:

- Inside of fraud_detection.py, the call to main is on the last line of the file. Your program should not execute any code, other than the main function, when it is loaded. All code from now on should be inside a function, never at the global level. Everything you want to happen when you run this file should go inside of main, including assert statements.
- Inside of fraud_detection_tests.py, you'll see only one test function. However, you are expected to create at least two additional tests to test the functionality of your code as you work through the assignment.

Testing Tips

- We have not provided tests or exact results to check your program against. We encourage you to write your own tests and to use assert statements.
    - Refer to the starter code given in fraud_detection_tests.py for an example of organizing your test functions.
- We HIGHLY encourage you to write tests before writing the functions. Doing so will allow you to check your work as you write your functions and ensure that you catch more bugs in your code.
- You do not need to test functions that generate plots or print output. Additionally, you do not need to create extra data files for testing, although you are welcome to do so to improve your test suite. However, since you cannot turn in those extra data files, you should comment those tests out in your final submission.
- To compare two floating point numbers (e.g., 3.1415 and 2.71828), use math.isclose() instead of ==.

https://courses.cs.washington.edu/courses/cse160/23wi/homework/a6/programming/

1) Problem 1: Read and clean Iranian election data
2) Problem 2: Plot election data
3) Problem 3: Smaller samples have more variation
4) Problem 4: Comparing variation of samples
    4.1) Statistics background
5) Problem 5: Comparing variation of samples
    5.1) Part 1
    5.2) Part 2
6) Problem 6: Interpret your results
    6.1) Example 1
    6.2) Example 2
    6.3) Example 3
    6.4) Example 4
7) Quality
    7.1) Code Tests
8) Submission
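The testing tip about math.isclose() can be illustrated with a short sketch. The helper lists_close is hypothetical (it is not part of the starter code); it simply wraps the element-by-element loop used in fraud_detection_tests.py and adds a length check.

```python
import math

total = 0.1 + 0.2
print(total == 0.3)              # False: floating point rounding error
print(math.isclose(total, 0.3))  # True: equal within a relative tolerance


def lists_close(actual, expected):
    """Hypothetical helper: True if both lists have the same length and
    every pair of elements is close according to math.isclose."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e) for a, e in zip(actual, expected)
    )


assert lists_close([0.1 + 0.2, 1 / 3], [0.3, 0.3333333333333333])

# Caveat: the default tolerance is relative, so a tiny nonzero value is
# never "close" to exactly 0.0 unless abs_tol is supplied.
print(math.isclose(1e-17, 0.0))                # False
print(math.isclose(1e-17, 0.0, abs_tol=1e-9))  # True
```

The zero caveat matters for histograms like the one in the starter test, where several expected frequencies are exactly 0.0. The provided test passes because those entries are computed as exactly 0.0, but a test on values that are only approximately zero would need abs_tol.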
