Question: Test various hash functions in Python to see how good they are, in terms of how many collisions they have

Note: in Python, please.

Task: We are going to test various hash functions to see how good they are, in terms of how many collisions they have. Your input will be strings. The dataset is here. This file contains just under 100,000 English words, which we're going to use to test the uniformity of various hash functions. Your hash functions will hash strings into 16-bit (not 32-bit) ints. This is important, because we're going to keep a table of the number of collisions for each hash value.

For each of the possible hash functions, your program should:

1. Create a table of size 65,536 (one entry per possible hash value).
2. Process the list of words, and for each word, compute its hash h.
3. Increment the entry in the table for that hash.
4. When finished, use Pearson's test to determine the probability that the resulting distribution is uniformly distributed. (See below for the deets.)

Hash functions to test

The hash (non)functions you should test are:

- String length (modulo 2^16).
- First character.
- Additive checksum (add all characters together), modulo 2^16.
- Remainder (use a modulo of 65413; this is the first prime that is smaller than the table size). Remember that you cannot just add up all the characters and then take the mod of the result; you have to thread the modulo through the loop that computes the sum.
- Multiplicative (using the scheme described in class/in the lecture notes). Again, remember that you can't just use the final sum; you have to incorporate the multiplicative calculation into the hashing loop.

Print out nice-looking histograms of each hash function's distribution.
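As a starting point, here is a minimal Python sketch of the steps above. Several details are assumptions, not part of the task statement: the remainder hash treats the word as a base-256 number, the multiplicative hash uses the golden-ratio constant A = (sqrt(5) - 1) / 2 (the scheme from class may differ), and the probability comes from the Wilson-Hilferty normal approximation to the chi-squared distribution instead of an exact CDF. All function names are hypothetical.

```python
import math

TABLE_SIZE = 1 << 16          # 2**16 = 65,536 possible 16-bit hash values
PRIME = 65413                 # modulus stated in the task
A = (math.sqrt(5) - 1) / 2    # ASSUMPTION: golden-ratio constant; class notes may differ

def hash_length(word):
    return len(word) % TABLE_SIZE

def hash_first_char(word):
    return ord(word[0]) % TABLE_SIZE if word else 0

def hash_additive(word):
    return sum(ord(c) for c in word) % TABLE_SIZE

def hash_remainder(word):
    # ASSUMPTION: treat the word as a base-256 number and reduce
    # modulo PRIME at every step instead of once at the end.
    h = 0
    for c in word:
        h = (h * 256 + ord(c)) % PRIME
    return h

def hash_multiplicative(word):
    # ASSUMPTION: fold each character in, multiply by A, and keep only
    # the fractional part inside the loop; scale to the table at the end.
    h = 0.0
    for c in word:
        h = ((h + ord(c)) * A) % 1.0
    return int(h * TABLE_SIZE)

HASHES = {
    "length": hash_length,
    "first char": hash_first_char,
    "additive": hash_additive,
    "remainder": hash_remainder,
    "multiplicative": hash_multiplicative,
}

def tally(words, hash_fn):
    """Count how many words land on each of the 65,536 hash values."""
    table = [0] * TABLE_SIZE
    for w in words:
        table[hash_fn(w)] += 1
    return table

def pearson_chi2(table):
    """Pearson's statistic against a uniform expectation."""
    expected = sum(table) / len(table)
    return sum((obs - expected) ** 2 / expected for obs in table)

def approx_p_uniform(chi2, dof):
    """Approximate upper-tail probability (Wilson-Hilferty approximation)."""
    spread = 2 / (9 * dof)
    z = ((chi2 / dof) ** (1 / 3) - (1 - spread)) / math.sqrt(spread)
    return 0.5 * math.erfc(z / math.sqrt(2))

def print_histogram(table, bins=32, width=50):
    """Text histogram: group the table into `bins` equal ranges."""
    per_bin = len(table) // bins
    totals = [sum(table[i * per_bin:(i + 1) * per_bin]) for i in range(bins)]
    peak = max(totals) or 1
    for i, total in enumerate(totals):
        bar = "#" * (total * width // peak)
        print(f"{i * per_bin:5d}-{(i + 1) * per_bin - 1:5d} | {bar} {total}")

def report(words):
    """Run every hash function over the words and print its results."""
    for name, fn in HASHES.items():
        table = tally(words, fn)
        chi2 = pearson_chi2(table)
        p = approx_p_uniform(chi2, TABLE_SIZE - 1)
        print(f"\n{name}: chi2 = {chi2:.1f}, approx. p(uniform) = {p:.4g}")
        print_histogram(table)
```

To use it, read the word list into a Python list (one word per line, for example) and pass it to `report(words)`. The histogram groups the 65,536 table entries into 32 ranges so the output fits on a screen; an exact probability could be obtained from a chi-squared survival function such as `scipy.stats.chi2.sf` if SciPy is available.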
