Question: 1. Consider the following hash function. int hashFunc(int x) { return (x * 2) % HASH_SIZE; } Assume HASH_SIZE = 10. Here is the hash

1. Consider the following hash function. int hashFunc(int x) { return (x * 2) % HASH_SIZE; } Assume HASH_SIZE = 10. Here is the hash tables insert function: void insert(int key) { int index = hashFunc(key); hash_array[index].push_back(key); } where hash_array is an array of lists.

2. Draw the state of hash_array on the right side after the following calls. Assume this is a chaining hash table (each element in the array is a linked list of records). The first two elements are drawn in there for you.

insert(7); insert(1); insert(23); insert(14); insert(19); insert(53); insert(37); insert(83); Is hashFunc() a good hash function? Why or why not?

3. You are hired to design a website called brutionary.com, a dictionary.com variant. You are given a list of dictionary words (and their definitions) in a text file, and would like to preprocess it, such that you can readily provide information to users who visit your website and look up words. Assume that your dictionary is not going to be updated once it is preprocessed. We still want your system to scale -- that is, it should be able to efficiently take care of a lot of queries that may come in. Assume a Word structure like the following is used to store each word and definition. struct Word { string word; string definition; }; (a) First of all, the users should be able to look up words. Which option would you take, and why? [Option A] Store Words in a binary search tree. [Option B] Store Words in a hash table. (b) You want to add a functionality that prints all words within a specified range (e.g., all words between abstain and abstract). Which option would you take, and why? [Option A] Add a sorted linked list with pointers to Words in the structure used in (a). Each time a range [x:y] is specified, we search x in the list, and then traverse the list to print each word, until we hit y. [Option B] Add a sorted vector with pointers to Words in the structure used in (a). Each time a range [x:y] is specified, we search x in the vector, and then traverse the vector to print each word, until we hit y. (c) In the right corner of the website, you want to display k most popular words, where k is some integer, which are determined by queries that were received in the past hour, and is updated every hour. Assume queries are made on n distinct words, where n >> k. Which option would you take, and why? [Option A] In the beginning of every hour, create a hash table (initially empty) that stores (word, count)-pairs. Each time a query for a word x comes in, look up x in the hash table (or add a new one if one does not exist), and increase the count for x. At the end of the hour, iterate through all (word, count)-pairs in the hash table and store them in a maxheap, using their counts as the keys. Then extract k words from the heap. [Option B] In the beginning of every hour, create a vector (initially empty) that stores (word, count)- pairs. Each time a query for a word x comes in, look up x in the vector (or add a new one if one does not exist), and increase the count for x. At the end of the hour, sort the pairs in the vector in the decreasing order of their counts, and print the first k words. Note: There may, of course, be a better design choice to each problem than ones presented above. Feel free to discuss other possibilities with your peers.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!