Question: C++ The purpose of a String Table in a compiler is to store the spelling of identifiers and character (string) literals once, though they may

C++

The purpose of a String Table in a compiler is to store the spelling of identifiers and character (string) literals once, though they may appear several times in the source. The general idea is to convert a string into a single integer reference, so that the reference is used multiple times by the compiler, and the spelling can be queried from the String eliminates searching completely (ideally). A Hash Function is given the key of an object (usually a string) and returns an index into the array where the object with that key should be stored. The exact implementation of this function can be problem dependent (the hash function for a compiler may not be the best hash function for another project.) A Collision occurs when two or more objects with different keys hash to the same array index.

How can two or more objects occupy the same position in the array? One solution to this problem is to view each element of the array as a bucket.

A bucket is one element of an array that can hold multiple objects simultaneously.

A bucket is implemented as another data structure, often a dynamic linked list to conserve memory.

When this solution is used, the Hash Table is an array of pointers, where each element is a pointer to the head of a linked list. Example: - the objects being stored are simple strings (as in our compiler) - the Hash Table has 4 buckets [0..3], each a pointer to a node with a string - int hash(string item) { return item[0] % 4; } // ascii code of first char mod 4 - Insert: each of these words individually, in the order given: String ASCII Code of first char Hash Value The 84 0 only 111 3 winning 119 3 move 109 1 is 105 1 not 110 2 to 116 0 play 112 0 - Resulting Hash Table: Searching the Hash Table: The search should be as simple as getting the Hash Value of the key for which we are searching, then simply indexing the array: hashTable[hash(searchKey)] But you can see the problem: what if there is more than one value in the bucket? Then we must use a search (in our example, a Linear Search since a Linked List is linear) on the bucket. When there are many collisions, as in our example, using a Hash Table results in very little benefit over a simple array or linked list with plain Linear Search. Thus, for a Hash Table to be truly useful, collisions must be avoided! How can we reduce the number of collisions? The first answer is: more buckets! In fact, most Hash Tables contain many empty buckets, often more than half. But doesnt this waste space? Yes, but since each bucket is a pointer (and not a whole object), the space wasted is relatively small. BUT, adding more buckets is not enough. If we increase the number of buckets in our example to 1,000, but keep the same Hash Function, we gain nothing. So the second part of the answer is: write a better hash function to evenly distribute values across the table. For example: we know that the ASCII codes of all characters is between 0 and 127. So we could make 127 buckets, and use a Hash Function of: int hash(string item) { return item[0]; } // returns a value between 0 and 127 The more values a Hash Table is likely to store, the more buckets are needed, along with a better Hash Function to evenly distribute hashed keys to buckets. Collisions usually cannot be avoided, but they should be rare. The effectiveness of a Hash Table can be measured as apercentage:

Number of collisions / total number of entries

Implement a String Table as described in the notes, or one very similar that would.

C++ class interface: (StringTable.h) const int STRTBL_NUM_BUCKETS = 1000; // a node in a linked list struct StringTableEntry { std::string data; StringTableEntry* next = NULL; }; typedef StringTableEntry* StringTableRef; class StringTable { public: StringTable(); ~StringTable(); StringTableRef insert(string item); StringTableRef search(string searchName); string search(StringTableRef ref); void print(); void destruct(); private: StringTableRef bucket[STRTBL_NUM_BUCKETS]; int hash(string item); int hashVal; int numCollisions = 0; int numEntries = 0; };

Constructor: Initialize the buckets to NULL Destructor: just invoke destruct() Destruct(): deallocate all dynamic memory and set all buckets to NULL Insert(): Given: a new string to insert If the string is already in the table, return a reference to it (do not add it again). If not, add it to the String Table and return a reference to the new node/entry. Search(string): Return a reference to the node (entry) that matches the given searchName. Return NULL when not found. Search(ref): Return the data of the node pointed to by the given reference, or empty string when the reference is NULL. This is almost a needless function, except it allows any code using the class to not use pointers directly (well do the ref->data for them if they want). Hash(): Given a string, return the bucket number (index into the array). Write a good/clever hash function that distributes entries across the table to avoid collisions. 10% of the grade will depend on how low your collision percentage is (see print() below) simpleHash(string s) { int sum = 0; for (int i=0; i

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!