Exercise 27.2 Assume you are given a document database that contains six documents. After stemming, the documents contain the following terms:
Document  Terms
1         car manufacturer Honda auto
2         auto computer navigation
3         Honda navigation
4         manufacturer computer IBM
5         IBM personal computer
6         car Beetle VW
Answer the following questions.
1. Show the result of creating an inverted file on the documents.
2. Show the result of creating a signature file with a width of 5 bits. Construct your own hashing function that maps terms to bit positions.
3. Evaluate the following Boolean queries using the inverted file and the signature file that you created: 'car', 'IBM' AND 'computer', 'IBM' AND 'car', 'IBM' OR 'auto', and 'IBM' AND 'computer' AND 'manufacturer'.
4. Assume that the query load against the document database consists of exactly the queries that were stated in the previous question. Also assume that each of these queries is evaluated exactly once.
(a) Design a signature file with a width of 3 bits and design a hashing function that minimizes the overall number of false positives retrieved when evaluating the query load.
(b) Design a signature file with a width of 6 bits and a hashing function that minimizes the overall number of false positives.
(c) Assume you want to construct a signature file. What is the smallest signature width that allows you to evaluate all queries without retrieving any false positives?
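
As a rough illustration of parts 1 to 3, the following Python sketch builds an inverted file and a 5-bit signature file for the six documents and evaluates the five queries against both. The hashing function `term_bit` (sum of character codes modulo the signature width) is an arbitrary assumption, since the exercise asks you to construct your own. All function names are illustrative, and terms are lowercased so that matching is case-insensitive.

```python
from collections import defaultdict

# Documents from the exercise, with terms lowercased for matching.
DOCS = {
    1: ["car", "manufacturer", "honda", "auto"],
    2: ["auto", "computer", "navigation"],
    3: ["honda", "navigation"],
    4: ["manufacturer", "computer", "ibm"],
    5: ["ibm", "personal", "computer"],
    6: ["car", "beetle", "vw"],
}

def build_inverted_file(docs):
    """Map each term to the sorted list of documents that contain it."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def term_bit(term, width):
    """Assumed hash function: sum of character codes modulo the width."""
    return sum(ord(ch) for ch in term) % width

def signature(terms, width):
    """OR together the single bit that each term hashes to."""
    sig = 0
    for term in terms:
        sig |= 1 << term_bit(term, width)
    return sig

def build_signature_file(docs, width):
    """One signature (an integer bitmask) per document."""
    return {doc_id: signature(terms, width) for doc_id, terms in docs.items()}

def exact_match(index, terms, op="AND"):
    """Evaluate a query on the inverted file by combining posting lists."""
    postings = [set(index.get(t, ())) for t in terms]
    combine = set.intersection if op == "AND" else set.union
    return sorted(combine(*postings))

def signature_candidates(sigs, terms, width, op="AND"):
    """Documents whose signature has the bit set for all (AND) or any (OR)
    query terms. The result may contain false positives."""
    def matches(doc_sig, term):
        return (doc_sig & (1 << term_bit(term, width))) != 0
    test = all if op == "AND" else any
    return sorted(d for d, s in sigs.items()
                  if test(matches(s, t) for t in terms))

# The query load from part 3.
QUERIES = [
    (["car"], "AND"),
    (["ibm", "computer"], "AND"),
    (["ibm", "car"], "AND"),
    (["ibm", "auto"], "OR"),
    (["ibm", "computer", "manufacturer"], "AND"),
]

if __name__ == "__main__":
    width = 5
    index = build_inverted_file(DOCS)
    sigs = build_signature_file(DOCS, width)
    for terms, op in QUERIES:
        exact = exact_match(index, terms, op)
        cands = signature_candidates(sigs, terms, width, op)
        false_pos = sorted(set(cands) - set(exact))
        print(f" {op} ".join(terms), "| exact:", exact,
              "| candidates:", cands, "| false positives:", false_pos)
```

The signature file can only narrow down the candidates. Any candidate that the inverted file (or a scan of the document itself) rejects is a false positive. For part 4, the same functions can be rerun with a different `width` and a different `term_bit` to compare total false-positive counts over the query load.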
