Question:

Exercise 27.2 Assume you are given a document database that contains six documents. After stemming, the documents contain the following terms:

Document  Terms
1         car manufacturer Honda auto
2         auto computer navigation
3         Honda navigation
4         manufacturer computer IBM
5         IBM personal computer
6         car beetle VW

Answer the following questions.

1. Show the result of creating an inverted file on the documents.

2. Show the result of creating a signature file with a width of 5 bits. Construct your own hashing function that maps terms to bit positions.

3. Evaluate the following boolean queries using the inverted file and the signature file that you created: 'car', 'IBM' AND 'computer', 'IBM' AND 'car', 'IBM' OR 'auto', and 'IBM' AND 'computer' AND 'manufacturer'.

4. Assume that the query load against the document database consists of exactly the queries that were stated in the previous question. Also assume that each of these queries is evaluated exactly once.

(a) Design a signature file with a width of 3 bits and design a hashing function that minimizes the overall number of false positives retrieved when evaluating the queries.

(b) Design a signature file with a width of 6 bits and a hashing function that minimizes the overall number of false positives.

(c) Assume you want to construct a signature file. What is the smallest signature width that allows you to evaluate all queries without retrieving any false positives?
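
As a starting point for parts 1 to 3, here is a minimal Python sketch of how an inverted file and a signature file could be built and queried. It is an illustration under stated assumptions, not the expected solution: the function names are made up for this example, terms are lowercased, the unnumbered last row is taken to be document 6, and `toy_hash` is a hypothetical hashing function (the exercise asks you to design your own).

```python
from collections import defaultdict

# Documents from the exercise. Assumptions: terms are lowercased and the
# unnumbered "car beetle VW" row is document 6.
DOCS = {
    1: ["car", "manufacturer", "honda", "auto"],
    2: ["auto", "computer", "navigation"],
    3: ["honda", "navigation"],
    4: ["manufacturer", "computer", "ibm"],
    5: ["ibm", "personal", "computer"],
    6: ["car", "beetle", "vw"],
}


def build_inverted_file(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def toy_hash(term, width):
    """Hypothetical hash: sum of character codes modulo the signature width."""
    return sum(ord(ch) for ch in term) % width


def signature(terms, width, hash_fn=toy_hash):
    """OR together one bit per term to form a bitmask of the given width."""
    sig = 0
    for term in terms:
        sig |= 1 << hash_fn(term, width)
    return sig


def build_signature_file(docs, width, hash_fn=toy_hash):
    """Map each document id to the signature of its terms."""
    return {doc_id: signature(terms, width, hash_fn)
            for doc_id, terms in docs.items()}


def and_exact(index, terms):
    """Exact answer to an AND query: intersect the posting lists."""
    lists = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*lists))


def and_candidates(signatures, terms, width, hash_fn=toy_hash):
    """Signature-file answer to an AND query: every document whose signature
    covers all query bits. This may include false positives."""
    query_sig = signature(terms, width, hash_fn)
    return sorted(doc_id for doc_id, sig in signatures.items()
                  if (sig & query_sig) == query_sig)


def false_positives(index, signatures, terms, width, hash_fn=toy_hash):
    """Candidates returned by the signature file that are not real matches."""
    exact = set(and_exact(index, terms))
    candidates = set(and_candidates(signatures, terms, width, hash_fn))
    return sorted(candidates - exact)


if __name__ == "__main__":
    index = build_inverted_file(DOCS)
    sigs = build_signature_file(DOCS, width=5)
    for query in (["car"], ["ibm", "computer"], ["ibm", "car"],
                  ["ibm", "computer", "manufacturer"]):
        print(query, and_exact(index, query),
              and_candidates(sigs, query, 5),
              false_positives(index, sigs, query, 5))
```

The sketch covers only the single-term and AND queries; an OR query such as 'IBM' OR 'auto' would take the union of the per-term results instead of the intersection. For part 4, the same `false_positives` helper could be called with a different `width` and a different `hash_fn` to compare candidate hashing functions against the query load.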
