Question:

2. Consider the following three documents:

Document 1: Glimpse is an indexing and query system that allows for search through a file system or document collection quickly. Glimpse is the default search engine in a larger information retrieval system. It has also been used as part of some web based search engines.

Document 2: The main processes in a retrieval system are document indexing, query processing, query evaluation and relevance feedback. Among these, efficient updating of the index is critical in large scale systems.

Document 3: Clusters are created from short snippets of documents retrieved by web search engines which are as good as clusters created from the full text of web documents.

(a) Remove stop words and punctuation, and apply Porter's stemming algorithm to the three documents. To save time, use an online stemming application for this purpose, e.g., https://9ol.es/porter_js_demo.html, https://text-processing.com/demo/stem/, or http://textanalysisonline.com/nltk-porter-stemmer. Note: the scripts only stem the documents; you need to remove the stop words afterwards.

(b) Create an inverted index of the three documents, including the dictionary and the postings. The dictionary should also contain, for each term, statistics such as the total number of occurrences in the collection and the document frequency. The postings for each term should contain the document ids and the term frequencies. Depict multiple postings for a term as a linked list, similar to Figure 1.3 in the IR Book.

(c) What are the search results for the following Boolean queries? In each case, explain how you obtained them from the inverted index.
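
As a rough illustration of parts (b) and (c), the following Python sketch builds a dictionary and postings lists for the three documents and evaluates Boolean operators by merging postings. It rests on several assumptions that are not part of the question: the stop list is a small hand-picked set, no stemming is applied (the exercise delegates that to an online Porter stemmer), and the example queries are hypothetical because the original queries are not shown here. All function names are illustrative.

```python
import re
from collections import Counter, defaultdict

# Assumption: a small illustrative stop list, not the one the exercise intends.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "of", "in", "or", "and", "for", "that",
    "by", "as", "it", "has", "been", "from", "which", "through", "these",
    "some", "also", "used", "part", "among",
}

DOCS = {
    1: ("Glimpse is an indexing and query system that allows for search "
        "through a file system or document collection quickly. Glimpse is "
        "the default search engine in a larger information retrieval system. "
        "It has also been used as part of some web based search engines."),
    2: ("The main processes in a retrieval system are document indexing, "
        "query processing, query evaluation and relevance feedback. Among "
        "these, efficient updating of the index is critical in large scale "
        "systems."),
    3: ("Clusters are created from short snippets of documents retrieved by "
        "web search engines which are as good as clusters created from the "
        "full text of web documents."),
}


def tokenize(text):
    """Lowercase, drop punctuation and stop words. No stemming is applied."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOP_WORDS]


def build_index(docs):
    """Return (dictionary, postings).

    dictionary[term] = (collection_frequency, document_frequency)
    postings[term]   = [(doc_id, term_frequency), ...] sorted by doc_id
    """
    postings = defaultdict(list)
    for doc_id in sorted(docs):
        for term, tf in Counter(tokenize(docs[doc_id])).items():
            postings[term].append((doc_id, tf))
    dictionary = {term: (sum(tf for _, tf in plist), len(plist))
                  for term, plist in postings.items()}
    return dictionary, dict(postings)


def doc_ids(postings, term):
    """Sorted doc ids for a term (empty if the term is not indexed)."""
    return [doc_id for doc_id, _ in postings.get(term, [])]


def intersect(p1, p2):
    """AND: linear merge of two sorted doc-id lists."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def union(p1, p2):
    """OR: all doc ids that appear in either list."""
    return sorted(set(p1) | set(p2))


def complement(p, all_ids):
    """NOT: doc ids in the collection that are absent from the list."""
    return [doc_id for doc_id in all_ids if doc_id not in p]


dictionary, postings = build_index(DOCS)

# Hypothetical queries, chosen only to exercise the three operators.
web_and_retrieval = intersect(doc_ids(postings, "web"),
                              doc_ids(postings, "retrieval"))
web_or_retrieval = union(doc_ids(postings, "web"),
                         doc_ids(postings, "retrieval"))
not_search = complement(doc_ids(postings, "search"), sorted(DOCS))
```

Because the sketch skips stemming, surface forms such as "system" and "systems" or "document" and "documents" stay separate terms, so the counts it produces will differ from the answer the exercise expects once Porter's algorithm has been applied. Running the stemmed text through the same functions should give the intended dictionary and postings.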
