Question:
Project Description
The words query and topic are interchangeable.
For this project, you are required to build the Query Processor. You will use the same dataset as in the previous two projects.
In this project, you will implement the query processing and retrieval portion of the search engine, building on your earlier project implementation. Your code should support the Vector Space Model and use cosine similarity as the relevance measure. Your IR engine should be capable of calculating TF-IDF weights for all the terms in the collection and in the query.
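As a minimal sketch of the relevance measure described above, the following Python shows cosine similarity between two sparse TF-IDF vectors. The function names, the dict-based vector representation, and the raw tf × idf weighting are illustrative assumptions; the class slides may prescribe a different weighting variant (for example, log-scaled tf).

```python
import math

def tfidf_vector(term_counts, idf):
    """Turn raw term counts into TF-IDF weights.

    term_counts: {term: tf}; idf: {term: idf value}.
    Terms without an idf entry (not in the collection) are dropped.
    Assumption: weight = tf * idf, with no log scaling of tf.
    """
    return {t: tf * idf[t] for t, tf in term_counts.items() if t in idf}

def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors ({term: weight})."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(w * vec_b[t] for t, w in vec_a.items() if t in vec_b)
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

The same two functions can be applied to a document and to a query, since both are represented as term-weight vectors in the Vector Space Model.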
Note: You only need to store the term frequency (tf) in the forward and inverted indexes (this is what we did in an earlier project). The IDF and the cosine similarity measure can be computed at runtime.
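To illustrate why storing only tf is sufficient, here is a small sketch that derives IDF at runtime from the length of each postings list. The toy index, the document IDs, and the base-10 logarithm are assumptions made for the example, not details taken from the assignment.

```python
import math

# Hypothetical toy inverted index that stores only tf: term -> {doc_id: tf}
inverted_index = {
    "search": {"d1": 2, "d2": 1},
    "engine": {"d1": 1},
    "query": {"d2": 3, "d3": 1},
}
num_docs = 3  # total number of documents in the toy collection

def idf(term, index, n_docs):
    """Inverse document frequency, computed on demand.

    The document frequency is the number of postings for the term,
    so no idf value needs to be stored in the index.
    """
    postings = index.get(term)
    if not postings:
        return 0.0  # term does not occur in the collection
    return math.log10(n_docs / len(postings))

def tfidf(term, doc_id, index, n_docs):
    """TF-IDF weight of a term in a document (assumed: raw tf * idf)."""
    tf = index.get(term, {}).get(doc_id, 0)
    return tf * idf(term, index, n_docs)
```

For example, `tfidf("query", "d2", inverted_index, num_docs)` multiplies the stored tf of 3 by an IDF derived from the two documents that contain "query".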
The Vector Space Model and the cosine similarity measure are detailed in the textbook chapter and the class slides. The accompanying figure describes the basic algorithm for computing vector space scores using an inverted index.
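The figure itself is not reproduced here, so the following is only a sketch of one common term-at-a-time way to compute vector space scores from an inverted index: accumulate a partial dot product per document, then normalize by document length. All names (`cosine_scores`, `document_norms`, the index layouts) are hypothetical, and the weighting is assumed to be raw tf × idf.

```python
import heapq
import math
from collections import Counter, defaultdict

def document_norms(forward_index, index, n_docs):
    """Euclidean length of each document's TF-IDF vector.

    forward_index: {doc_id: {term: tf}}; index: {term: {doc_id: tf}}.
    """
    norms = {}
    for doc_id, term_counts in forward_index.items():
        total = 0.0
        for term, tf in term_counts.items():
            idf = math.log10(n_docs / len(index[term]))
            total += (tf * idf) ** 2
        norms[doc_id] = math.sqrt(total)
    return norms

def cosine_scores(query_terms, index, norms, n_docs, k=10):
    """Return the top-k (doc_id, score) pairs for a tokenized query."""
    scores = defaultdict(float)  # one accumulator per candidate document
    for term, q_tf in Counter(query_terms).items():
        postings = index.get(term)
        if not postings:
            continue  # query term not in the collection
        idf = math.log10(n_docs / len(postings))
        w_query = q_tf * idf
        for doc_id, tf in postings.items():
            scores[doc_id] += w_query * (tf * idf)
    # Normalize by document length. The query length is the same for every
    # document, so skipping it does not change the ranking.
    ranked = {d: s / norms[d] for d, s in scores.items() if norms[d] > 0}
    return heapq.nlargest(k, ranked.items(), key=lambda item: item[1])
```

Only documents that share at least one term with the query receive an accumulator, which is the main advantage of scoring through the inverted index rather than comparing the query against every document.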
There is a query file topics.txt containing four queries. You need to process them, search for each query in your index, rank the documents retrieved and store the output in a file. The format in which you have to store the output is explained in the readme.txt file.
Each query in the file contains additional information which you can make use of for this task. In particular, each query has three fields: title, description and narrative. For this task, you can compare performance when you consider (1) only the main query (title), (2) the description along with the main query (description + title), and (3) the narrative along with the main query (narrative + title). For performance measures, you can use Precision and Recall introduced in class.
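The comparison described above could be organized roughly as follows. This sketch assumes each topic has already been parsed into a dict with hypothetical keys `title`, `description` and `narrative`, and that the relevant documents for a topic have been loaded from main.qrels into a set; the actual file formats are defined in readme.txt and are not assumed here.

```python
def build_query_variants(topic):
    """Build the three query texts to compare for one topic.

    topic: dict with 'title', 'description', 'narrative' strings
    (hypothetical field names for already-parsed topic data).
    """
    title = topic["title"]
    return {
        "title": title,
        "description+title": f"{title} {topic['description']}",
        "narrative+title": f"{title} {topic['narrative']}",
    }

def precision_recall(retrieved, relevant):
    """Set-based precision and recall.

    retrieved: iterable of doc IDs returned for the query.
    relevant: set of doc IDs judged relevant for the query.
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Running each variant through the same retrieval code and calling `precision_recall` on the results gives three (precision, recall) pairs per topic that can be compared directly.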
Resources to be provided
Following are the files that you will need for this project:
main.qrels: Relevance judgments file.
topics.txt: Queries.
sampleoutput.txt: A sample file showing what the output of your Processor should look like.
readme.txt: Explains the format of each file in the directory.
