Question:

Theme: Understanding relevance-based and centrality-based information retrieval systems.
Task: Index the document collection provided for this assignment, and compare the retrieval performance using different weighting methods.
Weighting methods: (i) TF-IDF, (ii) BM25, (iii) PageRank, (iv) HITS, (v) a weighted linear function combining relevance measures (TF-IDF, BM25) and centrality measures (PageRank, HITS).
Dataset: Collect the Wikimedia dump from the given dataset link. The dump contains a list of Wikipedia pages, each with its title, text, and hyperlinks. Treat the page titles as the queries.
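
The five weighting methods named in the question can be sketched in plain Python as follows. This is only a minimal illustration under stated assumptions, not the assignment's reference solution: `PAGES` is a hypothetical three-page toy corpus standing in for the parsed Wikimedia dump, all function names are made up for this example, and the parameters (`k1`, `b`, `damping`, `alpha`) are common defaults rather than values given in the question. TF-IDF here is the simple raw-count variant tf × log(N/df), and the BM25 inverse document frequency uses the smoothed form log(1 + (N − df + 0.5)/(df + 0.5)).

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: title -> (text, outgoing links). Not the assignment data.
PAGES = {
    "Information retrieval": ("information retrieval finds relevant documents for a query",
                              ["Search engine", "PageRank"]),
    "Search engine": ("a search engine ranks web documents for a user query",
                      ["Information retrieval", "PageRank"]),
    "PageRank": ("pagerank scores web pages using the hyperlink graph",
                 ["Search engine"]),
}

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Return per-document term counts, document frequencies, and average length."""
    tf = {title: Counter(tokenize(text)) for title, (text, _) in pages.items()}
    df = Counter(term for counts in tf.values() for term in counts)
    avg_len = sum(sum(c.values()) for c in tf.values()) / len(tf)
    return tf, df, avg_len

def tfidf_score(query, doc, tf, df):
    # Raw term frequency times log(N / df), summed over query terms in the document.
    n = len(tf)
    return sum(tf[doc][t] * math.log(n / df[t]) for t in tokenize(query) if t in tf[doc])

def bm25_score(query, doc, tf, df, avg_len, k1=1.2, b=0.75):
    n, doc_len = len(tf), sum(tf[doc].values())
    score = 0.0
    for t in tokenize(query):
        f = tf[doc][t]
        if f == 0:
            continue
        idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * doc_len / avg_len))
    return score

def link_graph(pages):
    # Keep only links that point at pages inside the collection.
    return {u: [v for v in links if v in pages] for u, (_, links) in pages.items()}

def pagerank(pages, damping=0.85, iterations=50):
    out, n = link_graph(pages), len(pages)
    rank = {u: 1 / n for u in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[u] for u in pages if not out[u])
        new = {u: (1 - damping + damping * dangling) / n for u in pages}
        for u in pages:
            for v in out[u]:
                new[v] += damping * rank[u] / len(out[u])
        rank = new
    return rank

def hits(pages, iterations=50):
    out = link_graph(pages)
    hub = dict.fromkeys(pages, 1.0)
    auth = dict.fromkeys(pages, 1.0)
    for _ in range(iterations):
        # Authority: sum of hub scores of pages linking in; hub: sum of linked authorities.
        auth = {v: sum(hub[u] for u in pages if v in out[u]) for v in pages}
        norm = math.sqrt(sum(a * a for a in auth.values())) or 1.0
        auth = {v: a / norm for v, a in auth.items()}
        hub = {u: sum(auth[v] for v in out[u]) for u in pages}
        norm = math.sqrt(sum(h * h for h in hub.values())) or 1.0
        hub = {u: h / norm for u, h in hub.items()}
    return hub, auth

def normalize(scores):
    # Scale scores into [0, 1] so relevance and centrality are comparable.
    top = max(scores.values()) or 1.0
    return {k: v / top for k, v in scores.items()}

def combined_ranking(query, pages, alpha=0.7):
    """Rank pages by alpha * BM25 + (1 - alpha) * PageRank, both max-normalized."""
    tf, df, avg_len = build_index(pages)
    relevance = normalize({d: bm25_score(query, d, tf, df, avg_len) for d in pages})
    centrality = normalize(pagerank(pages))
    scores = {d: alpha * relevance[d] + (1 - alpha) * centrality[d] for d in pages}
    return sorted(scores, key=scores.get, reverse=True)
```

To follow the question's setup, each page title would be passed as the query (for example `combined_ranking("PageRank", PAGES)`), and the rank at which the title's own page comes back could then be compared across the five methods. Swapping `bm25_score` for `tfidf_score`, or `pagerank` for the authority scores from `hits`, gives the other combinations the task asks about.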
