Question (information retrieval): Consider the term-document count matrix for a collection of 10 documents and do the following (2 points each).
Q1) Calculate the IDF vector for the terms above.
Q2) Calculate the TF-IDF vectors for documents 1 to 7.
Q3) Using the values in Q2, calculate the L2 norm for documents 1 to 7.
Q4) Using the answers to Q2 and Q3, compute the cosine similarity score between Doc1 and documents 2 to 7, i.e. cos(Doc1, Doc2), cos(Doc1, Doc3), cos(Doc1, Doc4), ..., cos(Doc1, Doc7).
Q5) Rank documents 2 to 7 by their similarity to Doc1, from most similar to least similar.
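The five steps asked for here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the count matrix below is a made-up toy example with 3 terms and 4 documents (the matrix from the question is not reproduced here), IDF is assumed to be log10(N / df), and TF-IDF is assumed to be the raw count times IDF. Your course may use a different weighting, such as a log-scaled term frequency, so check the definitions you were given before relying on the numbers.

```python
import math

# Hypothetical toy count matrix, NOT the one from the question.
# Each row is a term; each column is a document (Doc1..Doc4).
counts = {
    "car":       [2, 1, 0, 0],
    "insurance": [1, 0, 3, 0],
    "best":      [0, 0, 1, 2],
}
n_docs = 4  # the question uses N = 10; this toy collection has 4


def idf_vector(counts, n_docs):
    """Q1: IDF per term, assuming idf = log10(N / df)."""
    idf = {}
    for term, row in counts.items():
        df = sum(1 for c in row if c > 0)  # number of documents containing the term
        idf[term] = math.log10(n_docs / df) if df else 0.0
    return idf


def tfidf_vectors(counts, idf, n_docs):
    """Q2: one TF-IDF vector per document, assuming weight = raw count * idf."""
    terms = list(counts)
    return [[counts[t][d] * idf[t] for t in terms] for d in range(n_docs)]


def l2_norm(vec):
    """Q3: Euclidean (L2) length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def cosine(u, v):
    """Q4: dot product divided by the product of the two L2 norms."""
    nu, nv = l2_norm(u), l2_norm(v)
    if nu == 0 or nv == 0:
        return 0.0  # an all-zero vector is treated as dissimilar to everything
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


idf = idf_vector(counts, n_docs)
vectors = tfidf_vectors(counts, idf, n_docs)
norms = [l2_norm(v) for v in vectors]

# Q4: similarity of Doc1 (index 0) to every other document.
scores = {f"Doc{d + 1}": cosine(vectors[0], vectors[d]) for d in range(1, n_docs)}

# Q5: most similar first.
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(name, round(scores[name], 3))
```

To apply this to the actual problem, replace `counts` with the rows of the given matrix, set `n_docs = 10`, and restrict the comparison loop to documents 2 to 7.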
