Question:

The second homework focuses on index construction and vector space model, which includes:

1. Consider the following document-term table with 10 documents and 8 terms (A through H) containing raw term frequencies. We also have a specified query, Q, with the indicated raw term weights (the bottom row in the table). Answer the following questions, and in each case give the formulas you used to perform the necessary computations. Note: do this using a spreadsheet program such as Microsoft Excel. Alternatively, you can write a program to perform the computations. Please include your worksheets or code in the assignment submission.

A B C D E F G H
doc1 0 3 4 0 0 2 4
doc2 5 5 0 0 4 0 4
doc3
doc4
doc5
doc6
doc7 3 5 3
doc8 0 3 0 0 0 4 4
doc9 0 0 3 3 3 0 0
doc10 0 5 0 0 0 4 4 2
Query 2 1 1 0 2 0 3 0

(a) Compute the ranking score for each document based on each of the following query-document similarity measures (sort the documents in the decreasing order of the rank score):
Dot product
Cosine similarity
Dice's coefficient
Jaccard's coefficient
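For anyone taking the "write a program" route, here is a minimal Python sketch of part (a). It assumes the weighted-vector forms of the four measures commonly used with the vector space model: dot product = sum(d_i * q_i); cosine = dot / (|d| * |q|); Dice = 2 * dot / (sum(d_i^2) + sum(q_i^2)); Jaccard = dot / (sum(d_i^2) + sum(q_i^2) - dot). Check these against the definitions given in your course. The query vector is taken from the question. The doc10 row is used because it is the only document row that appears with all eight values in the table as posted, and whether it is intact is an assumption. The second document, `hypothetical_doc`, is invented purely to illustrate ranking and is not part of the assignment data.

```python
import math


def dot(d, q):
    """Dot product: sum of d_i * q_i over all terms."""
    return sum(x * y for x, y in zip(d, q))


def cosine(d, q):
    """Cosine similarity: dot product divided by the product of vector lengths."""
    denom = math.sqrt(dot(d, d)) * math.sqrt(dot(q, q))
    return dot(d, q) / denom if denom else 0.0


def dice(d, q):
    """Dice's coefficient (weighted form): 2 * dot / (sum d_i^2 + sum q_i^2)."""
    denom = dot(d, d) + dot(q, q)
    return 2 * dot(d, q) / denom if denom else 0.0


def jaccard(d, q):
    """Jaccard's coefficient (weighted form): dot / (sum d_i^2 + sum q_i^2 - dot)."""
    denom = dot(d, d) + dot(q, q) - dot(d, q)
    return dot(d, q) / denom if denom else 0.0


def rank(docs, q, measure):
    """Return (name, score) pairs sorted by decreasing score."""
    scores = {name: measure(vec, q) for name, vec in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Query weights for terms A..H, as given in the question.
QUERY = [2, 1, 1, 0, 2, 0, 3, 0]

DOCS = {
    # Row as it appears in the posted table (assumed to be intact).
    "doc10": [0, 5, 0, 0, 0, 4, 4, 2],
    # Hypothetical row, NOT from the assignment; illustration only.
    "hypothetical_doc": [2, 0, 1, 0, 2, 0, 0, 0],
}

if __name__ == "__main__":
    for measure in (dot, cosine, dice, jaccard):
        print(measure.__name__)
        for name, score in rank(DOCS, QUERY, measure):
            print(f"  {name}: {score:.4f}")
```

To answer the actual question, replace `DOCS` with the ten complete rows from the original table. Note that the dot product is not length-normalized, so it can order documents differently from the other three measures; the two sample rows above show exactly that.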
