Question: Please answer A, if you can also answer B I will be greatly appreciative. Consider the following document-term table with 10 documents and 8 terms

Please answer (a); if you can also answer (b), I would greatly appreciate it.

Consider the following document-term table with 10 documents and 8 terms (A through H) containing raw term frequencies. We also have a specified query, Q, with the indicated raw term weights (the bottom row in the table). Answer the following questions, and in each case give the formulas you used to perform the necessary computations.

Note: do this using a spreadsheet program such as Microsoft Excel. Alternatively, you can write a program to perform the computations. Please include your worksheets or code in the assignment submission.

(a) Compute the ranking score for each document based on each of the following query-document similarity measures (sort the documents in decreasing order of the rank score):

- Dot product
- Cosine similarity
- Jaccard's coefficient
- Dice's coefficient, using the formula Dice(A, B) = 2AB / (A + B)

(b) Construct a table similar to the one above, but instead of raw term frequencies compute the tf-idf weights for the terms (not normalized). Then compute the ranking scores using cosine similarity. Explain any significant differences between the ranking you obtained here and the cosine ranking from the previous part.
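For anyone taking the "write a program" route, here is a minimal Python sketch of the computations in parts (a) and (b). Several things in it are assumptions, not part of the question: the weighted-vector forms of Jaccard and Dice (sums of squared weights in the denominator, which is one common reading of the formula given), the idf definition log2(N / df), and applying idf to the query as well as the documents. Your course notes may define these differently. The table is not reproduced in this post, so `toy_docs` and `toy_query` are made-up placeholder values with 3 documents and 3 terms; replace them with the real 10 x 8 table and the Q row.

```python
import math

def dot(d, q):
    """Dot product: sum of d_i * q_i."""
    return sum(x * y for x, y in zip(d, q))

def cosine(d, q):
    """Cosine similarity: dot(d, q) / (|d| * |q|)."""
    denom = math.sqrt(dot(d, d)) * math.sqrt(dot(q, q))
    return dot(d, q) / denom if denom else 0.0

def jaccard(d, q):
    """Weighted Jaccard (assumed form): dot / (sum d^2 + sum q^2 - dot)."""
    denom = dot(d, d) + dot(q, q) - dot(d, q)
    return dot(d, q) / denom if denom else 0.0

def dice(d, q):
    """Weighted Dice (assumed form): 2 * dot / (sum d^2 + sum q^2)."""
    denom = dot(d, d) + dot(q, q)
    return 2 * dot(d, q) / denom if denom else 0.0

def idf_weights(doc_vectors):
    """Assumed idf: log2(N / df), where df counts documents containing the term."""
    n_docs = len(doc_vectors)
    idf = []
    for t in range(len(doc_vectors[0])):
        df = sum(1 for vec in doc_vectors if vec[t] > 0)
        idf.append(math.log2(n_docs / df) if df else 0.0)
    return idf

def tfidf(vec, idf):
    """Unnormalized tf-idf: raw term frequency times idf."""
    return [tf * w for tf, w in zip(vec, idf)]

def rank(docs, q, measure):
    """Return (document, score) pairs sorted by decreasing score."""
    scores = [(name, measure(vec, q)) for name, vec in docs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Placeholder data only (NOT the assignment's table): 3 documents, 3 terms.
toy_docs = {"D1": [2, 0, 1], "D2": [1, 3, 0], "D3": [1, 1, 1]}
toy_query = [1, 0, 1]

# Part (a): rank under each raw-frequency measure.
for measure in (dot, cosine, jaccard, dice):
    print(measure.__name__, rank(toy_docs, toy_query, measure))

# Part (b): re-weight documents and query with tf-idf, then rank by cosine.
idf = idf_weights(list(toy_docs.values()))
weighted_docs = {name: tfidf(vec, idf) for name, vec in toy_docs.items()}
weighted_query = tfidf(toy_query, idf)
print("tf-idf cosine", rank(weighted_docs, weighted_query, cosine))
```

In the placeholder data the first term occurs in every document, so its idf is zero and it stops contributing to the cosine score in part (b). Effects of that kind (common terms being discounted, rare terms gaining weight) are what to look for when explaining ranking differences on the real table.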
