1) Consider the following documents D and query Q for the following:

D1: you say goodbye

D2: hello goodbye hello goodbye hello

D3: I say hello

Q1: I hello

a. Construct the vector space term-document matrix for the above documents using tf.idf term weighting.

b. Compute the similarity between Q and the above documents using tf.idf weights and the following three (3) similarity measures:

- Inner product

- Cosine

- Jaccard

Determine their relative ranking.
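For part (a), the term-document matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an official solution: terms are lowercased and split on whitespace, the logarithm is taken as base 10 (the question does not say which base to use), and the names `docs`, `vocab`, `idf` and `tfidf` are chosen here for illustration.

```python
import math
from collections import Counter

docs = {
    "D1": "you say goodbye",
    "D2": "hello goodbye hello goodbye hello",
    "D3": "I say hello",
}

# Assumption: lowercase everything and split on whitespace.
tokens = {name: text.lower().split() for name, text in docs.items()}
vocab = sorted({term for toks in tokens.values() for term in toks})

# df_i: number of documents containing term i; D: number of documents.
n_docs = len(docs)
df = {term: sum(term in toks for toks in tokens.values()) for term in vocab}

# Assumption: base-10 logarithm for log(D / df_i).
idf = {term: math.log10(n_docs / df[term]) for term in vocab}

# w_ij = tf_ij * log(D / df_i); rows are terms, columns are documents.
tfidf = {
    name: {term: Counter(toks)[term] * idf[term] for term in vocab}
    for name, toks in tokens.items()
}

print("term".ljust(10) + "".join(name.rjust(8) for name in docs))
for term in vocab:
    row = "".join(f"{tfidf[name][term]:8.4f}" for name in docs)
    print(term.ljust(10) + row)
```

A different log base rescales every weight by the same constant, so the pattern of zero and non-zero entries in the matrix stays the same.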

APPENDIX A

tf.idf

$$w_{ij} = tf_{ij} \cdot \log\left(\frac{D}{df_i}\right)$$

where:
$tf_{ij}$ = number of occurrences of term $i$ in document $j$
$D$ = number of documents in the database
$df_i$ = number of documents in the database containing term $i$

Inner (dot) product similarity measure

$$sim(D_i, Q) = \sum_{k=1}^{t} (d_{ik} \cdot q_k)$$

Cosine similarity measure

$$Cos(D_i, Q) = \frac{\sum_{k=1}^{t} (d_{ik} \cdot q_k)}{\sqrt{\sum_{k=1}^{t} d_{ik}^2 \cdot \sum_{k=1}^{t} q_k^2}}$$

Jaccard similarity measure

$$Jaccard(D_i, Q) = \frac{\sum_{k=1}^{t} (d_{ik} \cdot q_k)}{\sum_{k=1}^{t} d_{ik}^2 + \sum_{k=1}^{t} q_k^2 - \sum_{k=1}^{t} (d_{ik} \cdot q_k)}$$

where, for all three similarity measures:
$D_i$ = document $i$
$Q$ = query
$d_{ik}$ = the weight of term $k$ in document $i$
$q_k$ = the weight of term $k$ in the query
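For part (b), the three measures can be sketched in Python as follows. This is a hedged illustration rather than the expected worked answer. It assumes the standard forms of the measures (cosine normalised by the square root of the product of the squared norms, Jaccard with the dot product subtracted in the denominator), a base-10 logarithm, a query consisting of the two terms "I hello", and query weights computed as query term frequency times the document-collection idf. All function and variable names are illustrative.

```python
import math
from collections import Counter

docs = {
    "D1": "you say goodbye",
    "D2": "hello goodbye hello goodbye hello",
    "D3": "I say hello",
}
query = "I hello"  # assumption: the query is exactly these two terms


def tokenize(text):
    # Assumption: lowercase and split on whitespace.
    return text.lower().split()


doc_tokens = {name: tokenize(text) for name, text in docs.items()}
vocab = sorted({term for toks in doc_tokens.values() for term in toks})

# idf_i = log10(D / df_i), computed from the documents only.
idf = {
    term: math.log10(len(docs) / sum(term in toks for toks in doc_tokens.values()))
    for term in vocab
}


def weight_vector(toks):
    # tf.idf weights over the document vocabulary; unseen terms are ignored.
    tf = Counter(toks)
    return [tf[term] * idf[term] for term in vocab]


def inner(d, q):
    return sum(a * b for a, b in zip(d, q))


def cosine(d, q):
    denom = math.sqrt(inner(d, d) * inner(q, q))
    return inner(d, q) / denom if denom else 0.0


def jaccard(d, q):
    dot = inner(d, q)
    denom = inner(d, d) + inner(q, q) - dot
    return dot / denom if denom else 0.0


q_vec = weight_vector(tokenize(query))
scores = {
    measure.__name__: {
        name: measure(weight_vector(toks), q_vec)
        for name, toks in doc_tokens.items()
    }
    for measure in (inner, cosine, jaccard)
}
# Rank documents from most to least similar under each measure.
rankings = {m: sorted(s, key=s.get, reverse=True) for m, s in scores.items()}

for m in scores:
    shown = ", ".join(f"{name}={scores[m][name]:.4f}" for name in rankings[m])
    print(f"{m:8s} {shown}")
```

Under these assumptions all three measures give the same ordering: D3 first, then D2, then D1 (which shares no weighted term with the query and scores 0). The choice of log base changes the inner-product values but not the cosine or Jaccard values, and it does not change the ranking.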
