Question: Task 2: Similarity Measurement This task is about the difference between using euclidean distance or cosine similarity as the similarity measurement between two document vectors

Task 2: Similarity Measurement

This task is about the difference between using Euclidean distance and cosine similarity as the similarity measure between two document vectors in a vector space model. The Euclidean distance between two vectors x = (x_1, ..., x_n) and y = (y_1, ..., y_n) is defined as

\[ d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \]

The cosine similarity between the same vectors is defined as

\[ \cos(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2} \, \sqrt{\sum_{i=1}^{n} y_i^2}} \]

Explain why it is almost always a bad choice to use Euclidean distance for estimating the similarity between two document vectors in a vector space model with tf-idf weights. Explain why cosine similarity is a better choice and how it alleviates the problem with Euclidean distance.
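A small numeric sketch may help make the contrast concrete. It is not the expert answer from this page, only an illustration under stated assumptions: the three vectors are made-up weights over a hypothetical three-term vocabulary (not real tf-idf values), and the function names are chosen here for illustration. The idea is that a long document and a short document on the same topic have the same term proportions but very different vector magnitudes, so Euclidean distance can rate them as far apart, while cosine similarity depends only on the angle between the vectors.

```python
import math


def euclidean_distance(x, y):
    """Square root of the summed squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_similarity(x, y):
    """Dot product divided by the product of the two vector norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical weights, invented for this illustration only.
doc_short = [1.0, 2.0, 0.0]    # short document on topic A
doc_long = [10.0, 20.0, 0.0]   # same term proportions, ten times the weight
doc_other = [0.0, 1.0, 2.0]    # different topic, similar magnitude to doc_short

d_same_topic = euclidean_distance(doc_short, doc_long)
d_other_topic = euclidean_distance(doc_short, doc_other)
c_same_topic = cosine_similarity(doc_short, doc_long)
c_other_topic = cosine_similarity(doc_short, doc_other)

print("Euclidean, same topic: ", d_same_topic)
print("Euclidean, other topic:", d_other_topic)
print("Cosine, same topic:    ", c_same_topic)
print("Cosine, other topic:   ", c_other_topic)
```

In this toy setup, Euclidean distance places the same-topic pair farther apart than the different-topic pair purely because of the difference in magnitude, whereas cosine similarity ranks the same-topic pair as more similar. Dividing by the norms is what removes the dependence on document length.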
