Which of the following distance measures is commonly applied to a frequency-document matrix and why?

Question:

Which of the following distance measures is commonly applied to a frequency-document matrix and why?

a. Jaccard distance, because text is expressed as binary variables in a frequency-document matrix.

b. Manhattan distance, because text is expressed quantitatively in a frequency-document matrix and it avoids the effect of outliers.

c. Euclidean distance, because text is expressed quantitatively in a frequency-document matrix and outliers have been filtered out by upper and lower limits on term frequencies.

d. Cosine distance, because it identifies similarity in term usage patterns instead of the magnitudes in term frequency measures.
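
To see how the four measures in the options behave differently, here is a minimal sketch in plain Python. The three term-frequency vectors are invented for illustration (they are not from the textbook), and the function names are hypothetical helpers rather than any library's API. `long_doc` uses the same terms as `short_doc` in the same proportions but is three times as long, while `other_doc` uses mostly different terms.

```python
import math

def manhattan(u, v):
    # Sum of absolute differences in term counts.
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    # Straight-line distance between the two count vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    # 1 minus the cosine of the angle between the vectors;
    # depends on direction (usage pattern), not on vector length.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def jaccard_distance(u, v):
    # Treats each term as present/absent, discarding the counts.
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        return 0.0
    return 1.0 - len(present_u & present_v) / len(union)

# Hypothetical term counts over a four-term vocabulary (assumed data).
short_doc = [2, 1, 0, 1]
long_doc = [6, 3, 0, 3]   # same proportions as short_doc, three times longer
other_doc = [0, 1, 4, 0]  # mostly different terms

for name, dist in [("Manhattan", manhattan), ("Euclidean", euclidean),
                   ("Cosine", cosine_distance), ("Jaccard", jaccard_distance)]:
    print(f"{name:10s} short-vs-long: {dist(short_doc, long_doc):.3f}   "
          f"short-vs-other: {dist(short_doc, other_doc):.3f}")
```

On this toy data, Manhattan and Euclidean distance both place `short_doc` closer to `other_doc` than to `long_doc`, because they respond to the difference in document length. Cosine distance between `short_doc` and `long_doc` is essentially zero, since the two vectors point in the same direction. Jaccard distance also rates them identical, but only by throwing away the frequency information.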


Related Book:

Business Analytics

ISBN: 9780357902219

5th Edition

Authors: Jeffrey D. Camm, James J. Cochran, Michael J. Fry, Jeffrey W. Ohlmann
