Question: Q.5. Locality sensitive hashing (LSH) focuses on pairs of signatures likely to be from similar documents. It partitions the signature

Q.5. Locality sensitive hashing (LSH) focuses on pairs of signatures likely to be from similar
documents. It partitions the signature matrix M into b bands of r rows each. Suppose M has 100
rows and 100000 columns and b=20 bands and the goal is to find pairs of documents that are
at least 80% similar. Answer the following: [2 × 2.5 = 5M]
a. If columns 1 and 3 in M are 85% similar, then what is the probability that we may miss
this pair as a candidate pair? [10M]
b. If columns 1 and 6 in M are 60% similar, then what is the probability that they end up as
a candidate pair?
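
A minimal sketch of how both parts could be computed, assuming the standard banding analysis: two columns with similarity s agree on any single signature row with probability s, the bands are independent, and a pair becomes a candidate if it agrees on all r rows of at least one band. With 100 rows and b = 20 bands, r = 100 / 20 = 5. Under those assumptions the probability of missing a pair is (1 - s^r)^b, and the probability of becoming a candidate is 1 - (1 - s^r)^b. The function names below are illustrative and do not come from the question.

```python
def miss_probability(s: float, r: int, b: int) -> float:
    """Probability that a pair with similarity s matches in none of the b bands.

    Assumes each of the r rows in a band agrees independently with probability s.
    """
    band_match = s ** r               # all r rows of one band agree
    return (1.0 - band_match) ** b    # no band agrees in all of its rows


def candidate_probability(s: float, r: int, b: int) -> float:
    """Probability that a pair with similarity s matches in at least one band."""
    return 1.0 - miss_probability(s, r, b)


ROWS = 100
BANDS = 20
ROWS_PER_BAND = ROWS // BANDS  # r = 5, derived from the question's numbers

# Part (a): columns 1 and 3 are 85% similar; chance they are missed.
miss_85 = miss_probability(0.85, ROWS_PER_BAND, BANDS)

# Part (b): columns 1 and 6 are 60% similar; chance they become a candidate pair.
hit_60 = candidate_probability(0.60, ROWS_PER_BAND, BANDS)

print(f"(a) P(miss | s = 0.85)      = {miss_85:.3e}")
print(f"(b) P(candidate | s = 0.60) = {hit_60:.4f}")
```

Under these assumptions, part (a) comes out to roughly 8 in a million (about 8.06e-6), and part (b) to roughly 0.80, so a pair that is only 60% similar still has a high chance of being flagged with this choice of b and r.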