Question: In the Document Distance problem from the first two lectures, we compared two documents by counting the words in each, treating these counts as vectors, and computing the angle between the two vectors. For this problem, we will change the Document Distance code to use a new metric. Now we only care about words that show up in both documents, and we ignore the contributions of words that show up in only one document.

Modify inner_product to take a third argument, domain, which will be a set containing the words that appear in both texts. Modify the code so that it only increases sum if the word is in domain. Don't forget to change the documentation string at the top.

Modify vector_angle so that it creates sets of the words in L1 and in L2, takes their intersection, and uses that intersection when calling inner_product. Again, don't forget to change the docstring at the top.
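import math

def inner_product(L1,L2):
    """
    Inner product between two vectors, where vectors
    are represented as alphabetically sorted (word,freq) pairs.

    Example: inner_product([["and",3],["of",2],["the",5]],
                           [["and",4],["in",1],["of",1],["this",2]]) = 14.0
    """
    sum = 0.0
    i = 0
    j = 0
    # The source text is cut off after "while i"; the loop below is the usual
    # merge over two sorted lists, assumed from the docstring and the example.
    while i < len(L1) and j < len(L2):
        if L1[i][0] == L2[j][0]:
            # both vectors have this word
            sum += L1[i][1] * L2[j][1]
            i += 1
            j += 1
        elif L1[i][0] < L2[j][0]:
            # word L1[i][0] is in L1 but not in L2
            i += 1
        else:
            # word L2[j][0] is in L2 but not in L1
            j += 1
    return sum

def vector_angle(L1,L2):
    """
    The input is a list of (word,freq) pairs, sorted alphabetically.
    Return the angle between these two vectors.
    """
    numerator = inner_product(L1,L2)
    denominator = math.sqrt(inner_product(L1,L1)*inner_product(L2,L2))
    return math.acos(numerator/denominator)

docdist: https://ide.geeksforgeeks.org/tR2RiFqztD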
Step by Step Solution
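One possible way to make the requested changes is sketched below. It is not an official solution. It assumes the usual sorted-merge loop inside inner_product. The name total (used in place of sum, to avoid shadowing the built-in), the clamp before math.acos, and the ValueError for documents with no shared words are illustrative choices that the problem statement does not ask for.

```python
import math

def inner_product(L1, L2, domain):
    """
    Inner product between two vectors, restricted to the words in domain.
    Vectors are alphabetically sorted (word,freq) pairs; domain is a set
    of words. A word contributes to the result only if it is in domain.
    """
    total = 0.0
    i = 0
    j = 0
    while i < len(L1) and j < len(L2):
        if L1[i][0] == L2[j][0]:
            # both vectors have this word; count it only if it is in domain
            if L1[i][0] in domain:
                total += L1[i][1] * L2[j][1]
            i += 1
            j += 1
        elif L1[i][0] < L2[j][0]:
            i += 1
        else:
            j += 1
    return total

def vector_angle(L1, L2):
    """
    The input is two lists of (word,freq) pairs, sorted alphabetically.
    Return the angle between the two vectors, counting only the words
    that appear in both lists.
    """
    words1 = {pair[0] for pair in L1}
    words2 = {pair[0] for pair in L2}
    domain = words1 & words2  # words present in both documents
    numerator = inner_product(L1, L2, domain)
    denominator = math.sqrt(inner_product(L1, L1, domain) *
                            inner_product(L2, L2, domain))
    if denominator == 0.0:
        # assumption: treat "no shared words" as an error rather than an angle
        raise ValueError("the documents share no words")
    # clamp to guard against rounding pushing the ratio slightly above 1
    return math.acos(min(1.0, numerator / denominator))
```

With the example vectors from the original docstring, the shared words are "and" and "of". The restricted inner products are then 14.0 for L1 with L2, 13.0 for L1 with itself, and 17.0 for L2 with itself, so vector_angle returns acos(14 / sqrt(13 * 17)).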
