Question: Exercise C. Show how MapReduce can be used to efficiently solve the following problem: given a collection of input documents, output all bigrams with pointwise mutual information greater than a constant T. The pointwise mutual information of two words a, b is computed as P(a,b)/(P(a)P(b)), where P(a,b) is the probability that they appear together as the bigram a,b. Write pseudocode for the map and reduce functions. How would you use a Combiner to optimize your program?
Step by Step Solution
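One possible approach, sketched here as a local Python simulation rather than a definitive answer: a single counting job emits a 1 for every word and every adjacent word pair, a combiner pre-sums those 1s on each mapper (safe because addition is associative and commutative), and the reducer produces global totals. A second pass then scores each bigram with the ratio from the question and keeps those above T. The names `map_counts`, `combine_counts`, `run_job`, and `high_pmi_bigrams` are hypothetical, tokenization is plain whitespace splitting, and the sketch assumes P(a) is the word count divided by the total number of words and P(a,b) is the bigram count divided by the total number of bigrams. It also assumes the unigram totals are small enough to be shared with the scoring step as side data; a real cluster job might need a join instead.

```python
from collections import defaultdict


def map_counts(doc):
    """Map: emit a 1 for every word and every adjacent word pair in one document."""
    words = doc.lower().split()  # assumption: whitespace tokenization
    for w in words:
        yield ("uni", w), 1
        yield ("uni_total",), 1
    for a, b in zip(words, words[1:]):
        yield ("bi", a, b), 1
        yield ("bi_total",), 1


def combine_counts(key, values):
    """Combiner: pre-sum counts on the mapper side to shrink shuffle traffic."""
    yield key, sum(values)


# Summation is associative and commutative, so the reducer can reuse the combiner.
reduce_counts = combine_counts


def run_job(inputs, mapper, combiner, reducer):
    """Hypothetical single-process stand-in for a MapReduce runtime."""
    shuffled = defaultdict(list)
    for item in inputs:  # each input document plays the role of one map task
        local = defaultdict(list)
        for key, value in mapper(item):
            local[key].append(value)
        for key, values in local.items():
            for ckey, cvalue in combiner(key, values):
                shuffled[ckey].append(cvalue)
    output = {}
    for key, values in shuffled.items():
        for rkey, rvalue in reducer(key, values):
            output[rkey] = rvalue
    return output


def high_pmi_bigrams(docs, threshold):
    """Score each bigram as P(a,b) / (P(a) * P(b)) and keep those above threshold."""
    counts = run_job(docs, map_counts, combine_counts, reduce_counts)
    n_uni = counts.get(("uni_total",), 0)
    n_bi = counts.get(("bi_total",), 0)
    result = {}
    for key, count in counts.items():
        if key[0] != "bi":
            continue
        _, a, b = key
        p_ab = count / n_bi
        p_a = counts[("uni", a)] / n_uni
        p_b = counts[("uni", b)] / n_uni
        score = p_ab / (p_a * p_b)
        if score > threshold:
            result[(a, b)] = score
    return result
```

Calling `high_pmi_bigrams(["a b a b", "c c"], 3.0)` on this toy input keeps only the bigram `("a", "b")`, whose score is about 4.5. Without the combiner, every single occurrence would cross the network as its own key-value pair; with it, each map task sends at most one partial sum per distinct key.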
