Question: Minimal MapReduce (10)
Suppose we wish to count the number of characters in text files on a system of three nodes,
Alpha, Beta, and Gamma, and we wish to use the MapReduce paradigm. For the sake of this
example, we will assume the following distribution of the data across the nodes:
Node    File Contents
Alpha   abcdabcdabcd
Beta    axyaxyaxy
Gamma   adyadyady
Write a pseudocode implementation of a Map and a Reduce function to calculate the character count
(see Dean et al. for inspiration).
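One possible reading of this task is a per-character frequency count, which is what the key/value format in the later parts suggests. The sketch below assumes that reading. The names `map_fn`, `reduce_fn`, and `run_job` are illustrative, not part of any framework, and the in-memory grouping only stands in for the shuffle that a real system would perform across nodes.

```python
from collections import defaultdict

# Data distribution taken from the question.
FILES = {"Alpha": "abcdabcdabcd", "Beta": "axyaxyaxy", "Gamma": "adyadyady"}

def map_fn(node, contents):
    """Emit one (character, 1) pair per character in the file."""
    for ch in contents:
        yield ch, 1

def reduce_fn(key, values):
    """Sum all the counts emitted for one character."""
    return key, sum(values)

def run_job(files):
    # Stand-in for the shuffle: group every emitted value by its key.
    grouped = defaultdict(list)
    for node, contents in files.items():
        for key, value in map_fn(node, contents):
            grouped[key].append(value)
    # One reduce call per distinct key.
    return dict(reduce_fn(k, v) for k, v in sorted(grouped.items()))

counts = run_job(FILES)
print(counts)
```

If the intended answer is instead a single total, summing the values of `counts` gives the overall number of characters.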
Provide the output of the map tasks and show how the output is distributed over the nodes, e.g.
Node: key value, key value; Node: key value.
Suppose you have two reducer processes, say R1 and R2, that will execute your custom reducer
code. Which data will be processed at each reducer? E.g. R1: key value; R2: key value, key
value.
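How keys are split across the two reducers depends on the partitioning function, which the question does not specify. As a sketch, the code below assumes the reducers are named R1 and R2 and uses a hypothetical deterministic partitioner, `ord(key) % 2`, chosen only so the result is reproducible. Real frameworks typically hash the key, so the actual assignment may differ. The guarantee that matters is that all values for a given key arrive at the same reducer.

```python
from collections import defaultdict

# Data distribution taken from the question.
FILES = {"Alpha": "abcdabcdabcd", "Beta": "axyaxyaxy", "Gamma": "adyadyady"}
REDUCERS = ["R1", "R2"]  # assumed reducer names

def partition(key, num_reducers):
    # Hypothetical partitioner: the character code modulo the reducer count.
    return ord(key) % num_reducers

def shuffle(files):
    """Route every (character, 1) pair to the reducer that owns its key."""
    inputs = {name: defaultdict(list) for name in REDUCERS}
    for contents in files.values():
        for ch in contents:
            target = REDUCERS[partition(ch, len(REDUCERS))]
            inputs[target][ch].append(1)
    return inputs

reducer_inputs = shuffle(FILES)
for name, groups in reducer_inputs.items():
    print(name, {key: len(values) for key, values in sorted(groups.items())})
```

Under this assumed partitioner, R1 receives the keys b, d, and x, and R2 receives a, c, and y.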
During this entire process, which data are written to disk?
You wish to reduce network traffic between the mappers and the reducers using an additional stage.
What is this stage called, and how would it act on the data in this example? Demonstrate this by
modifying the key-value pairs as necessary.
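In MapReduce this kind of local pre-aggregation stage is usually called a combiner. The sketch below assumes a per-character count and a combiner that sums the values for each key on the mapper's own node before anything is sent over the network. The function names are illustrative only.

```python
from collections import Counter

# Data distribution taken from the question.
FILES = {"Alpha": "abcdabcdabcd", "Beta": "axyaxyaxy", "Gamma": "adyadyady"}

def map_fn(contents):
    """Emit one (character, 1) pair per character."""
    for ch in contents:
        yield ch, 1

def combine(pairs):
    """Pre-aggregate one mapper's output locally, before the shuffle."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return sorted(local.items())

# Pairs that would cross the network without and with the combiner.
raw_pairs = sum(len(list(map_fn(c))) for c in FILES.values())
combined = {node: combine(map_fn(c)) for node, c in FILES.items()}
combined_pairs = sum(len(pairs) for pairs in combined.values())

for node, pairs in combined.items():
    print(node, pairs)
print(raw_pairs, "pairs without combiner;", combined_pairs, "with combiner")
```

With the data in this example, each node sends one pair per distinct character it holds, such as (a, 3), instead of three separate (a, 1) pairs. The reducer logic is unchanged because summation is associative and commutative.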
