Question:
- Describe how you would solve the following problem with MapReduce.
Problem: The input is a big file that contains information about many houses. Each house is represented by one line in the file: (address, city, state, zip, value). The final output should be the average house value in each zip code.
- You should explain how the input is mapped into (key, value) pairs in the map stage, i.e., specify what the key and the associated value are in each pair and, if needed, how the key(s) and value(s) are obtained.
- You need to mention how the shuffle process is conducted.
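Shuffle: the framework groups the (zip, value) pairs by key, so all values for the same zip code are sent to the same reducer.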
- You should also explain how the (key, value) pairs produced by the map stage are processed by the reduce stage to get the final answers.
Reduce: for each zip code, sum the house values and divide the sum by the number of values to get the average house value.
You may draw a figure with some simple examples (as the word count application in slides).
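As a simple worked example, the three stages can be sketched in plain Python on a single machine. This is only an illustration, not a real MapReduce job: it assumes each line is comma-separated in the order address, city, state, zip, value, and the function names and sample houses are made up for the example.

```python
import csv
from collections import defaultdict


def map_house(line):
    """Map: parse one house record and emit (zip, value)."""
    # Assumed layout: address, city, state, zip, value (comma-separated).
    address, city, state, zip_code, value = next(
        csv.reader([line], skipinitialspace=True)
    )
    return zip_code.strip(), float(value)


def shuffle(pairs):
    """Shuffle: group all values that share the same key (zip code)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_average(zip_code, values):
    """Reduce: sum the values for one zip code and divide by their count."""
    return zip_code, sum(values) / len(values)


def average_value_by_zip(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [map_house(line) for line in lines if line.strip()]
    groups = shuffle(pairs)
    return dict(reduce_average(z, v) for z, v in sorted(groups.items()))


# Hypothetical sample input, one house per line.
sample = [
    "12 Oak St, Springfield, IL, 62701, 100000",
    "9 Elm Ave, Springfield, IL, 62701, 200000",
    "4 Pine Rd, Austin, TX, 73301, 300000",
]

print(average_value_by_zip(sample))
# {'62701': 150000.0, '73301': 300000.0}
```

Here the map stage emits ('62701', 100000.0), ('62701', 200000.0), and ('73301', 300000.0); the shuffle groups them into '62701' -> [100000.0, 200000.0] and '73301' -> [300000.0]; the reduce stage then averages each list.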
