Question: - For the three questions, use the process shown on the previous slide. Prepare the answers to these questions from now. You will have these
- For the three questions, use the process shown on the previous slide. Prepare the answers to these questions from now. You will have these three exact/same questions in Quiz\#5: 1) In the context of "How Does MapReduce Work", start from the left of the process, a set of raw data arrives, it gets split into two subsets, each is sent to a separate computer for parallel processing (one subset at the top computer, and the other at the bottom computer). Next, look just after the large arrow at the top and just after the large arrow at the bottom, that's the first step for the Map Function. What did the Map Function accomplish/do first? (Hint: look at how the shapes are now organized/shown). - 2) Again, in the context of "How Does MapReduce Work", now, let us look at the second step of the Map Function, move to the right just before the Reduce Function. What did the Map Function accomplish/do next? (Hint: look at how the shapes are now organized/shown). - 3) Finally, in the context of "How Does MapReduce Work", now let us move all the way to the right for the Reduce Function. What did the Reduce Function accomplish/do finally? - MapReduce distributes the processing of very large multistructured data files across a large cluster of ordinary machines/processors - Goal - achieving high performance with "simple" computers - Developed and popularized by Google - Good at processing and analyzing large volumes of multistructured data in a timely manner - Example tasks: indexing the Web for search, graph analysis, text analysis, machine learning, ... Big Data Technologies--MapReduce (2 of 2) - How does MapReduce work? - For the three questions, use the process shown on the previous slide. Prepare the answers to these questions from now. You will have these three exact/same questions in Quiz\#5: 1) In the context of "How Does MapReduce Work", start from the left of the process, a set of raw data arrives, it gets split into two subsets, each is sent to a separate computer for parallel processing (one subset at the top computer, and the other at the bottom computer). Next, look just after the large arrow at the top and just after the large arrow at the bottom, that's the first step for the Map Function. What did the Map Function accomplish/do first? (Hint: look at how the shapes are now organized/shown). - 2) Again, in the context of "How Does MapReduce Work", now, let us look at the second step of the Map Function, move to the right just before the Reduce Function. What did the Map Function accomplish/do next? (Hint: look at how the shapes are now organized/shown). - 3) Finally, in the context of "How Does MapReduce Work", now let us move all the way to the right for the Reduce Function. What did the Reduce Function accomplish/do finally? - MapReduce distributes the processing of very large multistructured data files across a large cluster of ordinary machines/processors - Goal - achieving high performance with "simple" computers - Developed and popularized by Google - Good at processing and analyzing large volumes of multistructured data in a timely manner - Example tasks: indexing the Web for search, graph analysis, text analysis, machine learning, ... Big Data Technologies--MapReduce (2 of 2) - How does MapReduce work