Question:
Consider that you are given a 4 GB text file containing the annotated form of the Mahabharata. The Mahabharata is divided into 18 chapters, called parvas, and each parva has many shlokas. Different shlokas were given to different scholars for translation into English. Each shloka and its translation was entered into a web page that accepted data in the following format and stored it in a text file: Parva Number, Shloka Number, Translation. The records are therefore in random order in the file "Mahabharata.txt", which is stored on HDFS. Design a MapReduce program to sort all the shlokas and their translations into the right order, by parva and by shloka within each parva. You need not write the entire MapReduce code, but you need to identify the intermediate keys of the mapper and the corresponding values, and the keys at the reducer and the corresponding output values. Do you need anything else to make this work? How many mappers do you expect to start if you are using Hadoop v2?
Step by Step Solution
There are 3 steps involved.
To sort all the shlokas and their translations in the right order, based on both the Parva number and the shloka number within it, you can design a MapR...
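One possible design, sketched here as a small local Python simulation rather than actual Hadoop code, is to have the mapper emit the composite key (parva number, shloka number) with the translation as the value, let the shuffle sort on that key, and use a reducer that simply writes each record back out. The "anything else" in this sketch is a range partitioner on the parva number, so that the reducer output files are globally ordered when read in sequence (a single reducer would also work, at the cost of parallelism). The function names, the three-reducer default, and the assumption of a splittable plain-text file with the default Hadoop v2 block size of 128 MB are illustrative choices, not part of the original answer.

```python
import math
from itertools import groupby


def map_record(line):
    """Mapper: parse 'parva, shloka, translation' and emit
    the composite key (parva, shloka) with the translation as value."""
    parva, shloka, translation = line.split(",", 2)  # translation may contain commas
    return (int(parva), int(shloka)), translation.strip()


def partition(key, num_reducers, num_parvas=18):
    """Range partitioner (hypothetical): lower parvas go to lower-numbered
    reducers, so concatenating reducer outputs gives a global order."""
    parva, _ = key
    return (parva - 1) * num_reducers // num_parvas


def run_job(lines, num_reducers=3):
    """Simulate map, partition, shuffle/sort, and an identity-style reduce."""
    buckets = [[] for _ in range(num_reducers)]
    for line in lines:
        key, value = map_record(line)
        buckets[partition(key, num_reducers)].append((key, value))

    output = []
    for bucket in buckets:
        # Shuffle/sort phase: the framework sorts each reducer's input by key.
        bucket.sort(key=lambda kv: kv[0])
        # Reducer: key is (parva, shloka); output value is the translation.
        for (parva, shloka), group in groupby(bucket, key=lambda kv: kv[0]):
            for _, translation in group:
                output.append(f"{parva}, {shloka}, {translation}")
    return output


def expected_mappers(file_bytes, split_bytes=128 * 1024**2):
    """One mapper per input split; assumes split size equals the block size."""
    return math.ceil(file_bytes / split_bytes)
```

Under these assumptions, a 4 GB file divided into 128 MB splits gives 4096 / 128 = 32 splits, so about 32 mappers would be expected; a different configured block or split size would change that number.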
