Question:
Problem 4. Rob designs two algorithms for solving the Word Counting problem. The two algorithms are shown in the following table.
| Algorithm A | Algorithm B |
| --- | --- |
| `book = sc.textFile("/home/rob/data/peterpan.txt")` | `book = sc.textFile("/home/rob/data/peterpan.txt").persist()` |
| `book.count()` | `book.count()` |
| `book.first()` | `book.first()` |
| `wordCount = book.flatMap(lambda line: line.split(" ")) \` | `wordCount = book.flatMap(lambda line: line.split(" ")) \` |
| `.map(lambda word: (word, 1)) \` | `.map(lambda word: (word, 1)) \` |
| `.reduceByKey(lambda x, y: x + y)` | `.reduceByKey(lambda x, y: x + y)` |
| `wordCount.collect()` | `wordCount.collect()` |
The only difference between Algorithm A and B is that we add .persist() at the end of the first line in Algorithm B.
In Algorithm A, how many RDDs are there? For each RDD, state its type: is it a standard string RDD or a key-value pair RDD? Also explain the meaning of the elements in each RDD.
Answer:
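One way to reason about the question is to trace what each call in Algorithm A produces. Counting one RDD per `textFile` call or transformation, the chain yields four: `book` (a string RDD of lines), the `flatMap` result (a string RDD of words), the `map` result (a key-value pair RDD of `(word, 1)`), and `wordCount` after `reduceByKey` (a key-value pair RDD of `(word, total)`). `count()`, `first()`, and `collect()` are actions, so they return values to the driver rather than new RDDs. The sketch below is a plain-Python stand-in, not Spark itself: ordinary lists play the role of RDDs, and the two sample lines and the names `lines`, `words`, `pairs`, and `word_count` are hypothetical, chosen only to show the element type at each stage.

```python
from collections import defaultdict

# Hypothetical sample input standing in for the lines of peterpan.txt.
lines = ["all children grow up", "all children except one"]

# Stage 1, like sc.textFile(...): string "RDD", one element per line of text.
book = list(lines)

# Stage 2, like flatMap(lambda line: line.split(" ")):
# string "RDD", one element per word.
words = [word for line in book for word in line.split(" ")]

# Stage 3, like map(lambda word: (word, 1)):
# key-value pair "RDD", key = word, value = the constant 1.
pairs = [(word, 1) for word in words]

# Stage 4, like reduceByKey(lambda x, y: x + y):
# key-value pair "RDD", key = word, value = total occurrences of that word.
totals = defaultdict(int)
for word, one in pairs:
    totals[word] += one
word_count = sorted(totals.items())

print(len(book), "lines,", len(words), "words")
print(word_count)
```

In real Spark the elements are partitioned across workers and the pairs come back from `collect()` in no guaranteed order; the sort here only makes the printed result stable.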
