Question: Problem 4. Rob designs two algorithms for solving the Word Counting problem.


Problem 4. Rob designs two algorithms for solving the Word Counting problem. The two algorithms are shown below.

Algorithm A:

    book = sc.textFile("/home/rob/data/peterpan.txt")
    book.count()
    book.first()
    wordCount = book.flatMap(lambda line: line.split(" ")) \
        .map(lambda word: (word, 1)) \
        .reduceByKey(lambda x, y: x + y)
    wordCount.collect()

Algorithm B:

    book = sc.textFile("/home/rob/data/peterpan.txt").persist()
    book.count()
    book.first()
    wordCount = book.flatMap(lambda line: line.split(" ")) \
        .map(lambda word: (word, 1)) \
        .reduceByKey(lambda x, y: x + y)
    wordCount.collect()

The only difference between Algorithm A and Algorithm B is the ".persist()" added at the end of the first line in Algorithm B.

(a) Which one (Algorithm A or B) runs faster, and why?

(b) Instead of persist(), we can also use cache(). What is the difference between persist() and cache()?
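
One way to reason about both parts: Spark evaluates RDDs lazily, so each action (count, first, collect) re-runs the lineage back to textFile unless the RDD has been persisted. Algorithm B should therefore usually be the faster one, since the file is read once and later actions reuse the stored partitions. For RDDs, cache() is shorthand for persist() with the default storage level (MEMORY_ONLY), while persist() can also take an explicit storage level. The sketch below is a toy model in plain Python, not Spark itself: LazyDataset, read_lines, run and the sample lines are hypothetical names invented for illustration, and the model only counts how often the source is re-read.

```python
class LazyDataset:
    """Toy stand-in for an RDD: re-reads its source on every action unless persisted."""

    def __init__(self, load_fn):
        self._load_fn = load_fn   # hypothetical loader, stands in for reading the file
        self._persist = False
        self._cached = None
        self.loads = 0            # number of times the source was actually read

    def persist(self):
        self._persist = True
        return self

    def cache(self):
        # Modeled as persist() with default settings
        return self.persist()

    def _materialize(self):
        if self._cached is not None:
            return self._cached
        self.loads += 1
        data = self._load_fn()
        if self._persist:
            self._cached = data   # keep the result for later actions
        return data

    def count(self):
        return len(self._materialize())

    def first(self):
        return self._materialize()[0]

    def word_count(self):
        counts = {}
        for line in self._materialize():
            for word in line.split(" "):
                counts[word] = counts.get(word, 0) + 1
        return counts


def read_lines():
    # Made-up sample lines, not the contents of the real file
    return ["all children grow up", "except one"]


def run(dataset):
    # Same three actions as in the question: count, first, word count
    dataset.count()
    dataset.first()
    dataset.word_count()
    return dataset.loads


loads_a = run(LazyDataset(read_lines))            # Algorithm A: no persist
loads_b = run(LazyDataset(read_lines).persist())  # Algorithm B: persist
print(loads_a, loads_b)  # 3 1
```

In this model the unpersisted dataset reads its source three times and the persisted one reads it once. The real speedup in Spark depends on file size, available memory and cluster configuration, so it is worth measuring rather than assuming.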
