Question: Problem 4. Rob designs two algorithms for solving the Word Counting problem. The two algorithms are shown below.

Algorithm A

book = sc.textFile("/home/rob/data/peterpan.txt")
book.count()
book.first()
wordCount = book.flatMap(lambda line: line.split(" ")) \
    .map(lambda word: (word, 1)) \
    .reduceByKey(lambda x, y: x + y)
wordCount.collect()

Algorithm B

book = sc.textFile("/home/rob/data/peterpan.txt").persist()
book.count()
book.first()
wordCount = book.flatMap(lambda line: line.split(" ")) \
    .map(lambda word: (word, 1)) \
    .reduceByKey(lambda x, y: x + y)
wordCount.collect()

The only difference between Algorithm A and B is that we add .persist() at the end of the first line in Algorithm B.

In Algorithm A, how many RDDs are there? State the type of each RDD (standard string RDD or key-value pair RDD), and explain the meaning of the elements in each RDD.
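To make the element shapes easier to reason about, here is a minimal plain-Python sketch (not Spark) that mimics the flatMap, map, and reduceByKey steps on ordinary lists. The two sample lines are made up for illustration and are not taken from peterpan.txt, and the variable names words, pairs, and word_count are hypothetical. Real RDDs are distributed and lazily evaluated, which lists are not, so this only illustrates what one element looks like at each stage.

```python
from collections import defaultdict

# Hypothetical stand-in for the lines of the input file (not the real text).
lines = ["all children grow up", "all except one"]

# Analogue of flatMap(lambda line: line.split(" ")): one string element per word.
words = [word for line in lines for word in line.split(" ")]

# Analogue of map(lambda word: (word, 1)): one (word, 1) pair per occurrence.
pairs = [(word, 1) for word in words]

# Analogue of reduceByKey(lambda x, y: x + y): add up the values of equal keys.
totals = defaultdict(int)
for word, one in pairs:
    totals[word] += one

# Sorted only to make the printed output stable; Spark's collect() gives no
# such ordering guarantee.
word_count = sorted(totals.items())

print(words)
print(pairs)
print(word_count)
```

In this sketch, lines and words hold plain strings, while pairs and word_count hold (key, value) tuples. That mirrors the distinction the question draws between a standard string RDD and a key-value pair RDD.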
