Consider the following Spark code, which constructs a DataFrame.
a = [('Chris', 'Budweiser', 15), ('Chris', 'Becks', 5), ('Chris', 'Heineken', 2),
     ('Bob', 'Becks', 15), ('Bob', 'Budweiser', 10), ('Bob', 'Heineken', 2),
     ('Alice', 'Heineken', 8)]
rdd = sc.parallelize(a)
df = sqlContext.createDataFrame(rdd, ['drinker', 'beer', 'score'])
sqlContext.registerDataFrameAsTable(df, "drinkers")
How can we get the total score of each beer brand?
We want the following answer from the above example (your printout may look different; only the values matter):
beer='Becks', total score=20
beer='Budweiser', total score=25
beer='Heineken', total score=12
(Multiple Choice with Negative Scores for wrong Answers)
A. df.drop('drinker').groupByKey('beer').reduceByKey(add).collect()
B. df.drop('drinker').groupBy('beer').agg({'score': 'sum'}).collect()
C. df.drop('drinker').groupByKey('beer').map(lambda a, b: a+b).top()
D. df.filter('drinker').groupBy('beer').agg({'score': 'sum'}).collect()
E. df.filter('drinker').groupByKey('beer').reduceByKey(add).collect()
F. sqlContext.sql("SELECT beer, sum(score) from drinkers GROUP BY drinker").collect()
G. df.filter('drinker').groupByKey('beer').agg({'score': 'sum'}).collect()
H. df.drop('drinker').groupByKey('beer').map(lambda a, b: a+b).collect()
I. sqlContext.sql("SELECT beer, sum(score) from drinkers GROUP BY beer").collect()
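
To check the options against the expected totals, here is a minimal sketch. It first sums the scores per beer in plain Python as a reference, then shows the two forms from the list that group by beer and sum the score: the DataFrame groupBy('beer') with agg (as in option B) and the SQL query with GROUP BY beer (as in option I). The names rows, expected_totals and spark_totals are hypothetical, and the Spark part assumes a local PySpark installation, using SparkSession and createOrReplaceTempView as stand-ins for the question's sc and sqlContext calls.

```python
from collections import defaultdict

# Same rows as in the question: (drinker, beer, score)
rows = [
    ('Chris', 'Budweiser', 15), ('Chris', 'Becks', 5), ('Chris', 'Heineken', 2),
    ('Bob', 'Becks', 15), ('Bob', 'Budweiser', 10), ('Bob', 'Heineken', 2),
    ('Alice', 'Heineken', 8),
]


def expected_totals(data):
    """Plain-Python reference: sum the score per beer, ignoring the drinker."""
    totals = defaultdict(int)
    for _drinker, beer, score in data:
        totals[beer] += score
    return dict(totals)


def spark_totals(data):
    """Compute the same totals with the DataFrame API and with Spark SQL.

    Assumption: PySpark is installed locally. SparkSession and
    createOrReplaceTempView replace the question's sqlContext calls.
    """
    # Imported lazily so the plain-Python reference runs without Spark
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[1]")
             .appName("beer-totals")
             .getOrCreate())
    try:
        df = spark.createDataFrame(data, ['drinker', 'beer', 'score'])
        df.createOrReplaceTempView("drinkers")

        # DataFrame API: group by beer, then sum the score column
        api_rows = (df.drop('drinker')
                      .groupBy('beer')
                      .agg({'score': 'sum'})
                      .collect())

        # Spark SQL: the GROUP BY column matches the non-aggregated SELECT column
        sql_rows = spark.sql(
            "SELECT beer, sum(score) FROM drinkers GROUP BY beer"
        ).collect()

        # Row supports positional access: index 0 is beer, index 1 is the sum
        return ({r[0]: r[1] for r in api_rows},
                {r[0]: r[1] for r in sql_rows})
    finally:
        spark.stop()


print(expected_totals(rows))
# {'Budweiser': 25, 'Becks': 20, 'Heineken': 12}
```

Calling spark_totals(rows) on a machine with Spark available should return two dictionaries with the same totals as the reference. The remaining options differ in ways that matter: a PySpark DataFrame has no groupByKey method, filter('drinker') passes a string column where a boolean condition is expected, and selecting beer while grouping by drinker is rejected by Spark SQL.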