Question: Consider the following Spark code to construct a DataFrame.
# Each tuple is (drinker, beer, score)
a = [('Chris', 'Budweiser', 15), ('Chris', 'Becks', 5), ('Chris', 'Heineken', 2),
     ('Bob', 'Becks', 15), ('Bob', 'Budweiser', 10), ('Bob', 'Heineken', 2),
     ('Alice', 'Heineken', 8)]
rdd = sc.parallelize(a)
df = sqlContext.createDataFrame(rdd, ['drinker', 'beer', 'score'])
# Register the DataFrame as a table named "drinkers" so it can be queried with SQL
sqlContext.registerDataFrameAsTable(df, "drinkers")
How can we get the total score of each beer brand?
For the example above, we want the following answer (your printout may be formatted differently; the values are what matter):
beer='Becks', total score=20
beer='Budweiser', total score=25
beer='Heineken', total score=12
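
One way to approach this is to group the rows by beer and sum the scores. The sketch below is not an official answer. It assumes a recent PySpark where a SparkSession and createOrReplaceTempView stand in for the question's sc, sqlContext, and registerDataFrameAsTable; the function names totals_in_plain_python and totals_with_spark are hypothetical helpers introduced only for illustration. The plain-Python function is there as a cross-check of the expected values and runs without Spark.

```python
from collections import defaultdict

# Data from the question: (drinker, beer, score)
a = [('Chris', 'Budweiser', 15), ('Chris', 'Becks', 5), ('Chris', 'Heineken', 2),
     ('Bob', 'Becks', 15), ('Bob', 'Budweiser', 10), ('Bob', 'Heineken', 2),
     ('Alice', 'Heineken', 8)]


def totals_in_plain_python(rows):
    """Reference result without Spark: sum the score for each beer."""
    totals = defaultdict(int)
    for _drinker, beer, score in rows:
        totals[beer] += score
    return dict(totals)


def totals_with_spark(rows):
    """Same aggregation in Spark SQL: GROUP BY beer, SUM(score)."""
    # Imported here so the plain-Python check above works without pyspark installed.
    from pyspark.sql import SparkSession

    # Assumption: a local session replaces the shell-provided sc / sqlContext.
    spark = (SparkSession.builder
             .master("local[1]")
             .appName("beer-totals")
             .getOrCreate())
    df = spark.createDataFrame(rows, ['drinker', 'beer', 'score'])
    # Assumption: createOrReplaceTempView is used in place of registerDataFrameAsTable.
    df.createOrReplaceTempView("drinkers")
    result = spark.sql(
        "SELECT beer, SUM(score) AS total_score "
        "FROM drinkers GROUP BY beer ORDER BY beer"
    ).collect()
    spark.stop()
    return {row['beer']: row['total_score'] for row in result}


# Print the reference totals in the format shown in the question.
for beer, total in sorted(totals_in_plain_python(a).items()):
    print(f"beer='{beer}', total score={total}")
```

With the question's own setup, the same query can be passed to sqlContext.sql(...) against the registered "drinkers" table, and df.groupBy('beer').sum('score') is an equivalent DataFrame-API form. Calling totals_with_spark(a) requires a working PySpark installation and should agree with the plain-Python totals.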
