Question: Consider the following Spark code to construct a DataFrame.

    # Each row is (drinker, beer, score).
    a = [('Chris', 'Budweiser', 15), ('Chris', 'Becks', 5), ('Chris', 'Heineken', 2),
         ('Bob', 'Becks', 15), ('Bob', 'Budweiser', 10), ('Bob', 'Heineken', 2),
         ('Alice', 'Heineken', 8)]
    rdd = sc.parallelize(a)
    df = sqlContext.createDataFrame(rdd, ['drinker', 'beer', 'score'])
    sqlContext.registerDataFrameAsTable(df, "drinkers")

How can we get the total score of each beer brand?

We want to have the following answer from the above example (your printout may be different; the values are important):

    beer = 'Becks', total score = 20
    beer = 'Budweiser', total score = 25
    beer = 'Heineken', total score = 12
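The expected totals can be checked without a Spark installation. The following is a minimal plain-Python sketch, not Spark code: it groups the same tuples by beer and sums the scores, which is the computation the question asks for. The names `rows` and `total_score_per_beer` are illustrative and do not appear in the original question.

```python
from collections import defaultdict

# Same records as in the question: (drinker, beer, score).
rows = [
    ("Chris", "Budweiser", 15), ("Chris", "Becks", 5), ("Chris", "Heineken", 2),
    ("Bob", "Becks", 15), ("Bob", "Budweiser", 10), ("Bob", "Heineken", 2),
    ("Alice", "Heineken", 8),
]


def total_score_per_beer(records):
    """Sum the score column for each distinct beer (hypothetical helper)."""
    totals = defaultdict(int)
    for _drinker, beer, score in records:
        # The drinker column is ignored; only beer and score matter here.
        totals[beer] += score
    return dict(totals)


totals = total_score_per_beer(rows)
for beer in sorted(totals):
    print(f"beer = {beer!r}, total score = {totals[beer]}")
```

Any Spark expression that answers the question should produce these same three per-beer sums, whatever the row format of its printout.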

(Multiple choice with negative scores for wrong answers)

A. df.drop('drinker').groupByKey('beer').reduceByKey(add).collect()

B. df.drop('drinker').groupBy('beer').agg({'score': 'sum'}).collect()

C. df.drop('drinker').groupByKey('beer').map(lambda a, b: a + b).top()

D. df.filter('drinker').groupBy('beer').agg({'score': 'sum'}).collect()

E. df.filter('drinker').groupByKey('beer').reduceByKey(add).collect()

F. sqlContext.sql("SELECT beer, sum(score) from drinkers GROUP BY drinker").collect()

G. df.filter('drinker').groupByKey('beer').agg({'score': 'sum'}).collect()

H. df.drop('drinker').groupByKey('beer').map(lambda a, b: a + b).collect()

I. sqlContext.sql("SELECT beer, sum(score) from drinkers GROUP BY beer").collect()
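The two SQL options in the list above differ only in the GROUP BY column. As a rough illustration, the sketch below uses Python's built-in `sqlite3` module as a stand-in for Spark SQL. This is an assumption made so the example runs without Spark; SQLite and Spark SQL are different engines and differ in details such as how strictly they validate GROUP BY queries. It loads the same rows into a `drinkers` table and compares grouping by `beer` with grouping by `drinker`. The names `per_beer` and `per_drinker` are illustrative.

```python
import sqlite3

# Same records as in the question: (drinker, beer, score).
rows = [
    ("Chris", "Budweiser", 15), ("Chris", "Becks", 5), ("Chris", "Heineken", 2),
    ("Bob", "Becks", 15), ("Bob", "Budweiser", 10), ("Bob", "Heineken", 2),
    ("Alice", "Heineken", 8),
]

# In-memory SQLite table standing in for the registered "drinkers" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drinkers (drinker TEXT, beer TEXT, score INTEGER)")
conn.executemany("INSERT INTO drinkers VALUES (?, ?, ?)", rows)

# One total per beer brand.
per_beer = dict(
    conn.execute("SELECT beer, SUM(score) FROM drinkers GROUP BY beer").fetchall()
)

# One total per person, which is a different question.
per_drinker = dict(
    conn.execute("SELECT drinker, SUM(score) FROM drinkers GROUP BY drinker").fetchall()
)

conn.close()

print(per_beer)
print(per_drinker)
```

Grouping by `beer` gives one total per brand, matching the expected values in the question, while grouping by `drinker` gives one total per person instead.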
