Question: We should avoid using groupByKey because it ...
A. It always reads the data from HDFS and causes large data transfers.
B. It shuffles all the key-value pairs across the network, generating a lot of unnecessary data transfer. It may also cause memory problems, because grouping the values by key requires all the data associated with a single key to be collected on one worker node.
C. It causes a lot of communication with the master node, which is costly.
D. It generates many tiny jobs compared to other transformation operations.
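Option B matches how Spark's shuffle behaves for groupByKey. The following is a minimal sketch in plain Python, not Spark code, that imitates the difference between grouping and reducing by key. The function names `shuffle_group_by_key` and `shuffle_reduce_by_key` and the sample partitions are hypothetical and exist only for illustration; the record counts are a rough stand-in for shuffle volume, assuming each inner list is one partition.

```python
from collections import defaultdict


def shuffle_group_by_key(partitions):
    """Imitate groupByKey: every (key, value) pair crosses the shuffle."""
    shuffled = [pair for part in partitions for pair in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        # All values for one key end up held together in memory.
        grouped[key].append(value)
    return dict(grouped), len(shuffled)


def shuffle_reduce_by_key(partitions, func):
    """Imitate reduceByKey: combine inside each partition, then shuffle."""
    shuffled = []
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = func(local[key], value) if key in local else value
        # Only one partial result per key leaves each partition.
        shuffled.extend(local.items())
    merged = {}
    for key, value in shuffled:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged, len(shuffled)


# Hypothetical sample data: two partitions of (word, 1) pairs.
partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("a", 1), ("b", 1), ("b", 1)],
]

grouped, grouped_records = shuffle_group_by_key(partitions)
sums_from_groups = {key: sum(values) for key, values in grouped.items()}
sums, reduced_records = shuffle_reduce_by_key(partitions, lambda x, y: x + y)

print(grouped_records, reduced_records)  # prints "6 4"
```

Both paths produce the same per-key sums, but the grouping path moves all six records while the reducing path moves four partial results. In PySpark the corresponding calls would be `rdd.groupByKey().mapValues(sum)` and `rdd.reduceByKey(lambda x, y: x + y)`; the gap widens as the number of values per key grows.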
