Question: What needs to be added to the code to return a new RDD with all keys where the absolute difference between the key and its value is greater than 2?
rdd = sc.parallelize([(1,2),(2,4),(3,6),(4,1)])
# The result needs to be a list of keys. In this case, result = [3, 4]
A.
rdd1= rdd.map(lambda x: (x, abs(x[0]- x[1]))).filter(lambda x: x[1]>2).map(lambda x: x[0])
B.
rdd1= rdd.map(lambda x: (x[0]- x[1], x)).filter(lambda x: x[0]>2).map(lambda x: x[1])
C.
rdd1= rdd.flatMap(lambda x: [(x[0], i) for i in x[1]]).filter(lambda x: abs(x[0]- x[1])>2).map(lambda x: (x[0],[x[1]])).reduceByKey(lambda x, y: x + y).flatMap(lambda x: [(x[0], i) for i in x[1]])
D.
rdd1= rdd.map(lambda x: (x, x[0]- x[1])).filter(lambda x: x[1]>2).map(lambda x: x[0])
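For checking the options against the stated result, here is a minimal sketch of one way to get the keys. The helper names `abs_gap_exceeds` and `keys_with_large_gap` are hypothetical and not part of the question; the sketch relies only on the standard RDD methods `filter` and `keys`, and it assumes a live SparkContext `sc` for the commented-out part.

```python
# Sample data from the question.
data = [(1, 2), (2, 4), (3, 6), (4, 1)]


def abs_gap_exceeds(pair, threshold=2):
    """Return True when |key - value| is strictly greater than threshold."""
    key, value = pair
    return abs(key - value) > threshold


def keys_with_large_gap(rdd, threshold=2):
    """Hypothetical helper: keep matching (key, value) pairs, then return only the keys."""
    return rdd.filter(lambda kv: abs_gap_exceeds(kv, threshold)).keys()


# Plain-Python check of the same predicate, runnable without Spark.
expected = [key for key, value in data if abs_gap_exceeds((key, value))]
print(expected)  # [3, 4]

# With a live SparkContext `sc` (assumed, not created here):
# rdd = sc.parallelize(data)
# rdd1 = keys_with_large_gap(rdd)
# rdd1.collect()
```

Reading the options against this: only A applies `abs`, but as written its final `map(lambda x: x[0])` returns the original (key, value) pair rather than the bare key, so it would need `x[0][0]` to yield a list of keys. B and D omit `abs`, so the pair (3, 6) is dropped because 3 - 6 is negative. C iterates over `x[1]`, which is an integer here, so it would raise a TypeError.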
