Question:

from pyspark import SparkContext
sc = SparkContext(appName="RDDComparison")
rdd = sc.parallelize([(1,2),(2,4),(3,6),(4,1)])
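# Option A: pair each tuple with |x[0] - x[1]|, keep pairs where that value exceeds 2, then return the tuple.
rdd1_a = rdd.map(lambda x: (x, abs(x[0] - x[1]))).filter(lambda x: x[1] > 2).map(lambda x: x[0])
# Option B: pair the signed difference with each tuple, filter on its absolute value, then return the tuple.
rdd1_b = rdd.map(lambda x: (x[0] - x[1], x)).filter(lambda x: abs(x[0]) > 2).map(lambda x: x[1])
# Option C: iterates over x[1], which is an int here, so the job fails with "'int' object is not iterable" once an action such as collect() runs.
rdd1_c = rdd.flatMap(lambda x: [(x[0], i) for i in x[1]]).filter(lambda x: abs(x[0] - x[1]) > 2).map(lambda x: (x[0], [x[1]])).reduceByKey(lambda x, y: x + y).flatMap(lambda x: [(x[0], i) for i in x[1]])
# Option D: same shape as Option A but without abs(), so only positive differences above 2 pass the filter.
rdd1_d = rdd.map(lambda x: (x, x[0] - x[1])).filter(lambda x: x[1] > 2).map(lambda x: x[0])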
print("Option A Result:", rdd1_a.collect())
print("Option B Result:", rdd1_b.collect())
print("Option C Result:", rdd1_c.collect())
print("Option D Result:", rdd1_d.collect())
sc.stop()
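
To check what each option would return without starting Spark, the four pipelines can be mirrored with plain Python lists. This is only a sketch: the names option_a through option_d and option_c_error are made up for illustration, and list comprehensions stand in for the RDD map/filter/flatMap calls, so it reproduces the logic of each option but not Spark's lazy evaluation or partitioning.

```python
# Plain-Python mirror of the four pipelines (illustrative names, no Spark needed).
data = [(1, 2), (2, 4), (3, 6), (4, 1)]

# Option A: keep tuples whose absolute difference exceeds 2.
option_a = [t for t, d in ((x, abs(x[0] - x[1])) for x in data) if d > 2]

# Option B: same test, with the signed difference carried as the first element.
option_b = [t for d, t in ((x[0] - x[1], x) for x in data) if abs(d) > 2]

# Option D: no abs(), so only positive differences above 2 survive.
option_d = [t for t, d in ((x, x[0] - x[1]) for x in data) if d > 2]

# Option C: the first flatMap iterates over x[1], which is an int in this data.
try:
    option_c = [(x[0], i) for x in data for i in x[1]]
    option_c_error = None
except TypeError as exc:
    option_c = None
    option_c_error = str(exc)

print("Option A:", option_a)  # [(3, 6), (4, 1)]
print("Option B:", option_b)  # [(3, 6), (4, 1)]
print("Option C:", option_c if option_c_error is None else "failed: " + option_c_error)
print("Option D:", option_d)  # [(4, 1)]
```

In this sketch, Options A and B agree, Option D drops (3, 6) because 3 - 6 is negative, and Option C fails before any filtering happens. In the original script the failure would surface at rdd1_c.collect(), so the script would stop there and the Option D line would not be printed.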
