Implement, run, and time the following query using Hadoop Streaming with Python:

    SELECT lo_quantity, MIN(lo_revenue)
    FROM (SELECT lo_revenue, MAX(lo_quantity) AS lo_quantity, MAX(lo_discount) AS lo_discount
          FROM lineorder
          WHERE lo_orderpriority LIKE '%URGENT'
          GROUP BY lo_revenue)
    WHERE lo_discount BETWEEN 6 AND 8
    GROUP BY lo_quantity;

This requires running two separate MapReduce jobs. First, write a job that executes the subquery and writes its output to HDFS. Then write a second job that uses the output of the first job as its input.
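One possible shape for the two jobs is sketched below. It is not a verified solution. It assumes the lineorder file is pipe-delimited in the Star Schema Benchmark column order (lo_orderpriority at position 6, lo_quantity at 8, lo_discount at 11, lo_revenue at 12, counting from 0) and that those numeric columns are integers. The script name ssb_query.py and the stage names map1, reduce1, map2, reduce2 are hypothetical. Adjust the column positions if your file differs.

```python
#!/usr/bin/env python3
"""Two-stage Hadoop Streaming sketch; invoke as: python3 ssb_query.py <stage>."""
import sys
from itertools import groupby

# ASSUMPTION: pipe-delimited SSB lineorder layout (0-based column positions).
ORDERPRIORITY, QUANTITY, DISCOUNT, REVENUE = 6, 8, 11, 12


def _groups(lines):
    """Group key-sorted, tab-separated lines by their first field."""
    rows = (line.rstrip("\n").split("\t") for line in lines if line.strip())
    return groupby(rows, key=lambda r: r[0])


def map1(lines):
    """Job 1 map: WHERE lo_orderpriority LIKE '%URGENT'; key = lo_revenue."""
    for line in lines:
        f = line.rstrip("\n").split("|")
        if len(f) > REVENUE and f[ORDERPRIORITY].strip().endswith("URGENT"):
            yield f"{f[REVENUE]}\t{f[QUANTITY]}\t{f[DISCOUNT]}"


def reduce1(lines):
    """Job 1 reduce: GROUP BY lo_revenue; MAX(lo_quantity), MAX(lo_discount)."""
    for revenue, rows in _groups(lines):
        max_q = max_d = None
        for _, q, d in rows:
            q, d = int(q), int(d)
            max_q = q if max_q is None else max(max_q, q)
            max_d = d if max_d is None else max(max_d, d)
        yield f"{revenue}\t{max_q}\t{max_d}"


def map2(lines):
    """Job 2 map: WHERE lo_discount BETWEEN 6 AND 8; key = lo_quantity."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        revenue, quantity, discount = parts
        if 6 <= int(discount) <= 8:
            yield f"{quantity}\t{revenue}"


def reduce2(lines):
    """Job 2 reduce: GROUP BY lo_quantity; MIN(lo_revenue)."""
    for quantity, rows in _groups(lines):
        yield f"{quantity}\t{min(int(rev) for _, rev in rows)}"


STAGES = {"map1": map1, "reduce1": reduce1, "map2": map2, "reduce2": reduce2}

if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1] in STAGES:
    # Hadoop Streaming feeds records on stdin and collects stdout.
    for out in STAGES[sys.argv[1]](sys.stdin):
        print(out)
```

Each job would be launched through the Hadoop Streaming jar that ships with your installation, using its -input, -output, -mapper, and -reducer options: job 1 with the map1 and reduce1 stages, then job 2 with map2 and reduce2 and with job 1's HDFS output directory as its -input. Prefixing each launch with the shell's time command gives a wall-clock measurement. The reducers rely only on equal keys arriving adjacently, which the shuffle sort guarantees, so the default text ordering of the numeric keys does not affect the result.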
