Question: Part 2: A simple Spark application In this part, you will implement a simple Spark application. We have provided some sample data collected by IOT

Part 2: A simple Spark application

In this part, you will implement a simple Spark application. We have provided some sample data collected by IOT devices, screenshots are below. You need to sort the data firstly by the country code alphabetically (the third column) then by the timestamp (the last column). Here is an example:

Input:

cca2 device_id timestamp
US 1 1
IN 2 2
US 3 2
CN 4 4
US 5 3
IN 6 1

Output:

cca2 device_id timestamp
CN 4 4
IN 6 1
IN 2 2
US 1 1
US 3 2
US 5 3

You should first load the data into HDFS.

Part 2: A simple Spark application In this part, you will implement

Then, write a Spark program in Java/Python/Scala to sort the data. Finally, output the results into HDFS in form of csv. Your program will take in two arguments, the first a path to the input file and the second the path to the output file. Note that if two data tuples have the same country code and timestamp, the order of them does not matter.

a simple Spark application. We have provided some sample data collected by

IOT devices, screenshots are below. You need to sort the data firstly

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!