Question: In R programming

1. **Big data tools:** The Hadoop Distributed File System (HDFS) allows us to manipulate massive amounts of data using scalable computing power. Please answer the questions below based on HDFS. You don't have to show the results, just explain.

a. Explain what the following commands do.

```{r, eval = FALSE}
hadoop fs -mkdir wordcount/input
hadoop fs -put myFile.txt myHdfs/test.dat
```

b. Explain what the following `Pig` commands will do.

```{r, eval = FALSE}
dat = LOAD 'myHdfs/test.dat';
d = LIMIT dat 10;
DUMP d;
```

c. Write down two differences between `Pig` and `Hive`. Which code will run faster?

d. If a data manipulation process takes 10 days to complete, what can you do to finish it in one day?
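For readers without access to a cluster, the commands in parts a and b can be mirrored on a local filesystem. The sketch below is only an analogy, not HDFS or Pig itself: the scratch directory, the 25-line sample file, and the output name `first10.txt` are hypothetical, and plain `mkdir`, `cp`, and `head` stand in for `hadoop fs -mkdir`, `hadoop fs -put`, and Pig's `LOAD`/`LIMIT`/`DUMP`.

```shell
#!/bin/sh
# Local-filesystem analogy only: real HDFS commands talk to a cluster, these do not.
set -eu

# Hypothetical scratch area standing in for the user's HDFS home directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample input: 25 numbered lines standing in for myFile.txt.
i=1
while [ "$i" -le 25 ]; do
    echo "line $i"
    i=$((i + 1))
done > myFile.txt

# Roughly `hadoop fs -mkdir wordcount/input`: create a directory.
# -p is used here so the parent directory is created as well.
mkdir -p wordcount/input

# Roughly `hadoop fs -put myFile.txt myHdfs/test.dat`: copy a local file
# to a destination path, storing it under the new name test.dat.
mkdir -p myHdfs
cp myFile.txt myHdfs/test.dat

# Roughly LOAD + LIMIT 10 + DUMP: read the file, keep 10 records, print them.
# Pig's LIMIT makes no promise about which 10 records are returned unless the
# data was ordered first; head always takes the first 10 lines.
head -n 10 myHdfs/test.dat > first10.txt
cat first10.txt
```

The main difference from the real commands is where the data lives: `hadoop fs` paths without a leading `/` resolve against the user's HDFS home directory, and the file is stored as replicated blocks across the cluster rather than as a single local copy.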
