Question (in R programming):
1. **Big data tools:** The Hadoop Distributed File System (HDFS) allows us to manipulate massive amounts of data using scalable computing power. Please answer the questions below based on HDFS. You don't have to show the results, just explain.
a. Explain what the following commands do.
```{r, eval = FALSE}
hadoop fs -mkdir wordcount/input
hadoop fs -put myFile.txt myHdfs/test.dat
```
b. Explain what the following `Pig` commands will do.
```{r, eval = FALSE}
dat = LOAD 'myHdfs/test.dat';
d = LIMIT dat 10;
DUMP d;
```
c. Write down two differences between `Pig` and `Hive`. Which code will run faster?
d. If a data manipulation process takes 10 days to complete, what can you do to finish it in one day?
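
As a hint for part d, one common direction is to split the data into independent partitions and process them in parallel, which is the idea behind MapReduce jobs on a Hadoop cluster. The sketch below is only a single-machine analogy in Python, not Hadoop code: the function names (`split_into_chunks`, `count_words`, `merge_counts`, `parallel_word_count`) and the word-count task are assumptions chosen for illustration. With 10 workers the ideal speedup is about 10x, though in practice scheduling and data-transfer overhead usually make it smaller.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(lines, n_chunks):
    """Split a list of lines into at most n_chunks contiguous parts."""
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def count_words(chunk):
    """Map step: count the words in one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def merge_counts(partials):
    """Reduce step: add the per-chunk counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def parallel_word_count(lines, n_workers=10):
    """Count words with n_workers processes, one chunk per worker."""
    chunks = split_into_chunks(lines, n_workers)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return merge_counts(partials)


if __name__ == "__main__":
    # Hypothetical sample data, used only to exercise the functions.
    sample = ["big data tools", "big data", "hadoop pig hive"] * 1000
    print(parallel_word_count(sample).most_common(3))
```

The approach only works this cleanly because each chunk can be processed without looking at any other chunk. A job with strong dependencies between records would need to be restructured before adding workers could help.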
