QUESTION 2

(a) Imagine you have a client application that is reading data from a DataNode in an HDFS cluster, and the data arrives corrupted. In such a scenario, discuss the mechanism that HDFS implements to ensure data integrity. (7 marks)

(b) Consider the following scenario: in a Hadoop cluster, we want to store four files of size 110 KB, 64.2 MB, 200 MB, and 109 MB. The HDFS block size is set to 64 MB. How many input splits (number of blocks) in total will the Hadoop framework make to store all four files? Also list the number of splits (and the size of each block) for each file. (10 marks)

(c) Hadoop's HDFS allows you to reconfigure the replication factor for data files that are already stored. Explain, with the help of examples, what happens if you increase or decrease the replication factor from its previously configured value. (8 marks)

Total (25 marks)
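
The arithmetic in part (b) can be checked with a short sketch. It is not an official solution. It assumes that 1 MB = 1024 KB, that the question's "input splits" means the same as HDFS blocks, and that the last block of a file holds only the leftover bytes rather than a full 64 MB. The helper name `block_layout` is made up for this illustration.

```python
import math

BLOCK_MB = 64  # HDFS block size stated in the question

# File sizes from the question, expressed in MB (assumes 1 MB = 1024 KB)
files_mb = {
    "110 KB": 110 / 1024,
    "64.2 MB": 64.2,
    "200 MB": 200,
    "109 MB": 109,
}

def block_layout(size_mb, block_mb=BLOCK_MB):
    """Return (number_of_blocks, size_of_last_block_in_MB) for one file."""
    n = max(1, math.ceil(size_mb / block_mb))
    # Every block except the last is full; the last holds the remainder.
    last = size_mb - (n - 1) * block_mb
    return n, last

layouts = {name: block_layout(size) for name, size in files_mb.items()}
total_blocks = sum(n for n, _ in layouts.values())

for name, (n, last) in layouts.items():
    print(f"{name}: {n} block(s), last block holds {last:.2f} MB")
print("total blocks:", total_blocks)
```

Under these assumptions the 110 KB file takes 1 block, the 64.2 MB file takes 2 (64 MB + 0.2 MB), the 200 MB file takes 4 (3 × 64 MB + 8 MB), and the 109 MB file takes 2 (64 MB + 45 MB), giving 9 blocks in total.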
