Question:
Subject - Computer Science (Big Data Engineering)
Note: Assignment due date is 30-Jan. An early response is appreciated. Will upvote all answers.

The available disk space per node is 12 TB (12 disks of 1 TB each; the 2 disks used for the operating system etc. are excluded). Assume a disk space utilization of 65% and a compression ratio of 2.

a. What important factors will you take into consideration for storing the data in HDFS?
b. Assume the initial data size is 500 TB. How will you estimate the number of data nodes?
c. The business has predicted a 10% data increase per quarter. How would you predict the number of new machines to be added in the next year?

[2 + 5 + 3 = 10]
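
One possible way to set up the arithmetic for parts b and c is sketched below in Python. It is not an official solution. The question does not state a replication factor, so the sketch assumes the HDFS default of 3. It also assumes that the 10% quarterly growth compounds over four quarters and that a compression ratio of 2 halves the stored size. The function name `data_nodes_needed` and its parameters are made up for illustration.

```python
import math

def data_nodes_needed(data_tb, disk_per_node_tb=12.0, utilization=0.65,
                      compression_ratio=2.0, replication=3):
    """Estimate data nodes for a given uncompressed data size in TB.

    replication=3 is an assumption (the HDFS default); the question
    does not specify it.
    """
    # Space per node that can actually hold data: 12 TB * 0.65 = 7.8 TB
    usable_per_node_tb = disk_per_node_tb * utilization
    # Compress first, then account for every replica HDFS keeps
    stored_tb = data_tb / compression_ratio * replication
    # A partial node is not possible, so round up
    return math.ceil(stored_tb / usable_per_node_tb)

# Part b: initial data size of 500 TB
initial_tb = 500.0
nodes_now = data_nodes_needed(initial_tb)

# Part c: 10% growth per quarter, assumed to compound over 4 quarters
data_after_year_tb = initial_tb * (1 + 0.10) ** 4
nodes_after_year = data_nodes_needed(data_after_year_tb)
new_nodes = nodes_after_year - nodes_now

print(nodes_now)         # 97
print(nodes_after_year)  # 141
print(new_nodes)         # 44
```

Under these assumptions, 500 TB needs 500 / 2 * 3 = 750 TB of raw storage, and 750 / 7.8 rounds up to 97 data nodes. After a year the data grows to about 732 TB, which needs 141 nodes, so roughly 44 new machines. If the growth is read as a simple 10% of the initial size per quarter (700 TB after a year), the same function gives 135 nodes, or 38 new machines.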
