1) Consider a Hadoop job that processes an input data file whose size equals 88 disk blocks (88 distinct blocks; you can assume the HDFS replication factor is set to 1). The mapper in this job requires 2 minutes to read and fully process a single block of data. The reducer requires 1 second (not 1 minute) to produce an answer for one key's worth of values, and there are 6000 distinct keys in total (the mappers generate many key-value pairs, but keys only occur in the 1-6000 range, for 6000 unique entries). Assume that each node runs a reducer and that the keys are distributed evenly across nodes.
a) How long will it take to complete the job if you had only one Hadoop worker node? For the sake of simplicity, assume that only one mapper and only one reducer are created on each node.
b) 30 Hadoop worker nodes?
c) 50 Hadoop worker nodes?
d) 100 Hadoop worker nodes?
e) Would changing the replication factor have any effect on your answers for a-d? You can ignore the network transfer costs as well as the possibility of node failure.
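
A minimal Python sketch of the timing arithmetic, assuming the map and reduce phases run one after the other and each phase finishes when its slowest node does (the constant names and the job_minutes helper are illustrative, not part of the original question):

import math

BLOCKS = 88                 # input file spans 88 HDFS blocks
MAP_MIN_PER_BLOCK = 2       # minutes for a mapper to process one block
KEYS = 6000                 # distinct keys reaching the reducers
REDUCE_SEC_PER_KEY = 1      # seconds for a reducer to handle one key

def job_minutes(nodes):
    # One mapper per node: the map phase runs in ceil(BLOCKS / nodes)
    # waves, each wave lasting MAP_MIN_PER_BLOCK minutes.
    map_min = math.ceil(BLOCKS / nodes) * MAP_MIN_PER_BLOCK
    # One reducer per node with keys split evenly: the busiest reducer
    # handles ceil(KEYS / nodes) keys at REDUCE_SEC_PER_KEY seconds each.
    reduce_min = math.ceil(KEYS / nodes) * REDUCE_SEC_PER_KEY / 60
    return map_min + reduce_min

for n in (1, 30, 50, 100):
    print(f"{n:>3} node(s): {job_minutes(n):6.2f} minutes")

For 1, 30, 50, and 100 nodes this prints 276.00, 9.33, 6.00, and 3.00 minutes respectively. Under the stated assumptions (network transfer costs and node failures ignored), changing the replication factor does not alter these figures.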
