2) Consider a Hadoop job that processes an input data file of 189 disk blocks (189 distinct blocks, not counting the HDFS replication factor). The mapper in this job requires 1 minute to read and fully process a single block of data. The reducer requires 1 second (not 1 minute) to produce an answer for one key's worth of values, and there are 2000 distinct keys in total (mappers generate many more key-value pairs, but the keys all fall in the range 1-2000, for 2000 unique entries). Assume that each node has a reducer and that the keys are distributed evenly. The total cost is the time to perform the Map phase plus the time to perform the Reduce phase.

a) How long will it take to complete the job if you had only one Hadoop worker node? Assume that only one mapper and only one reducer are created on every node.
b) With 30 Hadoop worker nodes?
c) With 60 Hadoop worker nodes?
d) With 100 Hadoop worker nodes?
e) Would changing the replication factor affect your answers for a-d?
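
One way to approach parts a-d is to treat each phase as running in waves. This is a minimal sketch, not an expert-verified answer, and it rests on assumptions the question does not state outright: mappers on different nodes run fully in parallel, so the Map phase takes ceil(blocks / nodes) minutes; the busiest reducer handles ceil(keys / nodes) keys; the Reduce phase starts only after the Map phase ends; and shuffle, scheduling, and startup costs are ignored. The function name `job_seconds` and the constants are illustrative only.

```python
import math

MAP_SECONDS_PER_BLOCK = 60    # 1 minute per block (from the question)
REDUCE_SECONDS_PER_KEY = 1    # 1 second per key (from the question)


def job_seconds(nodes, blocks=189, keys=2000):
    """Estimated job time in seconds under the simplified wave model."""
    # Assumption: one mapper per node, so blocks are processed in waves
    # of `nodes` blocks at a time.
    map_waves = math.ceil(blocks / nodes)
    map_time = map_waves * MAP_SECONDS_PER_BLOCK

    # Assumption: keys are spread as evenly as possible, so the phase
    # lasts as long as the reducer holding the most keys.
    keys_on_busiest_reducer = math.ceil(keys / nodes)
    reduce_time = keys_on_busiest_reducer * REDUCE_SECONDS_PER_KEY

    # Assumption: phases do not overlap; shuffle cost is ignored.
    return map_time + reduce_time


for n in (1, 30, 60, 100):
    print(n, job_seconds(n))
# prints:
# 1 13340
# 30 487
# 60 274
# 100 140
```

Under these assumptions the estimates are 13340 s (about 222 min 20 s) for one node, 487 s for 30 nodes, 274 s for 60 nodes, and 140 s for 100 nodes. The replication factor does not appear anywhere in this model, which suggests it would not change the estimates for a-d as long as every mapper is charged the same 1 minute per block regardless of where the block is stored.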

