Question:

Consider a distributed system of S computing nodes and a distributed clustering technique that consists of two main steps:

1) Each node executes a local clustering algorithm on its local data.

2) Each node sends its results to a server (one of the nodes, elected for this role), which aggregates the local results to produce global clusters.

The main steps of the algorithm are as follows:

Step 1: Given S nodes, partition the data objects into S nonempty subsets.

Step 2: Distribute the subsets among the S computing nodes.

Step 3: Execute on each node a clustering algorithm on its local data.

Step 4: Each node sends its results to the server.

Step 5: Aggregate the local results to produce global clusters.
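The five steps above can be sketched in a single process as follows. This is only an illustration under stated assumptions: the question does not fix the local clustering algorithm or the aggregation rule, so the sketch assumes 1-D data, a tiny deterministic k-means as the local algorithm, and a server that merges local centroids lying within a hypothetical `merge_dist` of each other. All function names are invented for this example.

```python
from statistics import mean


def partition(data, s):
    """Steps 1-2: split the data into s nonempty subsets (round-robin)."""
    if not 1 <= s <= len(data):
        raise ValueError("need 1 <= s <= len(data)")
    return [data[i::s] for i in range(s)]


def local_kmeans(points, k, iters=10):
    """Step 3: assumed local algorithm, a small 1-D k-means.

    Returns one (centroid, size) summary per nonempty local cluster.
    """
    pts = sorted(points)
    k = min(k, len(pts))
    # Deterministic initialisation: evenly spaced samples of the sorted data.
    centroids = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in pts:
            nearest = min(range(len(centroids)),
                          key=lambda j: abs(p - centroids[j]))
            groups[nearest].append(p)
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return [(c, len(g)) for c, g in zip(centroids, groups) if g]


def aggregate(summaries, merge_dist):
    """Steps 4-5: the server merges nearby local centroids (size-weighted)."""
    merged = []
    for c, n in sorted(summaries):
        if merged and abs(c - merged[-1][0]) <= merge_dist:
            pc, pn = merged[-1]
            merged[-1] = ((pc * pn + c * n) / (pn + n), pn + n)
        else:
            merged.append((c, n))
    return merged


def distributed_clustering(data, s, k, merge_dist):
    """Run all five steps; each loop iteration stands in for one node."""
    summaries = []
    for subset in partition(data, s):
        summaries.extend(local_kmeans(subset, k))
    return aggregate(summaries, merge_dist)
```

Calling `distributed_clustering(data, s=2, k=2, merge_dist=1.0)` on a small list of floats returns a list of `(centroid, size)` pairs for the global clusters. Only these summaries, not the raw points, travel to the server in Step 4.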

i) Recall the main concept of Map/Reduce.

ii) Define the inputs and the outputs of the Map and Reduce functions for this distributed algorithm.

iii) Using the Map/Reduce model, define the mapper and reducer of this distributed algorithm.

iv) S is usually chosen dynamically by the cloud resource manager. How would this affect the results of the algorithm?
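As one possible way to think about parts ii) and iii), the sketch below simulates the Map/Reduce shape of the algorithm in a single process. It is not a reference answer. It assumes 1-D data and swaps in a very simple grid-based rule as the local clustering step (points falling in the same cell of hypothetical width `CELL` form a local cluster). The names `mapper`, `reducer`, and `run` are illustrative and not tied to any real framework.

```python
from collections import defaultdict

CELL = 1.0  # hypothetical grid width; stands in for a real clustering rule


def mapper(node_id, points):
    """Map. Input: (node id, that node's local points).

    Output: one (cell, (sum, count)) pair per local cluster on this node.
    """
    local = defaultdict(lambda: [0.0, 0])
    for p in points:
        cell = int(p // CELL)
        local[cell][0] += p
        local[cell][1] += 1
    return [(cell, (s, n)) for cell, (s, n) in local.items()]


def reducer(cell, partials):
    """Reduce. Input: a cell key and every partial (sum, count) emitted for it.

    Output: (cell, (global centroid, global size)).
    """
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return cell, (total / count, count)


def run(partitions):
    """Simulate map, shuffle (group by key), and reduce in one process."""
    shuffled = defaultdict(list)
    for node_id, points in enumerate(partitions):
        for key, value in mapper(node_id, points):
            shuffled[key].append(value)
    return dict(reducer(k, v) for k, v in sorted(shuffled.items()))
```

Here `len(partitions)` plays the role of S. With this particular stand-in rule the per-node sums and counts combine exactly, so the merged clusters do not depend on how the data was split. A local algorithm such as k-means generally does not have this property, which is what part iv) asks about.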
