Question: Consider the all - reduce operation in which each processor starts with an array of m words, and needs to get the global sum of

Consider the all-reduce operation in which each processor starts with an array of m words, and needs to get the global
sum of the respective words in the array at each processor. This operation can be implemented on a ring using one of
the following three alternatives:
(i) All-to-all broadcast of all the arrays followed by a local computation of the sum of the respective elements of the
array.
(ii) Single node accumulation of the elements of the array, followed by a one-to-all broadcast of the result array.
(iii) An algorithm that uses the pattern of the all-to-all broadcast, but simply adds numbers rather than concatenating
messages.
(a)(9 points) For each of the above cases, compute the run time in terms of m, ts, and tw.
(b)(1 point) Assume that ts =100, tw =1, and m is very large. Which of the three alternatives (among (i),(ii) or
(iii)) is better?
(c)(1 point) Assume that ts =100, tw =1, and m is very small (say 1). Which of the three alternatives (among (i),
(ii) or (iii)) is better?

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!