A multiprocessor machine has 1024 processors. On this machine we map a computation in which N iterate values must be computed and then exchanged between the processors. Values are broadcast on a bus after each iteration. Each iteration proceeds in two phases. In the first phase each processor computes a subset of the N iterates. Each processor is assigned the computation of K = N/P iterates, where P is the number of processors involved. In the second, communication phase each processor broadcasts its results to all other processors, one by one. Every processor waits for the end of the communication phase before starting a new computation phase. Let Tc be the time to compute one iterate and let Tb be the time to broadcast one value on the bus. We define the computation-to-communication ratio R as Tc/Tb. Note that, when P = 1, no communication is required.

At first, we use the premise of Amdahl's speedup (i.e., the same workload spread across an increasing number of processors). Under these conditions:
(a) Compute the speedup as a function of P and R, for K = 1, 2, . . . , 1024.
(b) Compute the maximum possible speedup as a function of P and R.
(c) Compute the minimum number of processors needed to reach a speedup greater than 1 as a function of P and R.

Second, we use the premise of Gustafson's law, namely that the uniprocessor workload grows with the number of processors so that the execution time on the multiprocessor is the same as that on the uniprocessor. Assume that the uniprocessor workload computes 1024 iterates.
(d) What should the size of the workload be (as a number of iterates) when P processors are used, as a function of P and R? Pick the closest integer value for the number of iterates.
(e) Reconsider (a) through (c) above in the context of growing workload sizes, according to Gustafson's law.

Third, we now consider the overhead needed to broadcast values over the bus. Because of software and bus protocol overheads, each bus transfer requires a fixed amount of time, independent of the size of the transfer. Thus the time needed to broadcast K iterate values on the bus by each processor at the end of each iteration is now T0 + K Tb.
(f) Using the constant workload size assumption (as in Amdahl's law), what is the maximum possible speedup?
(g) Using the growing workload size assumption (as in Gustafson's law), what is the maximum possible speedup?
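
One way to get a feel for the timing model before deriving closed forms is to evaluate it numerically. The Python sketch below is an illustration under stated assumptions, not the reference solution: it assumes the bus is a single shared resource, so the P broadcasts in the communication phase are serialized and each costs T0 + K Tb (with T0 = 0 for the first two parts of the question). The function names `iteration_time` and `speedup`, and the sample values of Tc and Tb, are hypothetical and chosen only for this example.

```python
def iteration_time(n, p, tc, tb, t0=0.0):
    """Time of one iteration for n iterates on p processors (assumed model)."""
    k = n / p                 # K = N/P iterates per processor
    compute = k * tc          # computation phase: all processors work in parallel
    if p == 1:
        return compute        # the problem states no communication when P = 1
    # Assumption: broadcasts share one bus, so the p transfers happen one after
    # another, each taking t0 + K*tb.
    communicate = p * (t0 + k * tb)
    return compute + communicate


def speedup(n, p, tc, tb, t0=0.0):
    """Fixed-workload (Amdahl-style) speedup relative to one processor."""
    return iteration_time(n, 1, tc, tb, t0) / iteration_time(n, p, tc, tb, t0)


if __name__ == "__main__":
    n, tc, tb = 1024, 8.0, 1.0    # illustrative values only, giving R = Tc/Tb = 8
    for p in (1, 2, 4, 16, 256, 1024):
        print(p, round(speedup(n, p, tc, tb), 3))
```

Passing a nonzero `t0` to the same functions models the fixed per-transfer overhead introduced in the third part. If the serialized-bus assumption does not match the intended reading of the problem, only the `communicate` line needs to change.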
