Question:

Suppose you are designing a machine that will frequently have to perform 64 consecutive iterations of the same task (for example, a vector processor with 64-element vector registers). You want to implement a pipeline that will help speed up this task as much as is reasonably possible, but recognize that dividing a pipeline into more stages takes up more chip area and adds to the cost of implementation.

a. Make the simplifying assumptions that the task can be subdivided as finely or coarsely as desired and that pipeline registers do not add a delay. Also assume that one complete iteration of the task takes 16 ns (thus, a nonpipelined implementation would take 64 × 16 = 1024 ns to complete 64 iterations). Consider possible pipelined implementations with 2, 4, 8, 16, 24, 32, and 48 stages. What is the total time required to complete 64 iterations in each case? What is the speedup in each case? Considering cost as well as performance, what do you think is the best choice for the number of stages in the pipeline? Explain.
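One way to tabulate part (a) is sketched below. It assumes the usual ideal pipeline timing model, which the question does not state explicitly: with k equal stages the clock period is 16/k ns, the first result appears after k cycles, and each later result takes one more cycle, so n iterations need k + n - 1 cycles. The function names are illustrative only.

```python
# Ideal pipeline model (assumption): equal stages, no register delay.
TASK_NS = 16.0        # time for one complete, nonpipelined iteration
ITERATIONS = 64       # consecutive iterations to perform
CANDIDATE_STAGES = [2, 4, 8, 16, 24, 32, 48]


def ideal_total_time_ns(stages, iterations=ITERATIONS, task_ns=TASK_NS):
    """Total time for `iterations` results through `stages` equal stages."""
    cycle_ns = task_ns / stages            # task split evenly across stages
    cycles = stages + iterations - 1       # fill the pipe, then one result per cycle
    return cycles * cycle_ns


def ideal_speedup(stages, iterations=ITERATIONS, task_ns=TASK_NS):
    """Speedup relative to the nonpipelined time of iterations * task_ns."""
    return (iterations * task_ns) / ideal_total_time_ns(stages, iterations, task_ns)


for k in CANDIDATE_STAGES:
    print(f"{k:2d} stages: {ideal_total_time_ns(k):7.2f} ns, "
          f"speedup {ideal_speedup(k):5.2f}")
```

Under this model the total time keeps falling as stages are added, but each doubling of the stage count buys a smaller improvement, which is the trade-off against chip area that the question asks you to weigh.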

b. Now assume that a total of 32 levels of logic gates are required to perform the task, each with a propagation delay of 0.5 ns (thus, the total time to produce a single result is still 16 ns). Logic levels cannot be further subdivided. Also assume that each pipeline register has a propagation delay equal to that of two levels of logic gates, or 1 ns. Reanalyze the problem; does your previous recommendation still hold? If not, how many stages would you recommend for the pipelined implementation under these conditions?
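Part (b) can be explored with a similar sketch. It rests on assumptions that go beyond the question text: the clock period is set by the stage holding the most logic levels, ceil(32/k) of them, plus the 1 ns register delay; a stage count above 32 simply leaves some stages with no logic; and speedup is still measured against the 1024 ns nonpipelined figure. The function names are illustrative only.

```python
import math

# Part (b) model (assumptions): indivisible logic levels plus register delay.
LOGIC_LEVELS = 32     # levels of gates needed for one iteration
GATE_NS = 0.5         # propagation delay per logic level
REGISTER_NS = 1.0     # propagation delay per pipeline register
ITERATIONS = 64
CANDIDATE_STAGES = [2, 4, 8, 16, 24, 32, 48]


def cycle_time_ns(stages):
    """Clock period: slowest stage's logic delay plus one register delay."""
    levels_in_slowest_stage = math.ceil(LOGIC_LEVELS / stages)
    return levels_in_slowest_stage * GATE_NS + REGISTER_NS


def total_time_ns(stages, iterations=ITERATIONS):
    """Total time for `iterations` results: (stages + iterations - 1) cycles."""
    return (stages + iterations - 1) * cycle_time_ns(stages)


def speedup(stages, iterations=ITERATIONS):
    """Speedup relative to the nonpipelined 64 x 16 ns baseline."""
    baseline_ns = iterations * LOGIC_LEVELS * GATE_NS
    return baseline_ns / total_time_ns(stages, iterations)


for k in CANDIDATE_STAGES:
    print(f"{k:2d} stages: cycle {cycle_time_ns(k):4.1f} ns, "
          f"total {total_time_ns(k):6.1f} ns, speedup {speedup(k):4.2f}")
```

With these assumptions, the 24-stage and 48-stage designs have the same clock period as the 16-stage and 32-stage designs respectively, so their extra stages only lengthen the fill time. That is worth checking before reusing the recommendation from part (a).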
