Question: I just need the answer to part d, e, and f. Thank you! (a) A scalar pipeline has 4 pipeline

I just need the answers to parts (d), (e), and (f). Thank you!
(a) A scalar pipeline has 4 pipeline stages A, B, C, D. A data item entering this pipeline must be
processed by the four units A, B, C, and D in the order A -> B -> C -> D. Suppose that units A, B, and D
each take 2 clock cycles to process a data item, and unit C takes 3 clock cycles to process a data item. How
many clock cycles does it take for this scalar pipeline to process 1000 independent operations? What is
the speedup of this scalar pipeline?
(b) If the 3rd stage C is replicated into 2 identical functional units to make the scalar pipeline a superscalar
pipeline, then how many clock cycles does it take for this superscalar pipeline to process 1000
independent operations? How many cycles does it take this superscalar pipeline to process 1
operation?
(c) Instead of increasing the number of processing units at stage C to 2, we partition stage A into
sub-stages A1 and A2, partition stage B into B1 and B2, partition stage C into C1, C2, and C3, and partition
stage D into D1 and D2, with each sub-stage having a processing latency of 1 cycle. Then we can view
this new pipeline as a 9-stage scalar pipeline. How many clock cycles does it take this 9-stage scalar
pipeline to process 1000 independent operations?
(d) Suppose that there are 1000 operations to be processed by the 9-stage scalar pipeline in (c), and that the
only data dependence among them is between the 8th and the 10th operations, in the
following way: the input to stage B1 of the 10th operation is the output of stage C3 of the 8th
operation. Then how many cycles does it take this scalar pipeline to process these 1000 operations?
(e) If the dependency in (d) is detectable during software development, can you think of a software
solution that eliminates the stalling caused by the dependence described in (d)?
(f) Instead of further partitioning each stage into sub-stages as in (c), we use two processing units for
stages A, B, D, and use three processing units for stage C, obtaining a superscalar pipeline. Then how
many clock cycles does it take this superscalar pipeline to process 1000 independent operations?
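
One way to sanity-check parts (c) through (f) is a small timing simulation. The sketch below is not part of the original question. It assumes that operations enter in order, that each unit is busy for its full latency, that an operation may wait between stages until a unit is free, and that the C3 result is forwarded and usable in the cycle right after C3 finishes. The function name `total_cycles` and the dependence tuple format are hypothetical, chosen only for illustration.

```python
def total_cycles(stages, n_ops, deps=()):
    """Return the cycle in which the last operation leaves the pipeline.

    stages: list of (latency_in_cycles, number_of_identical_units)
    deps:   tuples (producer_op, producer_stage, consumer_op, consumer_stage);
            operations are numbered from 1, stages from 0 (assumed format)
    """
    # free[s][u] = time at which unit u of stage s becomes available
    free = [[0] * units for _, units in stages]
    done = {}  # (op, stage) -> time at which that stage's result is ready
    finish = 0
    for op in range(1, n_ops + 1):
        t = 0  # time at which this operation is ready for its next stage
        for s, (latency, units) in enumerate(stages):
            # Wait for any producer result this stage depends on.
            for p_op, p_stage, c_op, c_stage in deps:
                if c_op == op and c_stage == s:
                    t = max(t, done[(p_op, p_stage)])
            # Take the unit of this stage that frees up earliest.
            u = min(range(units), key=lambda k: free[s][k])
            start = max(t, free[s][u])
            t = start + latency
            free[s][u] = t
            done[(op, s)] = t
        finish = t
    return finish


# Stage indices in the 9-stage pipeline: A1=0, A2=1, B1=2, B2=3,
# C1=4, C2=5, C3=6, D1=7, D2=8.
NINE_STAGE = [(1, 1)] * 9
B1, C3 = 2, 6

part_c = total_cycles(NINE_STAGE, 1000)
part_d = total_cycles(NINE_STAGE, 1000, deps=[(8, C3, 10, B1)])
# (e) Reordering: the dependent operation is issued in slot 13 instead of 10.
part_e = total_cycles(NINE_STAGE, 1000, deps=[(8, C3, 13, B1)])
# (f) Two units for A, B, D (2 cycles each), three units for C (3 cycles).
part_f = total_cycles([(2, 2), (2, 2), (3, 3), (2, 2)], 1000)

print(part_c, part_d, part_e, part_f)  # prints: 1008 1011 1008 1008
```

Under these assumptions, (c) takes 9 + 999 = 1008 cycles, and the dependence in (d) adds 3 stall cycles for a total of 1011. For (e), one possible software fix is instruction reordering: since all other operations are independent, the dependent operation can be scheduled at least 5 issue slots after its producer (for example in slot 13 instead of slot 10), which removes the stall in this model. In (f) every stage can accept one operation per cycle on average and a single operation needs 2 + 2 + 3 + 2 = 9 cycles, so the total is again 1008. A different forwarding or issue model would shift these numbers.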
