Question:

Consider the following code:

Loop: L.D    F2, 0(R1)
      L.D    F4, -8(R1)
      MUL.D  F6, F2, F4
      S.D    F6, 0(R2)
      DADDUI R1, R1, #-16
      DADDUI R2, R2, #-8
      BNE    R1, R3, Loop

Assume the same latencies and initiation intervals as in Fig C.34. Ignore the delay caused by the branch.

(a) Show the timing diagram of this code with full forwarding hardware. Use a timing diagram like the one shown in Fig C.5. How many cycles does this loop take to execute?

(b) Use static scheduling to reduce the total number of stalls in the loop. Show the new timing diagram. In this case, how many cycles does this loop take to execute?

(c) Use loop unrolling and static scheduling so that the pipeline does not stall at all. What is the minimum number of unrolls required to remove all the stalls? How many cycles does this loop take to execute? How many new registers do you need to implement this unrolling? Show the new timing diagram, in the format shown in lecture.

(d) Compare the number of clocks required in each case. Assuming infinite unrolls, what is the minimum number of clocks required to implement one iteration of THE ORIGINAL loop?

NOTE: Book reference: Computer Architecture: A Quantitative Approach, 5th edition.
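To make clear what the loop in the question computes before any timing analysis, here is a minimal functional sketch in Python. It is an illustration only, not part of the original question: the function name `run_loop`, the dictionary-based memory model, and the sample addresses are all assumptions, and the sketch says nothing about stalls or cycle counts.

```python
def run_loop(mem, r1, r2, r3):
    """Functional model of the loop body (hypothetical helper, not from the book).

    mem maps byte addresses to double-precision values. The branch sits at
    the bottom of the loop, so the body always runs at least once.
    Returns the number of iterations executed.
    """
    iterations = 0
    while True:
        f2 = mem[r1]        # L.D    F2, 0(R1)
        f4 = mem[r1 - 8]    # L.D    F4, -8(R1)
        f6 = f2 * f4        # MUL.D  F6, F2, F4
        mem[r2] = f6        # S.D    F6, 0(R2)
        r1 -= 16            # DADDUI R1, R1, #-16
        r2 -= 8             # DADDUI R2, R2, #-8
        iterations += 1
        if r1 == r3:        # BNE    R1, R3, Loop (exit when equal)
            return iterations


# Assumed sample data: four doubles at addresses 8..32, results stored near 100.
memory = {32: 2.0, 24: 3.0, 16: 4.0, 8: 5.0}
count = run_loop(memory, r1=32, r2=108, r3=0)
```

Each iteration reads two adjacent doubles, multiplies them, and stores one result, so the source pointer R1 moves by 16 bytes while the destination pointer R2 moves by 8. This also shows the dependence chain the timing diagrams must respect: both loads feed MUL.D, and MUL.D feeds S.D.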
