Question: Hello. I need help for my Computer Architecture class. I just want to compare my solution to you all if I did it right or
Using the pipeline latencies below, unroll the following loop with two iterations so it may be scheduled with reduced delay. Use register renaming. instruction rearrangement, or delay slot filling to optimize the unrolled loop. Assume a o ne-cycle delayed branch FO, 0(R1) FO,FO,F2 F4, O(R2) FO. FO. F.4 O(R2). F0 R1, R1, 8 R2, R2, 8 R1, Loop Loop LD LD ADDD SD SUBI SUBI BNEZ The latencies in clock cycles are: 3 for FP ALU Op producing a result for another FP ALU Op 2 for FP ALU Op producing a result for SD 1 for Load Double producing a result for a FP ALU Op 0 for Load Double producing a result for a Store Double
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
