Question: 1 . ) ( 2 5 points ) a - ) What is the baseline performance ( in cycles, per loop iteration ) of the
points
a What is the baseline performance in cycles, per loop iteration of the code sequence in Figure
if no new instruction's execution could be initiated until the previous instruction's execution had
completed? Ignore frontend fetch and decode. Assume that execution does not stall for lack
of the next instruction, but only one instructioncycle can be issued. Assume the branch is taken,
and that there is a onecycle branch delay slot. In the following code, you may assume is as
register, register is as register
Figure : Code and latencies for question
b Considering true data dependencies and functional unit latencies, reorderschedule the
instructions to improve performance of the code in Figure Calculate the required cycles per
iteration of the loop.
c Using different registers to prevent name dependencies, handunroll two iterations of the loop in
your reordered code obtained from b Calculate the required cycles per iteration of the loop.
d Now, reorderschedule the unrolled code obtained from c Calculate the required cycles per
iteration of the loop.
Step by Step Solution
There are 3 Steps involved in it
1 Expert Approved Answer
Step: 1 Unlock
Question Has Been Solved by an Expert!
Get step-by-step solutions from verified subject matter experts
Step: 2 Unlock
Step: 3 Unlock
