Question: Please solve all parts In this exercise we compare the performance of 1 - issue and 2 - issue processors, taking into account program transformations
Please solve all parts
In this exercise we compare the performance of issue and issue processors, taking into account program transformations that can
be made to optimize for issue execution. Problems in this exercise refer to the following loop written in C:
for ;;
;
A compiler doing little or no optimization might produce the following MIPS assembly code:
addi
jal ENT
TOP: slli
add
lw
lw
sub
add
sw
addi
ENT: bne TOP
The code above uses the following registers:
Assume the twoissue, statically scheduled processor for this exercise has the following properties:
One instruction must be a memory operation; the other must be an arithmeticlogic instruction or a branch.
The processor has all possible forwarding paths between stages including paths to the ID stage for branch resolution
The processor has perfect branch prediction.
Two instruction may not issue together in a packet if one depends on the other.
If a stall is necessary, both instructions in the issue packet must stall.
As you complete these exercises, notice how much effort goes into generating code that will produce a nearoptimal speedup.
a Draw a pipeline diagram showing how RISCV code given above executes on the twoissue processor. Assume that the loop exits
after two iterations.
b What is the speedup of going from a oneissue to a twoissue processor? Assume the loop runs thousands of iterations.
c Rearrangerewrite the RISCV code given above to achieve better performance on the oneissue processor. Hint: Use the
instruction "beqz $ Done" to skip the loop entirely if
d Rearrangerewrite the RISCV code given above to achieve better performance on the twoissue processor. Do not unroll the
loop, however.
e Repeat Exercise but this time use your optimized code from Exercise
f What is the speedup of going from a oneissue processor to a twoissue processor when running the optimized code from
Exercises and
g Unroll the RISCV code from Exercise so that each iteration of the unrolled loop handles two iterations of the original loop.
Then, rearrangerewrite your unrolled code to achieve better performance on the oneissue processor. You may assume that is a
multiple of
h Unroll the RISCV code from Exercise so that each iteration of the unrolled loop handles two iterations of the original loop.
Then, rearrangerewrite your unrolled code to achieve better performance on the twoissue processor. You may assume that is a
multiple of Hint: Reorganize the loop so that some calculations appear both outside the loop and at the end of the loop. You
may assume that the values in temporary registers are not needed after the loop.
i What is the speedup of going from a oneissue processor to a twoissue processor when running the unrolled, optimized code
from Exercises and
j Repeat Exercises and but this time assume the twoissue processor can run two arithmeticlogic instructions
together. In other words, the first instruction in a packet can be any type of instruction, but the second must be an arithmetic or
logic instruction. Two memory operations cannot be scheduled at the same time.
Step by Step Solution
There are 3 Steps involved in it
1 Expert Approved Answer
Step: 1 Unlock
Question Has Been Solved by an Expert!
Get step-by-step solutions from verified subject matter experts
Step: 2 Unlock
Step: 3 Unlock
