Question: This problem explores the impact of static and dynamic branch prediction. Assume a 5-stage single-pipeline microarchitecture (Fetch, Decode, Execute, Memory, Write back) and that the code is a backwards loop. All operations take 1 cycle except the following: an LD or SD instruction takes 3 cycles in the memory access (MEM) stage, and a branch instruction takes 2 cycles in the execution (EX) stage. Use forwarding and stalls if needed. Show the phases of each instruction per clock cycle for one iteration of the loop. Indicate at which cycle the first instruction of the next loop iteration starts.

Loop: LD    R3, 0(R5)
      LD    R1, 0(R3)
      DADDI R1, R1, #1
      DSUB  R4, R3, R2
      SD    R1, 0(R3)
      BNZ   R4, Loop
      LD    R3, 0(R5)    ; This is the first instruction of the next loop

(a) Without branch prediction: draw a diagram to show the pipeline stages for one iteration of the loop. How many clock cycles are required per loop iteration (also called loop length)?

(b) Assume a static branch predictor, capable of recognizing a backwards branch in the Decode stage. A backwards branch is then assumed always taken. Draw a diagram to show the pipeline stages for one iteration of the loop. How many clock cycles are required per loop iteration?

(c) Assume a dynamic branch predictor which issues "predicted taken" as the branch instruction is fetched. Draw a diagram to show the pipeline stages for one iteration of the loop. How many clock cycles are required per loop iteration?
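One way to check a hand-drawn pipeline diagram for this question is a small timing model. The Python sketch below is only an illustration under assumptions that the problem statement does not spell out: each stage holds one instruction at a time, a stalled instruction waits in its current stage, forwarded operands are needed at the start of EX (including the store data of SD), a load result is forwarded from the end of its 3-cycle MEM, and the next iteration is fetched the cycle after the branch finishes EX (no prediction), after its first Decode cycle (static), or after its Fetch cycle (dynamic). The names `schedule`, `next_fetch`, and `stage_latency` are hypothetical helpers, not part of the original problem.

```python
# Hypothetical timing model for the loop in the question; all names are illustrative.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

# (opcode, destination register or None, source registers)
LOOP = [
    ("LD",    "R3", ["R5"]),
    ("LD",    "R1", ["R3"]),
    ("DADDI", "R1", ["R1"]),
    ("DSUB",  "R4", ["R3", "R2"]),
    ("SD",    None, ["R1", "R3"]),
    ("BNZ",   None, ["R4"]),
]


def stage_latency(op, stage):
    """Cycles of work per stage: LD/SD spend 3 in MEM, the branch 2 in EX."""
    if stage == "MEM" and op in ("LD", "SD"):
        return 3
    if stage == "EX" and op == "BNZ":
        return 2
    return 1


def schedule(program):
    """Return, per instruction, the first cycle it spends in each stage."""
    ready = {}  # register -> last cycle of the stage that produces its value
    rows = []
    for op, dest, srcs in program:
        start = {}
        prev = rows[-1] if rows else None
        for k, stage in enumerate(STAGES):
            if k == 0:
                t = 1
            else:
                before = STAGES[k - 1]
                t = start[before] + stage_latency(op, before)
            if prev is not None:
                if k + 1 < len(STAGES):
                    # Assumption: a stage frees up when the previous
                    # instruction moves on to the following stage.
                    t = max(t, prev[STAGES[k + 1]])
                else:
                    t = max(t, prev["WB"] + 1)
            if stage == "EX":
                # Assumption: all operands are forwarded to the start of EX.
                for reg in srcs:
                    t = max(t, ready.get(reg, 0) + 1)
            start[stage] = t
        if dest is not None:
            produce = "MEM" if op == "LD" else "EX"
            ready[dest] = start[produce] + stage_latency(op, produce) - 1
        rows.append(start)
    return rows


def next_fetch(rows, scheme):
    """Cycle in which the first instruction of the next iteration is fetched."""
    branch = rows[-1]
    if scheme == "none":      # wait for the branch to finish its 2-cycle EX
        return branch["EX"] + 2
    if scheme == "static":    # redirect after the first Decode cycle
        return branch["ID"] + 1
    if scheme == "dynamic":   # redirect after Fetch, once IF is free
        return max(branch["IF"] + 1, branch["ID"])
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    rows = schedule(LOOP)
    for (op, _, _), row in zip(LOOP, rows):
        print(f"{op:<6}", row)
    for scheme in ("none", "static", "dynamic"):
        fetch = next_fetch(rows, scheme)
        print(scheme, "next fetch at cycle", fetch, "loop length", fetch - 1)
```

Under these particular assumptions the model fetches the next iteration at cycle 16, 14, and 13 for cases (a), (b), and (c), which corresponds to loop lengths of 15, 13, and 12 cycles. A course that uses different conventions (for example, stalling in Decode only, or forwarding store data to MEM instead of EX) may arrive at different counts, so the output is best treated as a cross-check rather than a reference answer.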
