Question:
5. (28 pts.) The following compiled code runs on a 5-stage pipelined MIPS processor with a 1.6 GHz clock speed and a supply voltage of 1.2 V. RF write and read can be done in one cycle.

            addi $s0, $zero, 1000
    again:  beq  $s0, $zero, out
            addi $s0, $s0, -4
            lw   $s4, 0($s1)
            add  $s2, $s4, $s2
            add  $s5, $s2, $s0
            addi $s1, $s0, 5
            j    again
    out:    add  $v0, $zero, $s2

a) (6 pts.) Assume no implementation for hazard detection, no branch delay scheduling, and that the branch condition is evaluated in the Execution stage. What is the execution time, after taking into account all stalls?

b) (6 pts.) Given the following characterized power parameters per pipeline stage, calculate the average power dissipation while running the given code on the processor. Assume the processor has no signal switching at any stage during stall cycles.

    Stage   Cdyn (nF)   Istatic (A)
    IF      0.2         2
    ID      0.8         1.4
    EX      1.6         0.6
    MEM     0.4         2.5
    WB      1           1.2

c) (10 pts.) The compiler has been enhanced to implement instruction reordering (but still no branch delay scheduling). Show reordered code for performance improvement, and calculate how much average CPI reduction can be achieved with your code.

d) (6 pts.) What is the total number of clock cycles that can be reduced compared to your answer in part (a) by implementing each of the following branch prediction schemes:
   i. Static prediction: Predict-Taken,
   ii. Dynamic 1-bit prediction with initial state of Predict-Not-Taken,
   iii. Dynamic 2-bit prediction with the below state diagram and initial state of Weakly-Predict-Not-Taken.

[State diagram: 2-bit predictor with two Predict Taken states and two Predict Not Taken states; transitions are labeled Taken / Not taken.]
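The arithmetic behind parts (b) and (d) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an official solution. All function and variable names are hypothetical. The power model assumes dynamic power of activity x Cdyn x Vdd^2 x f per stage (no 1/2 factor) plus static power of Vdd x Istatic. The 2-bit predictor is assumed to be the standard saturating counter, because the state diagram itself is not recoverable here. The branch outcome list assumes the loop register starts at 1000 and drops by 4 per iteration, so the beq falls through 250 times and is taken once.

```python
# Sketch only: every name below is hypothetical, not part of the original question.

# Per-stage parameters from the table in part (b):
# stage -> (dynamic capacitance in nF, static current in A)
STAGES = {
    "IF": (0.2, 2.0),
    "ID": (0.8, 1.4),
    "EX": (1.6, 0.6),
    "MEM": (0.4, 2.5),
    "WB": (1.0, 1.2),
}
VDD = 1.2      # supply voltage in volts
FREQ = 1.6e9   # clock frequency in hertz

# Assumed beq outcomes (True = taken): 250 fall-throughs, then one taken exit.
BEQ_OUTCOMES = [False] * 250 + [True]


def average_power(activity):
    """Return average power in watts.

    `activity` is the assumed fraction of cycles in which a stage switches
    (1.0 means no stall cycles). Assumes P_dyn = activity * C * Vdd^2 * f
    and P_static = Vdd * I_static, summed over all stages.
    """
    total_c = sum(c_nf * 1e-9 for c_nf, _ in STAGES.values())
    total_i = sum(i_static for _, i_static in STAGES.values())
    return activity * total_c * VDD**2 * FREQ + VDD * total_i


def count_mispredictions(outcomes, scheme):
    """Count wrong predictions for a sequence of branch outcomes."""
    if scheme not in ("static_taken", "1bit", "2bit"):
        raise ValueError(f"unknown scheme: {scheme}")
    # 1-bit state: 0 = predict not taken, 1 = predict taken.
    # 2-bit state: 0 strong NT, 1 weak NT, 2 weak T, 3 strong T (assumed
    # standard saturating counter). Initial states follow part (d).
    state = 1 if scheme == "2bit" else 0
    wrong = 0
    for taken in outcomes:
        if scheme == "static_taken":
            predicted = True
        elif scheme == "1bit":
            predicted = state == 1
        else:
            predicted = state >= 2
        if predicted != taken:
            wrong += 1
        # Update the predictor state with the actual outcome.
        if scheme == "1bit":
            state = 1 if taken else 0
        elif scheme == "2bit":
            state = min(state + 1, 3) if taken else max(state - 1, 0)
    return wrong
```

For part (b), the activity value would come from the instruction and stall counts found in part (a). For part (d), turning misprediction counts into saved cycles still depends on the stall model chosen in part (a), and the unconditional `j again` has to be accounted for separately.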
