Question: ( 1 5 points ) [ Pipeline Processing Time ] In this exercise, we examine how pipelining affects the clock cycle time of the processor.

(15points)[Pipeline Processing Time]
In this exercise, we examine how pipelining affects the clock cycle time of the processor.
Problems in this exercise assume that individual stages of the data path have the following
latencies, and the instructions are broken out by percentage of the type:
a) What is the clock cycle time in a pipelined and non-pipelined processor?
b) What is the total latency of an LW instruction in a pipelined and non-pipelined processor?
c) If we can split one stage of the pipelined data path into two new stages, each with half the
latency of the original stage, which stage would you split and what is the new clock cycle
time of the processor?
d) Assuming there are no stalls or hazards, what is the utilization of the data memory?
e) Assuming there are no stalls or hazards, what is the utilization of the write-register port of
the "Register Library" unit?
f) If we split one stage of the data path into two new stages, each with half the latency of the
original stage, which stage would you split and what is the new clock cycle time of the
pipelined and non-pipelined processor? What would be the speed up of the pipelined and
non-pipelined processor when this new data path is implemented?
g) If the control system has dedicated ALUs and with the new data path included, an
alternative general ALU is available that will run additions (and subtractions) in half the
time (ex: 120ps vs.240ps) but non-addition ALU operations now take a stall as
processing time overhead. What would be the total speedup with all improvements of the
pipelined system if 80% of ALU operations were additions?
h) What alternatives would there be on how to use this new ALU on the non-pipelined
system and how would it affect it?
2)(5 points)[Pipeline Design]
What is the minimum number of cycles needed to completely execute n instructions on a
CPU with a k stage pipeline? Justify your formula.
3)(10 points)[Data Hazards]
Assume that register $s0 is initialized to 11 and $s1 is initialized to 22. Suppose you
executed the code below on the pipeline from class (stages IF, ID, EX, MEM, WB) that does
not handle data hazards (i.e., the programmer is responsible for addressing data hazards by
inserting NOP instructions where necessary).
addi $s0, $s1,5
add $s2, $s0, $s1
addi $s3, $s0,15
add $s4, $s3, $s0
a) What would the final values of registers $s2, $s3 and $s4 be if run as is?
b) Rewrite the code segment and add NOP instructions so that it will run correctly on a
pipeline that does not handle data hazards.
4)(15 points)[Hazard Control Costs]
Consider a version of the pipeline that does not handle data hazards (i.e., the
programmer/compiler is responsible for addressing data hazards by inserting NOP
instructions where necessary). Suppose that (after optimization) a typical n-instruction
program requires an additional .4*n NOP instructions to correctly handle data hazards.
a) Suppose that the cycle time of this pipeline without forwarding is 250 ps. Suppose also
that adding forwarding hardware will reduce the number of NOPs from .4*n to .05*n, but
increase the cycle time to 300 ps. What is the speedup of this new pipeline compared to
the one without forwarding?
b) Different programs will require different amounts of NOPs. How many NOPs (as a
percentage of code instructions) can remain in the typical program before that program
runs slower on the pipeline with forwarding?
c) Now instead of .4 additional NOPs, let that additional amount be unknown x. What is
the resulting formula for determining the maximum number of NOP instructions before
the pipeline is slower than the non-pipelined machine?
d) Can a program with only .075*n NOPs possibly run faster on the pipeline with
forwarding? Explain why or why not.
e) At minimum, how many NOPs (as a percentage of code instructions) must a program
have before it can possibly run faster on the pipeline with forwarding?
5)(20 points)[Structural Hazards]
Consider the fragment of MIPS assembly below:
sw $s5,12($s3)
lw $s5,8($s3)
sub $s4, $s2, $s1
beq $s4, $zero, label
add $s2, $s0, $s1
sub $s2, $s6, $s1
Suppose we modify the pipeline so that it has only one memory (that handles both
instructions and data). In this case, there will be a structural hazard every time a program
needs to fetch an instruction during the same cycle in which another instruction accesses
data.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!