

Question: (15 points) [Pipeline Processing Time] In this exercise, we examine how pipelining affects the clock cycle time of the processor.

1) (15 points) [Pipeline Processing Time]

In this exercise, we examine how pipelining affects the clock cycle time of the processor. Problems in this exercise assume that individual stages of the data path have the following latencies, and the instructions are broken out by percentage of the type:

a) What is the clock cycle time in a pipelined and non-pipelined processor?

b) What is the total latency of an LW instruction in a pipelined and non-pipelined processor?

c) If we can split one stage of the pipelined data path into two new stages, each with half the latency of the original stage, which stage would you split and what is the new clock cycle time of the processor?

d) Assuming there are no stalls or hazards, what is the utilization of the data memory?

e) Assuming there are no stalls or hazards, what is the utilization of the write-register port of the "Register Library" unit?

f) If we split one stage of the data path into two new stages, each with half the latency of the original stage, which stage would you split and what is the new clock cycle time of the pipelined and non-pipelined processor? What would be the speedup of the pipelined and non-pipelined processor when this new data path is implemented?

g) If the control system has dedicated ALUs and with the new data path included, an alternative general ALU is available that will run additions (and subtractions) in half the time (ex: 120 ps vs. 240 ps) but non-addition ALU operations now take a stall as processing time overhead. What would be the total speedup with all improvements of the pipelined system if 80% of ALU operations were additions?

h) What alternatives would there be on how to use this new ALU on the non-pipelined system and how would it affect it?
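The relationships behind parts a) to c) can be sketched in a few lines of Python. This is a minimal sketch, not the graded answer: the stage latencies in `example` are HYPOTHETICAL placeholders (the latency table this problem refers to is not part of the text above), and the function names are made up for illustration. It assumes the pipelined clock is set by the slowest stage and the non-pipelined clock by the sum of all stages.

```python
def cycle_times(latencies_ps):
    """Return (pipelined, non_pipelined) clock cycle times in ps."""
    # Pipelined: the clock must fit the slowest stage.
    # Non-pipelined: one instruction passes through every stage in one cycle.
    return max(latencies_ps.values()), sum(latencies_ps.values())


def lw_latency(latencies_ps):
    """Return (pipelined, non_pipelined) total latency of one LW in ps."""
    pipelined_cycle, single_cycle = cycle_times(latencies_ps)
    # LW uses every stage, and each pipelined stage takes one full clock.
    return len(latencies_ps) * pipelined_cycle, single_cycle


def split_longest_stage(latencies_ps):
    """Split the slowest stage into two halves; return (stage_name, new_latencies)."""
    longest = max(latencies_ps, key=latencies_ps.get)
    result = {}
    for name, latency in latencies_ps.items():
        if name == longest:
            result[name + "1"] = latency / 2
            result[name + "2"] = latency / 2
        else:
            result[name] = latency
    return longest, result


# HYPOTHETICAL latencies in ps, chosen only to exercise the functions.
example = {"IF": 250, "ID": 350, "EX": 150, "MEM": 300, "WB": 200}

pipelined, non_pipelined = cycle_times(example)
split_name, after_split = split_longest_stage(example)
new_pipelined, new_non_pipelined = cycle_times(after_split)
```

With these placeholder numbers the split shortens the pipelined clock but leaves the non-pipelined clock unchanged, since the sum of the stage latencies is the same before and after the split.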

2) (5 points) [Pipeline Design]

What is the minimum number of cycles needed to completely execute n instructions on a CPU with a k stage pipeline? Justify your formula.
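One common way to reason about this, assuming an ideal pipeline with no stalls that issues one instruction per cycle: the first instruction needs k cycles to pass through all stages, and each of the remaining n - 1 instructions completes one cycle after its predecessor. The hypothetical helper `min_cycles` below is a sketch of that counting argument.

```python
def min_cycles(n, k):
    """Minimum cycles to finish n instructions on an ideal k-stage pipeline."""
    if n < 0 or k < 1:
        raise ValueError("need n >= 0 and k >= 1")
    if n == 0:
        return 0
    # k cycles to fill the pipeline with the first instruction,
    # then one more cycle for each of the other n - 1 instructions.
    return k + (n - 1)
```

For example, `min_cycles(4, 5)` counts 5 cycles for the first instruction plus 3 more for the rest.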

3) (10 points) [Data Hazards]

Assume that register $s0 is initialized to 11 and $s1 is initialized to 22. Suppose you executed the code below on the pipeline from class (stages IF, ID, EX, MEM, WB) that does not handle data hazards (i.e., the programmer is responsible for addressing data hazards by inserting NOP instructions where necessary).

    addi $s0, $s1, 5
    add $s2, $s0, $s1
    addi $s3, $s0, 15
    add $s4, $s3, $s0

a) What would the final values of registers $s2, $s3 and $s4 be if run as is?

b) Rewrite the code segment and add NOP instructions so that it will run correctly on a pipeline that does not handle data hazards.
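The effect of missing hazard handling can be explored with a small simulation. This is a sketch under stated assumptions, not the course's reference model: registers are read in ID and written in WB, a value written in WB is visible to an ID read in the same cycle (write in the first half, read in the second half), and registers the problem does not initialize ($s2, $s3, $s4) are ASSUMED to start at 0. The function name and tuple encoding are made up for illustration.

```python
def run_without_hazard_handling(program, regs, wb_visible_same_cycle=True):
    """Simulate add/addi/nop on a 5-stage pipeline with no forwarding or stalls."""
    regs = dict(regs)
    pending = []  # (wb_cycle, register, value), kept in program order
    for index, instr in enumerate(program):
        id_cycle = index + 2  # IF is cycle index + 1, ID the cycle after
        wb_cycle = index + 5  # WB is the fifth stage
        # Commit every earlier result whose WB has happened by this ID read.
        while pending and (
            pending[0][0] < id_cycle
            or (wb_visible_same_cycle and pending[0][0] == id_cycle)
        ):
            _, reg, value = pending.pop(0)
            regs[reg] = value
        if instr[0] == "nop":
            continue
        op, rd, rs, last = instr
        operand = last if op == "addi" else regs[last]
        pending.append((wb_cycle, rd, regs[rs] + operand))
    for _, reg, value in pending:  # drain the pipeline
        regs[reg] = value
    return regs


# $s2, $s3, $s4 starting at 0 is an assumption; the problem leaves them unspecified.
initial = {"$s0": 11, "$s1": 22, "$s2": 0, "$s3": 0, "$s4": 0}
nop = ("nop",)

as_is = [
    ("addi", "$s0", "$s1", 5),
    ("add", "$s2", "$s0", "$s1"),
    ("addi", "$s3", "$s0", 15),
    ("add", "$s4", "$s3", "$s0"),
]
with_nops = [
    as_is[0], nop, nop,
    as_is[1],
    as_is[2], nop, nop,
    as_is[3],
]

result_as_is = run_without_hazard_handling(as_is, initial)
result_with_nops = run_without_hazard_handling(with_nops, initial)
```

Under these assumptions the second and third instructions read the stale $s0, and the last instruction reads the updated $s0 but a stale $s3, so its result depends on whatever $s3 held beforehand. Two NOPs after each producer are enough for the consumer's ID to line up with the producer's WB.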

4) (15 points) [Hazard Control Costs]

Consider a version of the pipeline that does not handle data hazards (i.e., the programmer/compiler is responsible for addressing data hazards by inserting NOP instructions where necessary). Suppose that (after optimization) a typical n-instruction program requires an additional 0.4*n NOP instructions to correctly handle data hazards.

a) Suppose that the cycle time of this pipeline without forwarding is 250 ps. Suppose also that adding forwarding hardware will reduce the number of NOPs from 0.4*n to 0.05*n, but increase the cycle time to 300 ps. What is the speedup of this new pipeline compared to the one without forwarding?

b) Different programs will require different amounts of NOPs. How many NOPs (as a percentage of code instructions) can remain in the typical program before that program runs slower on the pipeline with forwarding?

c) Now instead of 0.4*n additional NOPs, let that additional amount be unknown x. What is the resulting formula for determining the maximum number of NOP instructions before the pipeline is slower than the non-pipelined machine?

d) Can a program with only 0.075*n NOPs possibly run faster on the pipeline with forwarding? Explain why or why not.

e) At minimum, how many NOPs (as a percentage of code instructions) must a program have before it can possibly run faster on the pipeline with forwarding?
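The arithmetic behind parts a), b), d) and e) can be sketched as follows. The sketch assumes execution time is roughly (instructions + NOPs) x cycle time, ignoring the few cycles needed to fill the pipeline; the function names are hypothetical and the default values are the ones given in the problem.

```python
def speedup_from_forwarding(nops_before=0.4, nops_after=0.05,
                            cycle_before_ps=250, cycle_after_ps=300):
    """Speedup of the forwarding pipeline over the one without forwarding."""
    time_before = (1 + nops_before) * cycle_before_ps  # per original instruction
    time_after = (1 + nops_after) * cycle_after_ps
    return time_before / time_after


def break_even_remaining_nops(nops_before=0.4,
                              cycle_before_ps=250, cycle_after_ps=300):
    """NOP fraction that may remain with forwarding before it becomes slower."""
    # Solve (1 + x) * cycle_after = (1 + nops_before) * cycle_before for x.
    return (1 + nops_before) * cycle_before_ps / cycle_after_ps - 1


def min_original_nops_to_benefit(cycle_before_ps=250, cycle_after_ps=300):
    """Smallest original NOP fraction for which forwarding can possibly win."""
    # Best case: forwarding removes every NOP, so compare
    # (1 + x) * cycle_before against 1 * cycle_after.
    return cycle_after_ps / cycle_before_ps - 1


speedup = speedup_from_forwarding()
remaining = break_even_remaining_nops()
threshold = min_original_nops_to_benefit()
can_benefit_at_0_075 = 0.075 > threshold
```

Under this model a program with only 0.075*n NOPs takes 1.075 x 250 ps per instruction without forwarding, which is already less than the 300 ps per instruction the forwarding pipeline needs even with zero NOPs.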

5) (20 points) [Structural Hazards]

Consider the fragment of MIPS assembly below:

    sw $s5, 12($s3)
    lw $s5, 8($s3)
    sub $s4, $s2, $s1
    beq $s4, $zero, label
    add $s2, $s0, $s1
    sub $s2, $s6, $s1

Suppose we modify the pipeline so that it has only one memory (that handles both instructions and data). In this case, there will be a structural hazard every time a program needs to fetch an instruction during the same cycle in which another instruction accesses data.
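To see where the single memory causes conflicts, the fetch schedule can be sketched in Python. This is an illustration under several assumptions that the text does not state: the pipeline has the five stages IF, ID, EX, MEM, WB; only lw and sw use the memory in MEM; a data access wins the conflict and the fetch retries in the next cycle; the branch is not taken; and data and control hazards are ignored. The function name is hypothetical.

```python
def fetch_cycles_with_single_memory(opcodes):
    """Return the cycle in which each instruction is fetched (1-indexed)."""
    busy = set()  # cycles in which the memory serves a data access
    fetch_cycles = []
    cycle = 0
    for op in opcodes:
        cycle += 1
        # Structural hazard: the fetch waits while memory is busy with data.
        while cycle in busy:
            cycle += 1
        fetch_cycles.append(cycle)
        if op in ("lw", "sw"):
            busy.add(cycle + 3)  # MEM is three stages after IF
    return fetch_cycles


fragment = ["sw", "lw", "sub", "beq", "add", "sub"]
schedule = fetch_cycles_with_single_memory(fragment)
total_cycles = schedule[-1] + 4  # the last instruction still needs ID, EX, MEM, WB
```

In this model the fetch of beq is delayed twice: once by the data access of sw, and again by the data access of lw one cycle later.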
