Question: Assignment 2: Processors, Pipelines, and Instruction Representation (15 points + 3 bonus)

Task 1: Pipelined Architecture

Assume you are programming in assembly on a 16-bit architecture which employs a five-stage pipeline for instruction execution. The five stages are:

(1) fetch next instruction
(2) decode instruction and fetch operands
(3) perform ALU operation
(4) read or write to memory
(5) store result in register

Your architecture has 8 general purpose registers split into two banks, with r0-r3 on bank 1 and r4-r7 on bank 2.

The following instructions are implemented:

load loads an immediate value into a register \( r_x \) (\( r_x \) and the immediate value offset are provided as arguments)

mov moves a value from a register \( r_x \) to a register \( r_y \) \( \left( r_y = r_x \right) \)

add adds the value in register \( r_x \) to the value in register \( r_y \) and stores the result in register \( r_y \) \( \left( r_y = r_x + r_y \right) \)

sub subtracts the value in register \( r_x \) from the value in register \( r_y \) and stores the result in register \( r_y \) \( \left( r_y = r_y - r_x \right) \)

cmp compares the value of registers \( r_x \) and \( r_y \) and sets the global register cmp to 1 if \( r_x \geq r_y \) and to 0 otherwise.

bne increments the program counter pc by an immediate value if the value of the global register cmp is not equal to 1 (pc = pc + offset)

jmp sets the program counter pc to a value \( r_x + \text{offset} \) and moves on to the next instruction.

a: Assuming that each stage of the pipeline takes three clock cycles to complete, how many instructions per clock cycle does the overall architecture execute? Explain under what conditions your answer holds true. What would be the consequence of improving stages (1), (2) and (3) to process one instruction per cycle? (3 points)
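As a rough sanity check for part a, the steady-state behaviour of an ideal pipeline can be sketched as follows. This is a minimal model and not an official solution: it assumes the pipeline is always full, never stalls, and that the slowest stage sets the rate at which instructions complete. The function name `pipeline_metrics` is made up for illustration.

```python
from fractions import Fraction


def pipeline_metrics(stage_cycles):
    """Return (latency, throughput) for an ideal, stall-free pipeline.

    latency: cycles for one instruction to pass through every stage.
    throughput: instructions completed per cycle in steady state,
    limited by the slowest stage (assumption: no stalls, pipeline full).
    """
    latency = sum(stage_cycles)
    throughput = Fraction(1, max(stage_cycles))
    return latency, throughput


# Every stage takes three cycles, as stated in part a.
original = pipeline_metrics([3, 3, 3, 3, 3])
# Stages (1) to (3) improved to one cycle each; (4) and (5) unchanged.
improved = pipeline_metrics([1, 1, 1, 3, 3])

print(original)  # (15, Fraction(1, 3))
print(improved)  # (9, Fraction(1, 3))
```

Under these assumptions, speeding up stages (1) to (3) shortens the latency of a single instruction but leaves the throughput unchanged, because stages (4) and (5) still take three cycles each.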

b: At which stage of the pipeline does the parallelism offered by the register banks become useful and why? (1 point)

c: For which instructions in the ISA does the use of register banks speed up execution? (1 point)

You have written the following assembly code:

load r0, 10    # load value 10 into register r0
load r1, 1     # load 1 into register r1
load r2, 0     # load 0 into register r2
load r3, 0     # load 0 into register r3
load r4, 1     # load 1 into register r4
load r5, 0     # load 0 into r5
while:
add r1, r2     # add contents of r1 and r2, store into r2
mov r4, r5     # i.e. r5 = r4
add r3, r4     # r4 = r3 + r4
mov r5, r3     # r3 = r5
cmp r2, r0     # cmp = r2 >= r0
bne while
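To see what the listing computes, it can be traced with a small interpreter. The sketch below is an illustration only and rests on assumptions: operands of mov, add and sub are read as source first and destination second (following the comments in the listing), branch targets are resolved as labels instead of numeric offsets, and 16-bit overflow is ignored. The names `run` and `PROGRAM` are hypothetical.

```python
def run(program, max_steps=10_000):
    """Interpret the pseudo-assembly and return the final register file."""
    regs = {f"r{i}": 0 for i in range(8)}
    cmp_flag = 0
    labels, code = {}, []
    for line in program:
        if line.endswith(":"):
            labels[line[:-1]] = len(code)   # label points at next instruction
        else:
            code.append(line.replace(",", " ").split())
    pc = steps = 0
    while pc < len(code) and steps < max_steps:
        op, *args = code[pc]
        pc += 1
        steps += 1
        if op == "load":                    # load rd, imm
            regs[args[0]] = int(args[1])
        elif op == "mov":                   # mov rs, rd  ->  rd = rs
            regs[args[1]] = regs[args[0]]
        elif op == "add":                   # add rs, rd  ->  rd = rs + rd
            regs[args[1]] = regs[args[0]] + regs[args[1]]
        elif op == "sub":                   # sub rs, rd  ->  rd = rd - rs
            regs[args[1]] = regs[args[1]] - regs[args[0]]
        elif op == "cmp":                   # cmp = 1 if ra >= rb else 0
            cmp_flag = 1 if regs[args[0]] >= regs[args[1]] else 0
        elif op == "bne":                   # branch when cmp != 1
            if cmp_flag != 1:
                pc = labels[args[0]]
        elif op == "jmp":                   # assumption: jump to a label
            pc = labels[args[0]]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


PROGRAM = [
    "load r0, 10", "load r1, 1", "load r2, 0",
    "load r3, 0", "load r4, 1", "load r5, 0",
    "while:",
    "add r1, r2", "mov r4, r5", "add r3, r4",
    "mov r5, r3", "cmp r2, r0", "bne while",
]

final = run(PROGRAM)
print(final["r2"], final["r3"], final["r4"])  # 10 55 89
```

With this reading of the operand order, r2 acts as a loop counter running up to the limit in r0, while r3 and r4 step through consecutive Fibonacci numbers.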

d: Re-assign register numbers to use the register banks correctly and avoid register conflicts. (2 points)
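One way to locate candidate conflicts for part d is to flag instructions whose two source registers sit in the same bank. The sketch below is a hedged illustration: it assumes that only add, sub and cmp read two registers at once and that each bank can serve one read at a time. Neither assumption is spelled out in the task, and the helper names are hypothetical.

```python
def bank(reg):
    """Bank 1 holds r0-r3, bank 2 holds r4-r7 (as stated in the task)."""
    return 1 if int(reg[1:]) <= 3 else 2


# Assumption: only instructions that READ two registers can conflict.
TWO_READS = {"add", "sub", "cmp"}


def same_bank_reads(program):
    """Return the instructions whose two source registers share a bank."""
    conflicts = []
    for line in program:
        op, *args = line.replace(",", " ").split()
        if op in TWO_READS and bank(args[0]) == bank(args[1]):
            conflicts.append(line)
    return conflicts


LOOP_BODY = ["add r1, r2", "mov r4, r5", "add r3, r4",
             "mov r5, r3", "cmp r2, r0", "bne while"]

print(same_bank_reads(LOOP_BODY))  # ['add r1, r2', 'cmp r2, r0']
```

A renumbering that places the two operands of each flagged instruction in different banks would pass this check; whether mov should also be counted depends on how register writes are modelled.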

e: Assume that register bank conflicts are not a problem (we've solved this somehow) and look at the original code above (not your modifications from d). How many pipeline stalls due to data hazards will occur while executing this program? Assume that no control hazards occur and that the pipeline is always fed with the next instruction from the correct branch. (3 points)
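A starting point for part e is to list the read-after-write (RAW) dependences between instructions that are close together in the loop body. The sketch below only finds dependences; it does not count stalls, since that depends on assumptions the task leaves open (for example, whether results are forwarded and in which stage a register becomes readable). It also ignores dependences that cross loop iterations. Function names and the `window` parameter are hypothetical.

```python
def reads_writes(line):
    """Return (read set, write set) for one pseudo-assembly instruction."""
    op, *args = line.replace(",", " ").split()
    if op == "load":
        return set(), {args[0]}
    if op == "mov":                       # mov rs, rd
        return {args[0]}, {args[1]}
    if op in ("add", "sub"):              # rd is both read and written
        return {args[0], args[1]}, {args[1]}
    if op == "cmp":
        return {args[0], args[1]}, {"cmp"}
    if op == "bne":
        return {"cmp"}, {"pc"}
    raise ValueError(f"unknown instruction: {op}")


def raw_dependences(program, window=2):
    """List (producer, consumer, name) triples at most `window` apart."""
    found = []
    for j, consumer in enumerate(program):
        reads, _ = reads_writes(consumer)
        for i in range(max(0, j - window), j):
            _, writes = reads_writes(program[i])
            for name in sorted(reads & writes):
                found.append((i, j, name))
    return found


LOOP_BODY = ["add r1, r2", "mov r4, r5", "add r3, r4",
             "mov r5, r3", "cmp r2, r0", "bne while"]

print(raw_dependences(LOOP_BODY))
# [(1, 3, 'r5'), (4, 5, 'cmp')]
```

Each triple names the index of the producing instruction, the index of the consuming instruction, and the register involved. Turning this list into a stall count requires fixing the pipeline timing assumptions first.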

f: Design a binary encoding for the instruction set. What type of operand encoding are you using? What is the largest offset that you can encode? (3 points)
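To make part f concrete, here is one possible fixed-width layout, sketched under stated assumptions and not the only valid design: instructions are 16 bits wide, the seven mnemonics get a 3-bit opcode, each register gets a 3-bit field, and load uses the remaining 10 bits as an unsigned immediate. The opcode numbering and function names are arbitrary choices for illustration.

```python
# Hypothetical opcode assignment; any 3-bit numbering would do.
OPCODES = {"load": 0, "mov": 1, "add": 2, "sub": 3,
           "cmp": 4, "bne": 5, "jmp": 6}


def encode_load(rd, imm):
    """Layout: opcode(3) | rd(3) | unsigned immediate(10)."""
    if not 0 <= rd < 8 or not 0 <= imm < (1 << 10):
        raise ValueError("field out of range")
    return (OPCODES["load"] << 13) | (rd << 10) | imm


def encode_reg_reg(op, rs, rd):
    """Layout: opcode(3) | rs(3) | rd(3) | 7 unused bits."""
    if not (0 <= rs < 8 and 0 <= rd < 8):
        raise ValueError("register out of range")
    return (OPCODES[op] << 13) | (rs << 10) | (rd << 7)


def decode_load(word):
    """Split a load word back into (opcode, rd, imm)."""
    return word >> 13, (word >> 10) & 0b111, word & 0x3FF


word = encode_load(0, 10)                      # load r0, 10
print(f"{word:016b}")                          # 0000000000001010
print(decode_load(word))                       # (0, 0, 10)
print(f"{encode_reg_reg('add', 1, 2):016b}")   # 0100010100000000
```

In this particular layout the largest immediate that load can carry is 1023. A design that spends fewer bits on the opcode or register fields, or that treats the immediate as signed, would give a different limit.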

g: The instruction set does not allow for a multiplication operation. Assume we want to use this processor to emulate one which provides a multiplication instruction mul \( r_x, r_y \) which computes the expression \( r_y = r_x \times r_y \). Write an assembly code routine to implement this instruction. You can assume that the jmp and bne instructions can be called with a label, instead of an explicit offset. Note: here we are using the pseudo assembly ISA defined above and not some real assembly language. Therefore, I expect the solution to be provided in pseudocode. (5 points)
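The usual approach to part g is repeated addition. The model below is a hedged sketch of that idea and not a finished answer: it mirrors, line by line, a loop that uses only load, add, cmp, bne and jmp, and it assumes the multiplier is non-negative. The scratch register names rP, rC and rOne in the comments are hypothetical, as is the function name.

```python
def mul_by_repeated_add(x, y):
    """Model of a mul routine built from add, cmp, bne and jmp.

    Assumes y >= 0. Each Python line is annotated with the
    pseudo-assembly instruction it stands for.
    """
    product = 0                                # load rP, 0
    counter = 0                                # load rC, 0
    one = 1                                    # load rOne, 1
    while True:
        # check: cmp rC, rY   (cmp = 1 when counter >= y)
        cmp_flag = 1 if counter >= y else 0
        if cmp_flag == 1:                      # bne body falls through,
            break                              # so: jmp done
        product = x + product                  # body: add rX, rP
        counter = one + counter                #       add rOne, rC
        # jmp check
    return product                             # done: result is in rP


print(mul_by_repeated_add(6, 7))  # 42
print(mul_by_repeated_add(5, 0))  # 0
```

The loop runs once per unit of the multiplier, so the running time grows linearly with y. Copying the final value of the scratch register into the destination register with mov would complete the emulated instruction.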
