Question: Problem 1 (34 pts) Consider a six-stage pipeline (IF, ID, EX, M1, M2, WB) processor where memory operations take two pipeline stages, so load result

Problem 1

(34 pts) Consider a six-stage pipeline (IF, ID, EX, M1, M2, WB) processor where memory operations take two pipeline stages, so load result values are not available until after the M2 (memory) stage is completed. For this problem we will use the following assembly language sequence:

myloop:
1   ADD  r1, r3, r1
2   LW   r1, 0(r1)
3   SW   r2, 0(r1)
4   SUBI r4, r4, 4
5   LW   r3, 0(r4)
6   ADD  r2, r2, r1
7   BNEZ r4, myloop
8   SUB  r1, r3, r1
9   OR   r5, r1, r2
10  ...

Draw the pipeline timing diagram using the table below for the code sequence above.

Start with the first instruction of the loop (line 1) and draw the timing diagram through one full loop iteration including the first ADD of the second loop iteration.

Assume branches are resolved in the execution stage (EX) and branches are NOT taken.

Show all stalls with an X and all flushes with an F.

Show with arrows the forwarding paths needed to help eliminate stalls, and fill in the forwarding table with the source and destination pipeline registers.
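The stall count implied by the two-stage memory access can be sketched as a small calculation. This is only an illustration under stated assumptions, not the expected answer: it assumes full forwarding, one instruction issued per cycle, and that a consumer needs its operand at the start of the stage named in `needed_in`. The function name `stall_cycles` and its parameters are hypothetical.

```python
# Stage order taken from the problem statement.
STAGES = ["IF", "ID", "EX", "M1", "M2", "WB"]


def stall_cycles(ready_after, needed_in, distance):
    """Stall cycles for a consumer `distance` instructions behind its producer.

    ready_after: stage whose completion makes the producer's result available
    needed_in:   stage at whose start the consumer needs the operand
    distance:    1 for the next instruction, 2 for the one after, and so on

    Assumption: full forwarding and no other hazards changing the spacing.
    """
    ready = STAGES.index(ready_after)
    needed = STAGES.index(needed_in)
    return max(0, ready + 1 - needed - distance)


# A load feeding the very next instruction's EX stage.
load_use = stall_cycles("M2", "EX", distance=1)
# An ALU result feeding the very next instruction's EX stage.
alu_use = stall_cycles("EX", "EX", distance=1)
print(load_use, alu_use)
```

Under these assumptions a load followed immediately by a dependent instruction costs two stall cycles, while back-to-back ALU instructions cost none. Whether a store's data operand is needed in EX or only later in M1 depends on the forwarding paths the design provides, so that case may differ.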

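Cycle (time), left to right:

| Instruction | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 | C14 | C15 | C16 | C17 | C18 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ADD |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| LW |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| SW |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| SUBI |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| LW |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| ADD |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| BNEZ |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| . |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |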

| Source Instruction (e.g. ADD r1, r3, r1) | Source Location (e.g. ID/EX) | Destination Instruction | Destination Location (e.g. ID/EX) | Due to Register (e.g. R1) |
| --- | --- | --- | --- | --- |
|  |  |  |  |  |

b) Now assume that the architecture uses branch delay slots to try to eliminate all of the branch penalty, and that the compiler rearranges the code for the delay slots as follows:

0   myloop:
1   ADD  r1, r3, r1
2   LW   r1, 0(r1)
3   SW   r2, 0(r1)
4   SUBI r4, r4, 4
5   BNEZ r4, myloop
6   LW   r3, 0(r4)
7   ADD  r2, r2, r1
8   SUB  r1, r3, r1
9   OR   r5, r1, r2
10  ...

Note that instructions 6 and 7 (lines 6 and 7) are delay slot instructions which means that they will be executed even when the branch in line 5 is taken. Draw the pipeline timing diagram and fill the forwarding table for this code. Start with the first instruction of the loop (line 1) and draw the timing diagram through one full loop iteration including the first ADD of the second loop iteration.
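The reordering described above can be sanity-checked with a short sketch. It is an illustration only, and it assumes that the number of delay slots equals the number of stages before the one that resolves the branch, and that hoisting a branch above earlier instructions is safe when none of them writes a register the branch tests. The helper names `delay_slot_count` and `can_move_branch_above` are hypothetical.

```python
# Stage order taken from the problem statement.
STAGES = ["IF", "ID", "EX", "M1", "M2", "WB"]


def delay_slot_count(resolve_stage):
    """Instructions fetched after the branch before its outcome is known.

    Assumption: the branch outcome and target are available at the end of
    `resolve_stage`, and fetch redirects in the following cycle.
    """
    return STAGES.index(resolve_stage)


def can_move_branch_above(branch_reads, moved_writes):
    """True if no instruction moved into a delay slot writes a branch operand."""
    return all(branch_reads.isdisjoint(writes) for writes in moved_writes)


slots = delay_slot_count("EX")
# BNEZ r4 is hoisted above LW r3, 0(r4) (writes r3) and ADD r2, r2, r1 (writes r2).
safe = can_move_branch_above({"r4"}, [{"r3"}, {"r2"}])
print(slots, safe)
```

With the branch resolved in EX this gives two slots, which matches the two delay slot instructions on lines 6 and 7, and neither of them writes r4, the register tested by BNEZ.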

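Cycle (time), left to right:

| Instruction | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 | C14 | C15 | C16 | C17 | C18 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ADD |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| LW |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| SW |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| SUBI |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| BNEZ |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| LW |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| ADD |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |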

| Source Instruction (e.g. ADD r1, r3, r1) | Source Location (e.g. ID/EX) | Destination Instruction | Destination Location (e.g. ID/EX) | Due to Register (e.g. R1) |
| --- | --- | --- | --- | --- |
|  |  |  |  |  |

c) Fill the table with RAW, WAW, and WAR dependencies for the above code example in (b), where delay slot instructions are used.

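RAW

| From Instruction (Read) | To Instruction (Write) | Due to Register |
| --- | --- | --- |
|  |  |  |

WAR

| From Instruction (Write) | To Instruction (Read) | Due to Register |
| --- | --- | --- |
|  |  |  |

WAW

| From Instruction (Write) | To Instruction (Write) | Due to Register |
| --- | --- | --- |
|  |  |  |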
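One way to cross-check a hand-filled dependency table is to enumerate register dependencies mechanically. The sketch below is an illustration, not the expected answer: it treats lines 1 to 9 of the part (b) code as straight-line code, considers registers only (no dependencies through memory), and reports every earlier/later pair that shares a register, even when an intervening write would hide the dependency. The read and write sets are my reading of each instruction, and the names `PROGRAM` and `register_dependencies` are hypothetical.

```python
from itertools import combinations

# (line number, text, registers written, registers read), in part (b) order.
PROGRAM = [
    (1, "ADD r1, r3, r1", {"r1"}, {"r3", "r1"}),
    (2, "LW r1, 0(r1)", {"r1"}, {"r1"}),
    (3, "SW r2, 0(r1)", set(), {"r2", "r1"}),  # a store writes memory only
    (4, "SUBI r4, r4, 4", {"r4"}, {"r4"}),
    (5, "BNEZ r4, myloop", set(), {"r4"}),
    (6, "LW r3, 0(r4)", {"r3"}, {"r4"}),
    (7, "ADD r2, r2, r1", {"r2"}, {"r2", "r1"}),
    (8, "SUB r1, r3, r1", {"r1"}, {"r3", "r1"}),
    (9, "OR r5, r1, r2", {"r5"}, {"r1", "r2"}),
]


def register_dependencies(program):
    """Return {'RAW': [...], 'WAR': [...], 'WAW': [...]} of (earlier, later, reg)."""
    deps = {"RAW": [], "WAR": [], "WAW": []}
    # combinations() preserves list order, so the first item is always earlier.
    for (i, _, w_i, r_i), (j, _, w_j, r_j) in combinations(program, 2):
        for reg in sorted(w_i & r_j):  # earlier writes, later reads
            deps["RAW"].append((i, j, reg))
        for reg in sorted(r_i & w_j):  # earlier reads, later writes
            deps["WAR"].append((i, j, reg))
        for reg in sorted(w_i & w_j):  # both write the same register
            deps["WAW"].append((i, j, reg))
    return deps


deps = register_dependencies(PROGRAM)
for kind, entries in deps.items():
    for earlier, later, reg in entries:
        print(f"{kind}: line {earlier} -> line {later} on {reg}")
```

Because the listing is exhaustive, a course that counts only the nearest producer or consumer for each register would keep a subset of these pairs.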
