Question: You are given the following DAXPY loop, which computes the operation Y = aX + Y for

You are given the following DAXPY loop, which computes the operation Y = aX + Y for a vector of length 100, where a is a scalar and X and Y are vectors. The loop is implemented with the following assembly code:
      DADDIU R4, R1, #800   ; R4 = upper bound for X
foo:  L.D    F2, 0(R1)      ; F2 = X(i)
      MUL.D  F4, F2, F0     ; F4 = a * X(i)
      L.D    F6, 0(R2)      ; F6 = Y(i)
      ADD.D  F6, F4, F6     ; F6 = a * X(i) + Y(i)
      S.D    0(R2), F6      ; Store Y(i)
      DADDIU R1, R1, #8     ; Increment X index
      DADDIU R2, R2, #8     ; Increment Y index
      DSLTU  R3, R1, R4     ; Test: continue loop?
      BNEZ   R3, foo        ; Loop if needed
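For readers who prefer a higher-level view, the assembly loop corresponds roughly to the C sketch below. The function name `daxpy` and its signature are illustrative assumptions, not part of the question. The registers map as follows: F0 holds a, R1 walks X, and R2 walks Y. Each element is an 8-byte double, which is why the upper bound is 800 bytes past the start of X for 100 elements.

```c
#include <stddef.h>

/* Reference sketch of DAXPY: y[i] = a * x[i] + y[i] for i in [0, n).
   The name and signature are hypothetical; the question only gives assembly. */
void daxpy(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        /* One iteration = L.D X(i), MUL.D, L.D Y(i), ADD.D, S.D Y(i) */
        y[i] = a * x[i] + y[i];
    }
}
```

Calling `daxpy(100, a, X, Y)` would match the vector length in the question. Unrolling in the assembly version amounts to processing several consecutive values of `i` per pass, with a single pair of pointer increments and one test and branch.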
Assume the following:
The functional unit latencies are given as:
FP multiply: 6 cycles
FP add: 3 cycles
FP store: 2 cycles
Integer operations and loads: 2 cycles
Results are fully bypassed.
The branch has a 1-cycle delay and resolves in the ID stage.
Tasks:
Unroll the loop as many times as necessary to schedule it without stalls, collapsing the loop overhead instructions.
Provide the instruction schedule.
Determine the execution time per element of the result.
