6.16.4: [10].
Consider the following piece of C code:
for (j = 2; j <= 1000; j++) D[j] = D[j-1] + D[j-2];
The ARMv8 code corresponding to the above fragment is:
        MOV   X10, #8000
        ADD   X2, X0, X10
        ADDI  X1, X0, #16
LOOP:   LDUR  D0, [X1, #-16]
        LDUR  D2, [X1, #-8]
        FADDD D4, D0, D2
        STUR  D4, [X1, #0]
        ADDI  X1, X1, #8
        CMP   X1, X2
        B.LE  LOOP
The latency of an instruction is the number of cycles that must come between that instruction and an instruction using the result. Assume floating point instructions have the following associated latencies (in cycles):
(a) How many cycles does it take to execute this code?
(b) When an instruction in a later iteration of a loop depends upon a data value produced in an earlier iteration of the same loop, we say that there is a loop-carried dependence between iterations of the loop. Identify the loop-carried dependences in the above code. Identify the dependent program variable and assembly-level registers. You can ignore the loop induction variable j.
(c) Rewrite the code by using registers to carry the data between iterations of the loop (as opposed to storing and re-loading the data from main memory). Show where this code stalls and calculate the number of cycles required to execute.
