Question: 1. Loop Unrolling [44 marks]

Consider the following loop:

    loop: l.d   f4,0(r1)     l1
          l.d   f6,0(r2)     l2
          mul.d f6,f6,f2     m2
          add.d f4,f4,f6     a1
          s.d   f4,0(r1)     s1
          subi  r1,r1,8      sub1
          subi  r2,r2,8      sub2
          bnez  r1,loop      br

Note: Our convention is that FP arithmetics have 4 x-boxes.

a) [6 marks] Using the names 'l1' to 's1' for the first six instructions in the body of this loop, draw the flow-dependence graph for just these instructions. Label each arrow with the dependence gap between the producer and the consumer.

In what follows, focus on three flow-dependence types: i) FP arith to FP arith, ii) FP arith to FP store, and iii) FP load to FP arith. Denote the number of m-boxes in memory references by '#m', and the number of x-boxes in FP arithmetics by '#x'.

b) [6 marks] For each of the three designated flow-dependence types, indicate the number of stalls in adjacent producer-consumer pairs as functions of '#m' and '#x'.

c) [10 marks] Suppose #m = 1 and #x = 4. How many stalls occur in one iteration of the loop if it is executed exactly as written?

d) [10 marks] Unroll the loop twice. If one reschedules the unrolled loop optimally, how many stalls are left? (Keep the branch as the last instruction, but feel free to pull up any 'subi' instructions for gap padding. Show the rescheduled code using the _short_ names.)

e) [8 marks] Five instructions in the original code contain immediates. After unrolling the loop twice, some immediates may change. Show all instructions with correct immediates in the unrolled code.

f) [4 marks] According to the strict definitions of data dependences, there would be name dependences in the unrolled loop in the absence of register renaming. Do you think register renaming is really required? Explain.
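To make the effect of unrolling on the immediates in part e) concrete, here is a minimal Python sketch that interprets the loop sequentially, once as written and once with two copies of the body per iteration. Several things are assumptions, not part of the question: "unroll twice" is read as two copies of the body, the element count is taken to be even, memory is modelled as a dict keyed by byte address, and the function names `run_original` and `run_unrolled` are hypothetical. The sketch says nothing about stalls or pipeline timing; it only checks that the adjusted offsets and decrements compute the same result.

```python
def run_original(mem, r1, r2, f2):
    """Interpret the loop as written; returns a modified copy of mem."""
    mem = dict(mem)
    while True:
        f4 = mem[r1 + 0]      # l1:   l.d   f4,0(r1)
        f6 = mem[r2 + 0]      # l2:   l.d   f6,0(r2)
        f6 = f6 * f2          # m2:   mul.d f6,f6,f2
        f4 = f4 + f6          # a1:   add.d f4,f4,f6
        mem[r1 + 0] = f4      # s1:   s.d   f4,0(r1)
        r1 -= 8               # sub1: subi  r1,r1,8
        r2 -= 8               # sub2: subi  r2,r2,8
        if r1 == 0:           # br:   bnez  r1,loop
            return mem


def run_unrolled(mem, r1, r2, f2):
    """Two copies of the body per iteration (assumes an even element count).

    The second copy uses offset -8 instead of 0, and the two subi
    instructions of each register are merged into one with immediate 16.
    Registers f4 and f6 are reused in the second copy, without renaming.
    """
    mem = dict(mem)
    while True:
        # first copy: offsets unchanged
        f4 = mem[r1 + 0]
        f6 = mem[r2 + 0]
        f6 = f6 * f2
        f4 = f4 + f6
        mem[r1 + 0] = f4
        # second copy: offsets become -8
        f4 = mem[r1 - 8]
        f6 = mem[r2 - 8]
        f6 = f6 * f2
        f4 = f4 + f6
        mem[r1 - 8] = f4
        r1 -= 16              # merged subi r1,r1,16
        r2 -= 16              # merged subi r2,r2,16
        if r1 == 0:           # single branch kept as the last instruction
            return mem


# Example layout (hypothetical): four doubles at bytes 8..32 and 108..132.
memory = {8: 1.0, 16: 2.0, 24: 3.0, 32: 4.0,
          108: 10.0, 116: 20.0, 124: 30.0, 132: 40.0}
assert run_original(memory, 32, 132, 2.0) == run_unrolled(memory, 32, 132, 2.0)
```

In this strictly sequential interpretation, reusing f4 and f6 in the second copy does not change the result. Whether that reuse limits rescheduling is the point of part f).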
