Question:

3.3 [15] Consider a multiple-issue design. Suppose you have two execution pipelines, each capable of beginning execution of one instruction per cycle, and enough fetch/decode bandwidth in the front end so that it will not stall your execution. Assume results can be immediately forwarded from one execution unit to another, or to itself. Further assume that the only reason an execution pipeline would stall is to observe a true data dependency. Now how many cycles does the loop require?

3.4 [10] In the multiple-issue design of Exercise 3.3, you may have recognized some subtle issues. Even though the two pipelines have the exact same instruction repertoire, they are neither identical nor interchangeable, because there is an implicit ordering between them that must reflect the ordering of the instructions in the original program. If instruction N+1 begins execution in Execution Pipe 1 at the same time that instruction N begins in Pipe 0, and N+1 happens to require a shorter execution latency than N, then N+1 will complete before N (even though program ordering would have implied otherwise). Recite at least two reasons why that could be hazardous and will require special considerations in the microarchitecture. Give an example of two instructions from the code in Figure 3.47 that demonstrate this hazard.

3.5 [20] Reorder the instructions to improve performance of the code in Figure 3.47. Assume the two-pipe machine in Exercise 3.3 and that the out-of-order completion issues of Exercise 3.4 have been dealt with successfully. Just worry about observing true data dependences and functional unit latencies for now. How many cycles does your reordered code take?

3.6 [10/10/10] Every cycle that does not initiate a new operation in a pipe is a lost opportunity, in the sense that your hardware is not living up to its potential.

a. [10] In your reordered code from Exercise 3.5, what fraction of all cycles, counting both pipes, were wasted (did not initiate a new op)?

b. [10] Loop unrolling is one standard compiler technique for finding more parallelism in code, in order to minimize the lost opportunities for performance. Hand-unroll two iterations of the loop in your reordered code from Exercise 3.5.

c. [10] What speedup did you obtain? (For this exercise, just color the N+1 iteration's instructions green to distinguish them from the Nth iteration's instructions; if you were actually unrolling the loop, you would have to reassign registers to prevent collisions between the iterations.)
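The exercises refer to the instruction sequence and latencies of Figure 3.47, which is not reproduced on this page. As a way to sanity-check the kind of cycle accounting Exercises 3.3 and 3.6(a) ask for, here is a minimal sketch of an in-order, dual-issue scheduler that stalls only for true (RAW) dependences and assumes full forwarding. The instruction list, registers, and latency values in it are hypothetical placeholders, not the figure's actual code; the latency convention assumed is "a consumer may issue `latency` cycles after its producer".

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Instr:
    name: str
    dest: Optional[str]        # register written (None for stores/branches)
    srcs: Tuple[str, ...]      # registers read
    latency: int               # cycles before a consumer may issue

# HYPOTHETICAL stand-in for the loop body in Figure 3.47 -- substitute the
# real opcodes, registers, and latencies from the figure before trusting
# any numbers this prints.
code = [
    Instr("fld   f2,0(x1)",  "f2", ("x1",),      2),
    Instr("fmul  f4,f2,f0",  "f4", ("f2", "f0"), 4),
    Instr("fadd  f6,f4,f2",  "f6", ("f4", "f2"), 3),
    Instr("fsd   f6,0(x1)",  None, ("f6", "x1"), 1),
    Instr("addi  x1,x1,-8",  "x1", ("x1",),      1),
]

def schedule(code, issue_width=2):
    """In-order issue of up to issue_width instructions per cycle, stalling
    only when a source operand is not yet available (a true dependence).
    Full forwarding is modeled by making a result usable `latency` cycles
    after its producer issues."""
    ready = {}               # register -> first cycle its value is usable
    cycle, idx, issued = 1, 0, 0
    while idx < len(code):
        slot = 0
        while slot < issue_width and idx < len(code):
            instr = code[idx]
            if any(ready.get(r, 0) > cycle for r in instr.srcs):
                break        # RAW hazard: this slot and later slots stall
            if instr.dest is not None:
                ready[instr.dest] = cycle + instr.latency
            print(f"cycle {cycle:2d}, pipe {slot}: {instr.name}")
            idx += 1
            slot += 1
            issued += 1
        cycle += 1
    total_cycles = cycle - 1
    wasted = 1 - issued / (issue_width * total_cycles)
    return total_cycles, wasted

cycles, wasted = schedule(code)
print(f"{cycles} cycles for one pass; wasted issue slots: {wasted:.0%}")
```

Running it prints one possible issue schedule, the cycle count (the quantity asked for in 3.3 and 3.5), and the fraction of empty issue slots (the quantity asked for in 3.6(a)). Swapping in the real instructions from Figure 3.47 lets you cross-check a hand-derived schedule.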
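For Exercise 3.6(b) and (c), a toy illustration of the idea may help; the example below is not the Figure 3.47 loop, and the cycle counts at the end are placeholders for the numbers you derive yourself. It only shows the shape of the transformation: two iterations' worth of independent work are placed side by side so a dual-issue machine can overlap them, and the speedup is computed per original iteration.

```python
def saxpy_rolled(a, x, y):
    # Original loop: one element per iteration.
    for i in range(len(x)):
        y[i] = a * x[i] + y[i]

def saxpy_unrolled_by_2(a, x, y):
    # Unrolled by 2: iterations i and i+1 are independent, so their
    # operations can be interleaved; a real compiler (or a hand-unrolled
    # assembly version) would also rename temporaries, as the exercise
    # notes, to avoid register collisions between the two iterations.
    n = len(x)
    for i in range(0, n - 1, 2):
        t0 = a * x[i]     + y[i]
        t1 = a * x[i + 1] + y[i + 1]
        y[i], y[i + 1] = t0, t1
    if n % 2:                      # leftover element when n is odd
        y[n - 1] = a * x[n - 1] + y[n - 1]

# Speedup accounting for 3.6(c): compare cycles per original iteration.
cycles_per_iter_rolled  = 9    # placeholder: your answer to Exercise 3.5
cycles_for_two_unrolled = 13   # placeholder: cycles for the unrolled body
speedup = (2 * cycles_per_iter_rolled) / cycles_for_two_unrolled
print(f"speedup = {speedup:.2f}x")
```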
