Question:
- Consider a program comprising 62% arithmetic instructions, 16% load instructions, 8% store instructions, and 14% branch instructions. Assume the CPI for all instructions is 2, except for branches, which have a CPI of 3.
- Suppose we consider two alternate improvements to a processor. P1 will execute arithmetic instructions 2 times faster. P2 will execute branches 3 times faster and both load and store instructions 2 times faster. In each case, other instructions are unaffected by the changes. Which is faster, P1 or P2, and by how much?
- Consider running the program on a machine with a large graphics card. When we run the program on this machine, only the arithmetic instructions can be run in parallel on the card; everything else is run sequentially. As the number of stream processors on the GPU goes toward infinity, what is the maximum speedup obtainable on this program?
- In the previous problem, the GPU improvement was applied toward arithmetic instructions. Assuming that you could apply the GPU to any one instruction class, this is still the smartest choice. What Great Idea in Computer Architecture is this an example of?
- Assume we can apply the GPU improvement to both load and store instructions. With infinitely many streaming GPU processors, is it possible to make the program run two times faster? Explain.
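The arithmetic behind these parts can be sketched in a few lines of Python. This is only one way to set up the calculation, not an official solution: it assumes execution time is proportional to the average CPI (same instruction count and clock rate in every scenario), that "N times faster" means the CPI of that class is divided by N, and that the last part applies the GPU to loads and stores only. The names `MIX`, `CPI`, and `average_cpi` are made up for this illustration.

```python
import math

# Instruction mix (fraction of all instructions) and baseline CPI per class,
# taken from the question statement.
MIX = {"arithmetic": 0.62, "load": 0.16, "store": 0.08, "branch": 0.14}
CPI = {"arithmetic": 2.0, "load": 2.0, "store": 2.0, "branch": 3.0}


def average_cpi(speedups=None):
    """Average CPI when each listed class runs `speedups[class]` times faster.

    Assumption: a class that is N times faster has its CPI divided by N.
    Passing math.inf models the limit of infinitely many GPU processors.
    """
    speedups = speedups or {}
    return sum(MIX[k] * CPI[k] / speedups.get(k, 1.0) for k in MIX)


baseline = average_cpi()

# P1: arithmetic 2x faster. P2: branches 3x faster, loads and stores 2x faster.
p1 = average_cpi({"arithmetic": 2})
p2 = average_cpi({"branch": 3, "load": 2, "store": 2})

# Limit of infinite parallelism on arithmetic only (Amdahl's law).
gpu_arith = average_cpi({"arithmetic": math.inf})

# Limit of infinite parallelism on loads and stores only (assumed reading).
gpu_mem = average_cpi({"load": math.inf, "store": math.inf})

print(f"baseline CPI            : {baseline:.2f}")
print(f"P1 speedup over baseline: {baseline / p1:.3f}")
print(f"P2 speedup over baseline: {baseline / p2:.3f}")
print(f"P1 relative to P2       : {p2 / p1:.3f}")
print(f"max speedup, arithmetic : {baseline / gpu_arith:.3f}")
print(f"max speedup, load/store : {baseline / gpu_mem:.3f}")
```

Under these assumptions the baseline CPI is 2.14, P1 comes out ahead of P2 (roughly 1.41x versus 1.32x over the baseline), the arithmetic-only limit is about 2.38x, and the load/store-only limit is about 1.29x, which is below the 2x asked about in the last part. Arithmetic instructions account for the largest share of cycles, which is why speeding them up pays off most.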
