(1) Assume the outcome of branch instruction is correctly predicted. (2) Assume there is an integer...
Question:
(1) Assume the outcome of the branch instruction is correctly predicted.
(2) Assume there is one integer ALU for address calculation and another integer ALU for branches and all other integer operations.
(3) If the first instruction in an issue packet is a branch instruction, only this branch instruction can be issued in this cycle.
(4) Up to two instructions can be committed per cycle.
(5) There are two CDBs.
(6) For loads and stores, EX is for address calculation.
(7) Only show the first two iterations and ignore the addi instruction before the loop.
(8) The functional units (FUs) are pipelined, with the latencies described in the table below.

FU type         Cycles in EX   Number of FUs   Number of reservation stations
Integer         1              2               [not shown]
FP adder        12             1               3
FP multiplier   18             1               [not shown]

Q1-a) With speculation; there are twelve reorder buffer (ROB) entries.
Q1-b) Without speculation;

3.14 [25/25/25] <3.2, 3.7> In this exercise, we look at how software techniques can extract instruction-level parallelism (ILP) in a common vector loop. The following loop is the so-called DAXPY loop (double-precision aX plus Y) and is the central operation in Gaussian elimination. The following code implements the DAXPY operation, Y = aX + Y, for a vector length of 100. Initially, R1 (x1 in the listing) is set to the base address of array X and R2 (x2 in the listing) is set to the base address of Y:

        addi    x4,x1,#800      ; x4 = upper bound for X
foo:    fld     F2,0(x1)        ; (F2) = X(i)
        fmul.d  F4,F2,F0        ; (F4) = a*X(i)
        fld     F6,0(x2)        ; (F6) = Y(i)
        fadd.d  F6,F4,F6        ; (F6) = a*X(i) + Y(i)
        fsd     F6,0(x2)        ; Y(i) = a*X(i) + Y(i)
        addi    x1,x1,#8        ; increment X index
        addi    x2,x2,#8        ; increment Y index
        sltu    x3,x1,x4        ; test: continue loop?
        bnez    x3,foo          ; loop if needed

Case Studies and Exercises by Jason D. Bakos and Robert P. Colwell, p. 275

Assume the functional unit latencies as shown in the following table. Assume a one-cycle delayed branch that resolves in the ID stage. Assume that results are fully bypassed.

Instruction producing result       Instruction using result   Latency in clock cycles
FP multiply                        FP ALU op                  6
FP add                             FP ALU op                  4
FP multiply                        FP store                   [not shown]
FP add                             FP store                   4
Integer operations and all loads   Any                        [not shown]
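
For readers who want to check what the loop computes before scheduling it, here is a minimal Python sketch that mirrors the assembly listing step by step. It is an illustration only and not part of the exercise: the function name `daxpy`, the use of Python lists in place of memory, and the byte-offset bookkeeping are assumptions made for this sketch. Like the assembly, it tests the loop condition at the bottom, so it assumes at least one element.

```python
ELEM_BYTES = 8  # size of one double, matching the #8 increments in the listing


def daxpy(a, x, y):
    """Update y[i] = a*x[i] + y[i] in place, following the listing's structure.

    Hypothetical helper for illustration; assumes len(x) == len(y) >= 1.
    """
    x1 = 0                            # byte offset into X (base address taken as 0)
    x2 = 0                            # byte offset into Y
    x4 = x1 + ELEM_BYTES * len(x)     # addi x4,x1,#800 when len(x) == 100
    while True:
        f2 = x[x1 // ELEM_BYTES]      # fld    F2,0(x1)
        f4 = f2 * a                   # fmul.d F4,F2,F0  (F0 holds a)
        f6 = y[x2 // ELEM_BYTES]      # fld    F6,0(x2)
        f6 = f4 + f6                  # fadd.d F6,F4,F6
        y[x2 // ELEM_BYTES] = f6      # fsd    F6,0(x2)
        x1 += ELEM_BYTES              # addi   x1,x1,#8
        x2 += ELEM_BYTES              # addi   x2,x2,#8
        if not (x1 < x4):             # sltu x3,x1,x4 ; bnez x3,foo
            break
    return y


# Vector length 100, as in the exercise statement.
x = [float(i) for i in range(100)]
y = [1.0] * 100
daxpy(2.0, x, y)
```

The sketch makes the data dependences visible: each iteration's multiply feeds the add, and the add feeds the store, while the two address increments are independent of the floating-point chain.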
Expert Answer:
Q1-a) Correct speculation avoids stalls: the result is computed early, which improves performance. Speculation accu...