This part of our case study will focus on the amount of instruction-level parallelism available to the

Question:

This part of our case study will focus on the amount of instruction-level parallelism available to the run time hardware scheduler under the most favorable execution scenarios (the ideal case). (Later, we will consider less ideal scenarios for the run time hardware scheduler as well as the amount of parallelism available to a compiler scheduler.) For the ideal scenario, assume that the hash table is initially empty. Suppose there are 1024 new data elements, whose values are numbered sequentially from 0 to 1023, so that each goes in its own bucket (this reduces the problem to a matter of updating known array locations!). Figure 3.15 shows the hash table contents after the first three elements have been inserted, according to this "ideal case." Since the value of element[i] is simply i in this ideal case, each element is inserted into its own bucket.
For the purposes of this case study, assume that each line of code in Figure 3.14 takes one execution cycle (its dependence height is 1) and, for the purposes of computing ILP, takes one instruction. These (unrealistic) assumptions are made to greatly simplify bookkeeping in solving the following exercises. And while statements execute on each iteration of their respective loops, to test if the loop should continue. In this ideal case, most of the dependences in the code sequence are relaxed and a high degree of ILP is therefore readily available. We will later examine a general case, in which the realistic dependences in the code segment reduce the amount of parallelism available.
Further suppose that the code is executed on an "ideal" processor with infinite issue width, unlimited renaming, "omniscient" knowledge of memory access disambiguation, branch prediction, and so on, so that the execution of instructions is limited only by data dependence. Consider the following in this context:
a. Describe the data (true, anti, and output) and control dependences that govern the parallelism of this code segment, as seen by a run time hardware scheduler. Indicate only the actual dependences (i.e., ignore dependences between stores and loads that access different addresses, even if a compiler or processor would not realistically determine this). Draw the dynamic dependence graph for six consecutive iterations of the outer loop (for insertion of six elements), under the ideal case. In this dynamic dependence graph, we are identifying data dependences between dynamic instances of instructions: each static instruction in the original program has multiple dynamic instances due to loop execution. The following definitions may help you find the dependences related to each instruction:
• Data true dependence: On the results of which previous instructions does each instruction immediately depend?
• Data anti dependence: Which instructions subsequently write locations read by the instruction?
• Data output dependence: Which instructions subsequently write locations written by the instruction?
• Control dependence: On what previous decisions does the execution of a particular instruction depend (in what case will it be reached)?
b. Assuming the ideal case just described, and using the dynamic dependence graph you just constructed, how many instructions are executed, and in how many cycles?
c. What is the average level of ILP available during the execution of the for loop?
d. In part (c) we considered the maximum parallelism achievable by a run-time hardware scheduler using the code as written. How could a compiler increase the available parallelism, assuming that the compiler knows that it is dealing with the ideal case. Think about what is the primary constraint that prevents executing more iterations at once in the ideal case. How can the loop be restructured to relax that constraint?
e. For simplicity, assume that only variables i, hash_index, ptrCurr, and ptrUpdate need to occupy registers. Assuming general renaming, how many registers are necessary to achieve the maximum achievable parallelism in part (b)?
f. Assume that in your answer to part (a) there are 7 instructions in each iteration. Now, assuming a consistent steady-state schedule of the instructions in the example and an issue rate of 3 instructions per cycle, how is execution time affected?
g. Finally, calculate the minimal instruction window size needed to achieve the maximal level of parallelism?

Fantastic news! We've Found the answer you've been seeking!

Step by Step Answer:

Answer rating: 50% (12 reviews)

a Figure L20 shows the dependence graph for the C code in Figure 314 Each node in Figure L20 corresponds to a line of C statement in Figure 314 Each node 6 in Figure L20 starts an iteration of the for ...View the full answer

Answered By

Ashington Waweru

I am a lecturer, research writer and also a qualified financial analyst and accountant. I am qualified and articulate in many disciplines including English, Accounting, Finance, Quantitative spreadsheet analysis, Economics, and Statistics. I am an expert with sixteen years of experience in online industry-related work. I have a master's in business administration and a bachelor’s degree in education, accounting, and economics options. I am a writer and proofreading expert with sixteen years of experience in online writing, proofreading, and text editing. I have vast knowledge and experience in writing techniques and styles such as APA, ASA, MLA, Chicago, Turabian, IEEE, and many others. I am also an online blogger and research writer with sixteen years of writing and proofreading articles and reports. I have written many scripts and articles for blogs, and I also specialize in search engine I have sixteen years of experience in Excel data entry, Excel data analysis, R-studio quantitative analysis, SPSS quantitative analysis, research writing, and proofreading articles and reports. I will deliver the highest quality online and offline Excel, R, SPSS, and other spreadsheet solutions within your operational deadlines. I have also compiled many original Excel quantitative and text spreadsheets which solve client’s problems in my research writing career. I have extensive enterprise resource planning accounting, financial modeling, financial reporting, and company analysis: customer relationship management, enterprise resource planning, financial accounting projects, and corporate finance. I am articulate in psychology, engineering, nursing, counseling, project management, accounting, finance, quantitative spreadsheet analysis, statistical and economic analysis, among many other industry fields and academic disciplines. I work to solve problems and provide accurate and credible solutions and research reports in all industries in the global economy. I have taught and conducted masters and Ph.D. thesis research for specialists in Quantitative finance, Financial Accounting, Actuarial science, Macroeconomics, Microeconomics, Risk Management, Managerial Economics, Engineering Economics, Financial economics, Taxation and many other disciplines including water engineering, psychology, e-commerce, mechanical engineering, leadership and many others. I have developed many courses on online websites like Teachable and Thinkific. I also developed an accounting reporting automation software project for Utafiti sacco located at ILRI Uthiru Kenya when I was working there in year 2001. I am a mature, self-motivated worker who delivers high-quality, on-time reports which solve client’s problems accurately. I have written many academic and professional industry research papers and tutored many clients from college to university undergraduate, master's and Ph.D. students, and corporate professionals. I anticipate your hiring me. I know I will deliver the highest quality work you will find anywhere to award me your project work. Please note that I am looking for a long-term work relationship with you. I look forward to you delivering the best service to you.

3.00+ 2+ Reviews 10+ Question Solved

Related Book For book-img-for-question

Computer Architecture A Quantitative Approach

ISBN: 978-0123704900

4th edition

Authors: John L. Hennessy, David A. Patterson

See More Books

Question Posted: Apr 30, 2016 12:18 PM

See More Questions

This part of our case study will focus on the amount of instruction-level parallelism available to the

Question:

Step by Step Answer:

a Figure L20 shows the dependence graph for the C code in Figure 314 Each node in Figure L20 corresponds to a line of C statement in Figure 314 Each node 6 in Figure L20 starts an iteration of the for ...View the full answer

Computer Architecture A Quantitative Approach

Students also viewed these Computer Sciences questions