Computer Organization and Architecture: Designing for Performance, 8th Edition, William Stallings - Solutions

When out-of-order completion is used in a superscalar processor, resumption of execution after interrupt processing is complicated, because the exceptional condition may have been detected as an instruction that produced its result out of order. The program cannot be restarted at the instruction
Consider the following sequence of instructions, where the syntax consists of an opcode followed by the destination register followed by one or two source registers: Assume the use of a four-stage pipeline: fetch, decode/issue, execute, write back. Assume that all pipeline stages take one clock
a. Identify the write-read, write-write, and read-write dependencies in the following instruction sequence: b. Rename the registers from part (a) to prevent dependency problems. Identify references to initial register values using the subscript "a" to the register reference.
Consider the "in-order-issue/in-order-completion" execution sequence shown in Figure 14.13. a. Identify the most likely reason why I2 could not enter the execute stage until the fourth cycle. Will "in-order issue/out-of-order completion" or "out-of-order issue/out-of-order completion" fix this? If
Figure 14.14 shows an example of a superscalar processor organization. The processor can issue two instructions per cycle if there is no resource conflict and no data dependence problem. There are essentially two pipelines, with four processing stages (fetch, decode, execute, and store). Each
Figure 14.15 is from a paper on superscalar design. Explain the three parts of the figure, and define w, x, y, and z. (Figure 14.15: Figure for Problem 14.7)
Yeh's dynamic branch prediction algorithm, used on the Pentium 4, is a two-level branch prediction algorithm. The first level is the history of the last n branches. The second level is the branch behavior of the last s occurrences of that unique pattern of the last n branches. For each conditional
Explain the distinction between the written sequence and the time sequence of an instruction.
What is the overall function of a processor's control unit?
Outline a three-step process that leads to a characterization of the control unit.
What basic tasks does a control unit perform?
Provide a typical list of the inputs and outputs of a control unit.
Your ALU can add its two input registers, and it can logically complement the bits of either input register, but it cannot subtract. Numbers are to be stored in two's complement representation. List the micro-operations your control unit must perform to cause a subtraction.
Show the micro-operations and control signals in the same fashion as Table 15.1 for the processor in Figure 15.5 for the following instructions: • Load Accumulator • Store Accumulator • Add to Accumulator • AND to Accumulator • Jump • Jump
Assume that propagation delay along the bus and through the ALU of Figure 15.6 are 20 and 100 ns, respectively. The time required for a register to copy data from the bus is 10 ns. What is the time that must be allowed for a. Transferring data from one register to another? b. Incrementing the program
Write the sequence of micro-operations required for the bus structure of Figure 15.6 to add a number to the AC when the number is a. An immediate operand b. A direct-address operand c. An indirect-address operand
A stack is implemented as shown in Figure 10.14. Show the sequence of micro-operations for a. Popping b. Pushing the stack
What is the difference between a hardwired implementation and a microprogrammed implementation of a control unit?
How is a horizontal microinstruction interpreted?
What is the difference between horizontal and vertical microinstructions?
What is the difference between packed and unpacked microinstructions?
What is the difference between functional and resource encoding?
Assume a microinstruction set that includes a microinstruction with the following symbolic form: where AC0 is the sign bit of the accumulator and are the first seven bits of the microinstruction. Using this microinstruction, write a microprogram that implements a Branch Register Minus (BRM) machine
A simple processor has four major phases to its instruction cycle: fetch, indirect, execute, and interrupt. Two 1-bit flags designate the current phase in a hardwired implementation. a. Why are these flags needed? b. Why are they not needed in a microprogrammed control unit?
Consider the control unit of Figure 16.7. Assume that the control memory is 24 bits wide. The control portion of the microinstruction format is divided into two fields. A micro-operation field of 13 bits specifies the micro-operations to be performed. An address selection field specifies a
How can unconditional branching be done under the circumstances of the previous problem? How can branching be avoided; that is, describe a microinstruction that does not specify any branch, conditional or unconditional.
We wish to provide 8 control words for each machine instruction routine. Machine instruction opcodes have 5 bits, and control memory has 1024 words. Suggest a mapping from the instruction register to the control address register.
A processor has 16 registers, an ALU with 16 logic and 16 arithmetic functions, and a shifter with 8 operations, all connected by an internal processor bus. Design a microinstruction format to specify the various micro-operations for the processor.
List and briefly define three types of computer system organization.
What are the chief characteristics of an SMP?
What are some of the potential advantages of an SMP compared with a uniprocessor?
What are some of the key OS design issues for an SMP?
What is the difference between software and hardware cache coherent schemes?
What is the meaning of each of the four states in the MESI protocol?
What are some of the key benefits of clustering?
What is the difference between failover and failback?
What are the differences among UMA, NUMA, and CC-NUMA?
Some of the diagrams show horizontal rows that are partially filled. In other cases, there are rows that are completely blank. These represent two different types of loss of efficiency. Explain.
Consider the pipeline depiction in Figure 12.13b, which is redrawn in Figure 17.25a, with the fetch and decode stages ignored, to represent the execution of thread A. Figure 17.25b illustrates the execution of a separate thread B. In both cases, a simple pipelined processor is used. a. Show an
Produce a vectorized version of the following program:
An application program is executed on a nine-computer cluster. A benchmark program took time T on this cluster. Further, it was found that 25% of T was time in which the application was running simultaneously on all nine computers. The remaining time, the application had to run on a single
The following FORTRAN program is to be executed on a computer, and a parallel version is to be executed on a 32-computer cluster. Suppose lines 2 and 4 each take two machine cycle times, including all processor and memory-access activities. Ignore the overhead caused by the software loop control
Consider the following two versions of a program to add two vectors: a. The program on the left executes on a uniprocessor. Suppose each line of code L2, L4, and L6 takes one processor clock cycle to execute. For simplicity, ignore the time required for the other lines of code. Initially all arrays
A multiprocessor with eight processors has 20 attached tape drives. There are a large number of jobs submitted to the system that each require a maximum of four tape drives to complete execution. Assume that each job starts running with only three tape drives for a long period before requiring the
Can you foresee any problem with the write-once cache approach on bus-based multiprocessors? If so, suggest a solution.
Consider a situation in which two processors in an SMP configuration, over time, require access to the same line of data from main memory. Both processors have a cache and use the MESI protocol. Initially, both caches have an invalid copy of the line. Figure 17.22 depicts the consequence of a read
Figure 17.23 shows the state diagrams of two possible cache coherence protocols. Deduce and explain each protocol, and compare each to MESI. (Figure 17.23: Two Cache Coherence Protocols)
Consider an SMP with both L1 and L2 caches using the MESI protocol. As explained in Section 17.3, one of four states is associated with each line in the L2 cache. Are all four states also needed for each line in the L1 cache? If so, why? If not, explain which state or states can be eliminated.
An earlier version of the IBM mainframe, the S/390 G4, used three levels of cache. As with the z990, only the first level was on the processor chip [called the processor unit (PU)]. The L2 cache was also similar to the z990. An L3 cache was on a separate chip that acted as a memory controller, and
What organizational alternative is suggested by each of the illustrations in Figure 17.24? (Figure 17.24: Diagram for Problem 18.9)
Summarize the differences among simple instruction pipelining, superscalar, and simultaneous multithreading.
Give several reasons for the choice by designers to move to a multicore organization rather than increase parallelism within a single processor.
List some examples of applications that benefit directly from the ability to scale throughput with the number of cores.
List some advantages of a shared L2 cache among cores compared to separate dedicated L2 caches for each core.
Consider the following problem. A designer has available a chip and decided what fraction of the chip will be devoted to cache memory (L1, L2, L3). The remainder of the chip can be devoted to a single complex superscalar and/or SMT core or multiple somewhat simpler cores. Define the following
Convert the following binary numbers to their decimal equivalents: a. 001100 b. 000011 c. 011100 d. 111100 e. 101010
Convert the following hexadecimal numbers to their binary equivalents: a. E b. 1C c. A64 d. 1F.C e. 239.4
Equations (19.1) and (19.2) define the representation of numbers in base 10 and base 2, respectively. In general, for the representation in base g of $X = \{\ldots x_2 x_1 x_0 . x_{-1} x_{-2} x_{-3} \ldots\}$, the value of X is $X = \sum_i x_i \times g^i$. Thus, 65 in base 7 is $(6 \times 7^1) + (5 \times 7^0) = 47$. Count from one
Perform the indicated base conversions: a. $54_8$ to base 5 b. $312_4$ to base 7 c. $520_6$ to base 7 d. $12212_3$ to base 9
What generalizations can you draw about converting a number from one base to a power of that base, e.g., from base 3 to base 9 ($3^2$) or from base 2 to base 4 ($2^2$) or base 8 ($2^3$)?
Convert the following decimal numbers to their binary equivalents: a. 64 b. 100 c. 111 d. 145 e. 255
Convert the following hexadecimal numbers to their decimal equivalents: a. C b. 9F c. D52 d. 67E e. ABCD
Convert the following hexadecimal numbers to their decimal equivalents: a. F.4 b. D3.E c. 1111.1 d. 888.8 e. EBA.C
Convert the following decimal numbers to their hexadecimal equivalents: a. 16 b. 80 c. 2560 d. 3000 e. 62,500
Convert the following decimal numbers to their hexadecimal equivalents: a. 204.125 b. 255.875 c. 631.25 d. 10000.00390625
Construct a truth table for the following Boolean expressions: a. b. c. d.
The Gray code is a binary code for integers. It differs from the ordinary binary representation in that there is just a single bit change between the representations of any two numbers. This is useful for applications such as counters or analog-to-digital converters where a sequence of numbers is
Design a 5 × 32 decoder using four 3 × 8 decoders (with enable inputs) and one 2 × 4 decoder.
Consider Figure 20.20. Assume that each gate produces a delay of 10 ns. Thus, the sum output is valid after 30 ns and the carry output after 20 ns. What is the total add time for a 32-bit adder a. Implemented without carry lookahead, as in Figure 20.19? b. Implemented with carry lookahead and
An alternative form of the S-R latch has the same structure as Figure 20.22 but uses NAND gates instead of NOR gates. a. Redo Table 20.10a and 20.10b for S-R latch implemented with NAND gates. b. Complete the following table, similar to Table 20.10c
Consider the graphic symbol for the S-R flip-flop in Figure 20.27. Add additional lines to depict a D flip-flop wired from the S-R flip-flop. (Figure 20.27: Basic Flip-Flops)
Show the structure of a PLA with three inputs (C, B, A) and four outputs (O0, O1, O2, O3), with the outputs defined as follows:
An interesting application of a PLA is conversion from the old, obsolete punched cards character codes to ASCII codes. The standard punched cards that were so popular with computers in the past had 12 rows and 80 columns where holes could be punched. Each column corresponded to one character, so
Simplify the following expressions according to the commutative law: a. b. A · B + A · C + B · A c. (L · M · N)(A · B)(C · D · E)(M · N · L) d. F · (K + R) + S · V + W
Simplify the following expressions: a. A = S · T + V · W + R · S · T b. A = T · U · V + X · Y + Y c. A = F · (E + F + G) d. A = (P · Q + R + S · T)T · S e. f. g. A = (B · E
A combinational circuit is used to control a seven-segment display of decimal digits, as shown in Figure 20.36. The circuit has four inputs, which provide the four-bit code used in packed decimal representation ($0_{10}$ = 0000, ..., $9_{10}$ = 1001). The seven outputs define which segments will be
Design an 8-to-1 multiplexer.
Define predication and predicated execution.
Define control speculation.
What is the purpose of the NaT bit?
Define data speculation.
What is the difference between a hardware pipeline and a software pipeline?
Suppose that an IA-64 opcode accepts three registers as operands and produces one register as a result. What is the maximum number of different operations that can be defined in one major opcode family?
Consider the following source code segment: a. Write a corresponding Pentium assembly code segment. b. Rewrite as an IA-64 assembly code segment using predicated execution techniques.
Consider the following C program fragment dealing with floating-point values: a[i] = p * q; c = a[j]; The compiler cannot establish that i ≠ j, but has reason to believe that it probably is. a. Write an IA-64 program using an advanced load to implement this C program. The floating-point load and
Assume that a stack register frame is created with size equal to SOF = 48. If the size of the local register group is SOL = 16, a. How many output registers (SOO) are there? b. Which registers are in the local and output register groups?
What is the maximum effective number of major opcodes?
The initial Itanium implementation had two M-units and two I-units. Which of the templates in Table 21.3 cannot be paired as two bundles of instructions that could be executed completely in parallel?
An algorithm that can utilize four floating-point instructions per cycle is coded for IA-64. Should instruction groups contain four floating-point operations? What are the consequences if the machine on which the program runs has fewer than four floating-point units?
In Section 21.3, we introduced the following constructs for predicated execution: where crel is a relation, such as eq, ne, etc.; p1, p2, and p3 are predicate registers; a is either a register or an immediate operand; and b is a register operand. Fill in the following truth table:
For the predicated program in Section 21.3, which implements the flowchart of Figure 21.4, indicate a. Those instructions that can be executed in parallel b. Those instructions that can be bundled into the same IA-64 instruction bundle
List some reasons why it is worthwhile to study assembly language programming.
List some disadvantages of assembly language compared to high-level languages.
List some advantages of assembly language compared to high-level languages.
List and briefly define four different kinds of assembly language statements.
What is the difference between a one-pass assembler and a two-pass assembler?
Core War is a programming game introduced to the public in the early 1980s [DEWD84], which was popular for a period of 15 years or so. Core War has four main components: a memory array of 8000 addresses, a simplified assembly language Redcode, an executive program called MARS (an acronym for Memory
Describe the effect of this instruction: cmp eax, 1 Assume that the immediately preceding instruction updated the contents of eax.
Section B.1 includes a C program that calculates the greatest common divisor of two integers. a. Describe the algorithm in words and show how the program does implement the Euclid algorithm approach to calculating the greatest common divisor. b. Add comments to the assembly program of Figure B.3a
a. A 2-pass assembler can handle future symbols and an instruction can therefore use a future symbol as an operand. This is not always true for directives. The EQU directive, for example, cannot use a future symbol. The directive 'A EQU B+1' is easy to execute if B is previously defined, but
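
Several of the number-system questions listed above rest on the positional rule quoted in one of them, where 65 in base 7 evaluates to (6 × 7^1) + (5 × 7^0) = 47. The following is a minimal Python sketch of that rule, offered only as a way to check hand conversions. The names `positional_value` and `_digit` are hypothetical and do not come from the textbook, and the sketch assumes digits 0-9 and A-F with an optional radix point.

```python
DIGITS = "0123456789ABCDEF"  # assumed digit alphabet, enough for bases 2 to 16


def _digit(ch: str, base: int) -> int:
    """Return the numeric value of one digit character, validating it against the base."""
    d = DIGITS.find(ch)
    if d < 0 or d >= base:
        raise ValueError(f"{ch!r} is not a valid digit in base {base}")
    return d


def positional_value(numeral: str, base: int) -> float:
    """Evaluate a numeral such as '65' or '1F.C' as the sum of digit * base**position."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 16")
    whole, _, frac = numeral.upper().partition(".")
    value = 0.0
    # Digits left of the radix point carry weights base**0, base**1, ...
    for position, ch in enumerate(reversed(whole)):
        value += _digit(ch, base) * base ** position
    # Digits right of the radix point carry weights base**-1, base**-2, ...
    for position, ch in enumerate(frac, start=1):
        value += _digit(ch, base) * base ** -position
    return value


if __name__ == "__main__":
    print(positional_value("65", 7))      # the worked example from the question
    print(positional_value("001100", 2))  # a binary numeral from the list
    print(positional_value("1F.C", 16))   # a hexadecimal numeral with a fraction
```

The sketch returns a float so that fractional numerals are handled by the same function. For very long numerals, exact integer or rational arithmetic would be a safer choice.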
