Computer Systems: A Programmer's Perspective, 3rd Global Edition, Randal E. Bryant, David R. O'Hallaron - Solutions
Suppose the order of the third and fourth cases (the two forwarding sources from the memory stage) in the HCL code for d_valA were reversed. Describe the resulting behavior of the rrmovq instruction (line 5) for the following program: 1 2 3 4 5 irmovq $5, %rdx irmovq $0x100,%rsp rmmovq %rdx,
Write HCL code for the signal d_valB, giving the value for source operand valB supplied to pipeline register E.
Our second case in the HCL code for d_valA uses signal e_dstE to see whether to select the ALU output e_valE as the forwarding source. Suppose instead that we use signal E_dstE, the destination register ID in pipeline register E for this selection. Write a Y86-64 program that would give an
In this stage, we can complete the computation of the status code Stat by detecting the case of an invalid address for the data memory. Write HCL code for the signal m_stat.
Write a Y86-64 assembly-language program that causes combination A to arise and determines whether the control logic handles it correctly.
Write a Y86-64 assembly-language program that causes combination B to arise and completes with a halt instruction if the pipeline operates correctly.
Write HCL code for the signal D_stall in the PIPE implementation.
Write HCL code for the signal E_bubble in the PIPE implementation.
Write HCL code for the signal set_cc in the PIPE implementation. This should only occur for OPq instructions, and should consider the effects of program exceptions.
Write HCL code for the signals M_bubble and W_stall in the PIPE implementation. The latter signal requires modifying the exception condition listed in Figure 4.64. Figure 4.64: Condition: Processing ret, Load/use hazard, Mispredicted branch, Exception. Trigger: IRET ∈ {D_icode, E_icode, M_icode}, E_icode
Let us analyze the relative performance of using conditional data transfers versus conditional control transfers for the programs you wrote for Problems 4.5 and 4.6. Assume that we are using these programs to compute the sum of the absolute values of a very long array, and so the overall
The following problem illustrates the way memory aliasing can cause unexpected program behavior. Consider the following procedure to swap two values: /* Swap value x at xp with value y at yp */ void swap(long *xp, long *yp) { *xp = *xp + *yp; /* x+y *yp *xp *xp = *xp = - */ *yp: /*
Later in this chapter we will start with a single function and generate many different variants that preserve the function’s behavior, but with different performance characteristics. For three of these variants, we found that the run times (in clock cycles) can be approximated by the following
Consider the following functions: Assume x equals 10 and y equals 100. Fill in the following table indicating the number of times each of the four functions is called in code fragments A–C: long min(long x, long y) { return x < y ? x : y; } long max(long x, long y) { return x < y ? y : x; } void
When we use gcc to compile combine3 with command-line option -O2, we get code with substantially better CPE performance than with -O1: We achieve performance comparable to that for combine4, except for the case of integer sum, but even it improves significantly. On examining the assembly code
Suppose we wish to write a function to evaluate a polynomial, where a polynomial of degree n is defined to have a set of coefficients a0, a1, a2, . . . , an. For a value x, we evaluate the polynomial by computing a0 + a1x + a2x^2 + . . . + anx^n. This evaluation can be implemented by the following function, having as arguments an
Consider the following function for computing the product of an array of n double-precision numbers. We have unrolled the loop by a factor of 3. For the line labeled “Product computation,” we can use parentheses to create five different associations of the computation, as follows: Assume we run
Suppose the disk file foobar.txt consists of the six ASCII characters foobar. Then what is the output of the following program? #include "csapp.h" int main() { int fd1, fd2; char c; fd1 = Open("foobar.txt", fd2 = Open("foobar.txt", Read(fd1, &c, 1); Read(fd2,
The getaddrinfo and getnameinfo functions subsume the functionality of inet_pton and inet_ntop, respectively, and they provide a higher level of abstraction that is independent of any particular address format. To convince yourself how handy this is, write a version of HOSTINFO (Figure 11.17) that
We can represent a bit pattern of length w = 4 with a single hex digit. For a two’s complement interpretation of these digits, fill in the following table to determine the additive inverses of the digits shown: What do you observe about the bit patterns generated by two’s-complement and unsigned
You are assigned the task of writing code for a function tsub_ok, with arguments x and y, that will return 1 if computing x-y does not cause overflow. Having just written the code for Problem 2.30, you write the following: For what values of x and y will this function give incorrect results? Writing
Your coworker gets impatient with your analysis of the overflow conditions for two’s-complement addition and presents you with the following implementation of tadd_ok: You look at the code and laugh. Explain why. /* Determine whether arguments can be added without overflow */ /* WARNING: This code
Write a function with the following prototype: int tadd_ok(int x, int y); This function should return 1 if arguments x and y can be added without causing overflow. /* Determine whether arguments can be added without overflow */
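One way to approach tadd_ok is sketched below. This is not necessarily the textbook's intended solution: instead of computing the sum and inspecting its sign, the sketch compares the operands against INT_MAX and INT_MIN, which avoids relying on signed overflow (undefined behavior in C).

```c
#include <limits.h>

/* Sketch: return 1 if x + y is representable as an int, 0 otherwise.
 * Assumption: we test against the limits rather than computing x + y,
 * so no signed overflow ever occurs inside this function. */
int tadd_ok(int x, int y)
{
    if (x > 0 && y > 0)
        return x <= INT_MAX - y;   /* positive overflow check */
    if (x < 0 && y < 0)
        return x >= INT_MIN - y;   /* negative overflow check */
    return 1;                      /* mixed signs or a zero operand never overflow */
}
```

For example, tadd_ok(INT_MAX, 1) yields 0, while tadd_ok(INT_MAX, INT_MIN) yields 1.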
Fill in the following table in the style of Figure 2.25. Give the integer values of the 5-bit arguments, the values of both their integer and two's-complement sums, the bit-level representation of the two's-complement sum, and the case from the derivation of Equation 2.13.
We can represent a bit pattern of length w = 4 with a single hex digit. For an unsigned interpretation of these digits, use Equation 2.12 to fill in the following table giving the values and the bit representations (in hex) of the unsigned additive inverses of the digits shown.
Write a function with the following prototype: int uadd_ok(unsigned x, unsigned y); This function should return 1 if arguments x and y can be added without causing overflow. /* Determine whether arguments can be added without overflow */
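A minimal sketch of uadd_ok follows. It relies on the fact that unsigned addition in C wraps modulo 2^w, so an overflowed sum is always smaller than either operand.

```c
/* Sketch: return 1 if x + y does not overflow an unsigned, 0 otherwise.
 * Unsigned arithmetic wraps modulo 2^w, so overflow occurred
 * exactly when the wrapped sum is less than x. */
int uadd_ok(unsigned x, unsigned y)
{
    unsigned sum = x + y;
    return sum >= x;
}
```

For instance, uadd_ok(~0u, 1u) yields 0 because the sum wraps around to 0.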
You are given the assignment of writing a function that determines whether one string is longer than another. You decide to make use of the string library function strlen having the following declaration: When you test this on some sample data, things do not seem to work quite right. You investigate
Suppose we truncate a 4-bit value (represented by hex digits 0 through F) to a 3-bit value (represented as hex digits 0 through 7). Fill in the table below showing the effect of this truncation for some cases, in terms of the unsigned and two's complement interpretations of those bit
Consider the following code that attempts to sum the elements of an array a, where the number of elements is given by parameter length: When run with argument length equal to 0, this code should return 0.0. Instead, it encounters a memory error. Explain why this happens. Show how this code can be
Consider the following C functions: Assume these are executed as a 32-bit program on a machine that uses two’s complement arithmetic. Assume also that right shifts of signed values are performed arithmetically, while right shifts of unsigned values are performed logically. A. Fill in the following
Show that each of the following bit vectors is a two’s-complement representation of −4 by applying Equation 2.3: A. [1100] B. [11100] C. [111100] Observe that the second and third bit vectors can be derived from the first by sign extension. Eq. 2.3: $B2T_w(\vec{x}) \doteq -x_{w-1}2^{w-1} + \sum_{i=0}^{w-2} x_i 2^i$ (2.3)
Assuming the expressions are evaluated when executing a 32-bit program on a machine that uses two's-complement arithmetic, fill in the following table describing the effect of casting and relational operations, in the style of Figure 2.19: Figure 2.19 Expression -2147483647-1 ==
Using the table you filled in when solving Problem 2.17, fill in the following table describing the function T2U4: Problem 2.17: Assuming w = 4, we can assign a numeric value to each possible hexadecimal digit, assuming either an unsigned or a two's-complement interpretation. Fill in the following
In Chapter 3, we will look at listings generated by a disassembler, a program that converts an executable program file back to a more readable ASCII form. These files contain many hexadecimal numbers, typically representing values in two's-complement form. Being able to recognize these numbers and
Assuming w = 4, we can assign a numeric value to each possible hexadecimal digit, assuming either an unsigned or a two's-complement interpretation. Fill in the following table according to these interpretations by writing out the nonzero powers of 2 in the summations shown in Equations 2.1 and
Fill in the table below showing the effects of the different shift operations on single-byte quantities. The best way to think about shift operations is to work with binary representations. Convert the initial values to binary, perform the shifts, and then convert back to hexadecimal. Each of the
As an application of the property that a ^ a = 0 for any bit vector a, consider the following program: As the name implies, we claim that the effect of this procedure is to swap the values stored at the locations denoted by pointer variables x and y. Note that unlike the usual technique for swapping
Using only bit-level and logical operations, write a C expression that is equivalent to x == y. In other words, it will return 1 when x and y are equal and 0 otherwise.
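One possible expression is sketched here, wrapped in a hypothetical helper named bits_equal so it can be called directly. The idea: x ^ y is zero exactly when every bit of x matches the corresponding bit of y, and logical NOT maps zero to 1 and any nonzero value to 0.

```c
/* Sketch: equivalent to x == y using only a bit-level operator (^)
 * and a logical operator (!). The name bits_equal is illustrative. */
int bits_equal(int x, int y)
{
    return !(x ^ y);
}
```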
Suppose that a and b have byte values 0x55 and 0x46, respectively. Fill in the following table indicating the byte values of the different C expressions: Bit-level expressions: a & b, a | b, ~a | ~b, a & !b. Logical expressions: a && b, a || b, !a || !b, a && ~b.
Write C expressions, in terms of variable x, for the following values. Your code should work for any word size w ≥ 8. For reference, we show the result of evaluating the expressions for x = 0x87654321, with w = 32. A. The least significant byte of x, with all other bits set to 0. B. All but the
The Digital Equipment VAX computer was a very popular machine from the late 1970s until the late 1980s. Rather than instructions for Boolean operations AND and OR, it had instructions bis (bit set) and bic (bit clear). Both instructions take a data word x and a mask word m. They generate a result z
Computers generate color pictures on a video screen or liquid crystal display by mixing three different colors of light: red, green, and blue. Imagine a simple scheme, with three different lights, each of which can be turned on or off, projecting onto a glass screen: We can then create eight
Fill in the following table showing the results of evaluating Boolean operations on bit vectors. Operations: a, b, ~a, ~b, a & b, a | b, a ^ b. Given: a = [01001110], b = [11100001].
What would be printed as a result of the following call to show_bytes? const char *m = "mnopqr"; show_bytes((byte_pointer) m, strlen(m)); Note that letters 'a' through 'z' have ASCII codes 0x61 through 0x7A.
A car manufacturing company has promised their customers that the next release of a new engine will show a 4x performance improvement. You have been assigned the task of delivering on that promise. You have determined that only 90% of the engine can be improved. How much (i.e., what value of k)
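The question text is cut off, so the following is only a sketch under the assumption that it asks for the improvement factor k under Amdahl's law, with a fraction α = 0.9 of the system improvable and a target overall speedup of 4. Under that reading, 4 = 1 / ((1 − 0.9) + 0.9/k), so 0.9/k = 0.15 and k = 6. The helper name amdahl_speedup is illustrative.

```c
/* Sketch of Amdahl's law: overall speedup when a fraction alpha of the
 * work is sped up by a factor k and the rest is unchanged.
 * The function name and parameters are assumptions for illustration. */
double amdahl_speedup(double alpha, double k)
{
    return 1.0 / ((1.0 - alpha) + alpha / k);
}
```

Calling amdahl_speedup(0.9, 6.0) gives a value very close to 4, matching the hand calculation.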
Perform the following number conversions: A. 0x25B9D2 to binary B. binary 1010111001001001 to hexadecimal C. 0xA8B3D to binary D. binary 1100100010110110010110 to hexadecimal
A single byte can be represented by 2 hexadecimal digits. Fill in the missing entries in the following table, giving the decimal, binary, and hexadecimal values of different byte patterns: Decimal 0 158 76 145 Binary 0000 0000 10101110 0011 1100 1111 0001 Hexadecimal 0x00
Fill in the blank entries in the following table, giving the decimal and hexadecimal representations of different powers of 2: n: 5, 23, 12; 2^n (decimal): 32, 32,768, 64; 2^n (hexadecimal): 0x20, 0x2000, 0x100
Without converting the numbers to decimal or binary, try to solve the following arithmetic problems, giving the answers in hexadecimal. A. 0x605c + 0x5 = B. 0x605c - 0x20 = C. 0x605c + 32 = D. 0x60fa - 0x605c =
Consider the following three calls to show_bytes: Indicate the values that will be printed by each call on a little-endian machine and on a big-endian machine: int a = 0x12345678; byte_pointer ap = (byte_pointer) &a; show_bytes(ap, 1); /* A. */ show_bytes(ap, 2); /* B. */ show_bytes(ap, 3); /* C.
Using show_int and show_float, we determine that the integer 2607352 has hexadecimal representation 0x0027C8F8, while the floating-point number 3510593.0 has hexadecimal representation 0x4A1F23E0. A. Write the binary representations of these two hexadecimal values. B. Shift these two strings
For the case where data type int has 32 bits, devise a version of tmult_ok (Problem 2.35) that uses the 64-bit precision of data type int64_t, without using division. Problem 2.35: You are given the assignment to develop code for a function tmult_ok that will determine whether two arguments can be
You are given the assignment to develop code for a function tmult_ok that will determine whether two arguments can be multiplied without causing overflow. Here is your solution: You test this code for a number of values of x and y, and it seems to work properly. Your coworker challenges you, saying,
You are given the task of patching the vulnerability in the XDR code shown in the aside on page 136 for the case where both data types int and size_t are 32 bits. You decide to eliminate the possibility of the multiplication overflowing by computing the number of bytes to allocate using data type
How could we modify the expression for form B for the case where bit position n is the most significant bit?
For each of the following values of K, find ways to express x * K using only the specified number of operations, where we consider both additions and subtractions to have comparable cost. You may need to use some tricks beyond the simple form A and B rules we have considered so far.
For a run of ones starting at bit position n down to bit position m (n ≥ m), we saw that we can generate two forms of code, A and B. How should the compiler decide which form to use?
Write a function div16 that returns the value x/16 for integer argument x. Your function should not use division, modulus, multiplication, any conditionals (if or ?:), any comparison operators (e.g., <, >, or ==), or any loops. You may assume that data type int is 32 bits long and uses a
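A sketch of div16 under the problem's stated assumptions (32-bit int) plus one more that the truncated text presumably states: right shifts of signed values are arithmetic. C leaves that behavior implementation-defined, so this is not portable in general. The idea is to add a bias of 15 to negative inputs so that the shift rounds toward zero, as C division does.

```c
/* Sketch: x / 16 without division, conditionals, comparisons, or loops.
 * Assumes 32-bit int and arithmetic right shift of negative values
 * (implementation-defined in C; true for common compilers on x86-64). */
int div16(int x)
{
    int bias = (x >> 31) & 0xF;   /* 15 if x is negative, 0 otherwise */
    return (x + bias) >> 4;       /* biased shift rounds toward zero */
}
```

For example, div16(-33) gives -2, matching -33 / 16 in C, whereas an unbiased shift would give -3.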
In the following code, we have omitted the definitions of constants M and N: We compiled this code for particular values of M and N. The compiler optimized the multiplication and division using the methods we have discussed. The following is a translation of the generated machine code back into
Assume data type int is 32 bits long and uses a two’s-complement representation for signed values. Right shifts are performed arithmetically for signed values and logically for unsigned values. The variables are declared and initialized as follows: For each of the following C expressions,
The imprecision of floating-point arithmetic can have disastrous effects. On February 25, 1991, during the first Gulf War, an American Patriot Missile battery in Dharan, Saudi Arabia, failed to intercept an incoming Iraqi Scud missile. The Scud struck an American Army barracks and killed 28
A. For a floating-point format with an n-bit fraction, give a formula for the smallest positive integer that cannot be represented exactly (because it would require an (n + 1)-bit fraction to be exact). Assume the exponent field size k is large enough that the range of representable exponents does
Consider a 5-bit floating-point representation based on the IEEE floating-point format, with one sign bit, two exponent bits (k = 2), and two fraction bits (n = 2). The exponent bias is 2^(2−1) − 1 = 1. The table that follows enumerates the entire nonnegative range for this 5-bit floating-point
As mentioned in Problem 2.6, the integer 3,510,593 has hexadecimal representation 0x00359141, while the single-precision floating-point number 3,510,593.0 has hexadecimal representation 0x4A564504. Derive this floating-point representation and explain the correlation between the bits of the integer
The three functions in Figure 6.20 perform the same operation with varying degrees of spatial locality. Rank-order the functions with respect to the spatial locality enjoyed by each. Explain how you arrived at your ranking. Figure 6.20 (a) An array of structs: #define N
Permute the loops in the following function so that it scans the three-dimensional array a with a stride-1 reference pattern. int productarray3d(int a[N][N][N]) { int i, j, k, product = 1; for (i = N-1; i >= 0; i--) { for (j = N-1; j >= 0; j--) { for
As we have seen, a potential drawback of SSDs is that the underlying flash memory can wear out. For example, for the SSD in Figure 6.14, Intel guarantees about 128 petabytes (128 × 10^15 bytes) of writes before the drive wears out. Given this assumption, estimate the lifetime (in years) of this SSD
Suppose that a 1 MB file consisting of 512-byte logical blocks is stored on a disk drive with the following characteristics: For each case below, suppose that a program reads the logical blocks of the file sequentially, one after the other, and that the time to position the head over the first block
As another example of code with potential load-store interactions, consider the following function to copy the contents of one array to another: Suppose a is an array of length 1,000 initialized so that each element a[i] equals i. A. What would be the effect of the call copy_array(a+1,a,999)? B.
The traditional implementation of the merge step of mergesort requires three loops [98]: The branches caused by comparing variables i1 and i2 to n have good prediction performance: the only mispredictions occur when they first become false. The comparison between values src1[i1] and src2[i2] (line
We saw that our measurements of the prefix-sum function psum1 (Figure 5.1) yield a CPE of 9.00 on a machine where the basic operation to be performed, floating point addition, has a latency of just 3 clock cycles. Let us try to understand why our function performs so poorly. The following is the
Rewrite the code for psum1 (Figure 5.1) so that it does not need to repeatedly retrieve the value of p[i] from memory. You do not need to use loop unrolling. We measured the resulting code to have a CPE of 3.00, limited by the latency of floating-point addition.
In the following, let r be the number of rows in a DRAM array, c the number of columns, br the number of bits needed to address the rows, and bc the number of bits needed to address the columns. For each of the following DRAMs, determine the power-of-2 array dimensions that minimize max(br, bc),
Estimate the average time (in ms) to access a sector on the following disk: rotational rate 12,000 RPM; T_avg seek 5 ms; average number of sectors/track 300.
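The estimate can be sketched with the usual three-term model: access time = average seek + average rotational latency (half a rotation) + transfer time for one sector. The function name avg_access_ms and its parameters are illustrative, not from the source.

```c
/* Sketch: average time in ms to access one sector.
 * Model: seek + half a rotation + time for one sector to pass
 * under the head. Names are assumptions for illustration. */
double avg_access_ms(double rpm, double seek_ms, double sectors_per_track)
{
    double rotation_ms = 60.0 / rpm * 1000.0;            /* one full rotation */
    double avg_rotation_ms = rotation_ms / 2.0;           /* expected wait */
    double transfer_ms = rotation_ms / sectors_per_track; /* one sector */
    return seek_ms + avg_rotation_ms + transfer_ms;
}
```

With the parameters above (12,000 RPM, 5 ms, 300 sectors/track) the terms are 5 ms + 2.5 ms + about 0.017 ms, or roughly 7.52 ms in total.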
The following table gives the parameters for a number of different caches. For each cache, determine the number of cache sets (S), tag bits (t ), set index bits (s), and block offset bits (b). Cache m 32 32 32 1. 2. 3. C 1,024 1,024 1,024 B 4 8 32 E 4 32 S S b
What is the capacity of a disk with 3 platters, 15,000 cylinders, an average of 500 sectors per track, and 1,024 bytes per sector?
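A short sketch of the capacity calculation, assuming (as is conventional, but not stated in the question) that each platter has two recording surfaces and that each cylinder contributes one track per surface. The function name is illustrative.

```c
/* Sketch: disk capacity in bytes.
 * Assumption: two surfaces per platter; tracks per surface equals
 * the number of cylinders. */
unsigned long long disk_capacity_bytes(unsigned platters, unsigned cylinders,
                                       unsigned sectors_per_track,
                                       unsigned bytes_per_sector)
{
    const unsigned long long surfaces_per_platter = 2;
    return surfaces_per_platter * platters * cylinders
           * sectors_per_track * bytes_per_sector;
}
```

For the disk in the question this is 2 × 3 × 15,000 × 500 × 1,024 = 46,080,000,000 bytes, or 46.08 GB using 1 GB = 10^9 bytes.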
Using the data from the years 2005 to 2015 in Figure 6.15(c), estimate the year when you will be able to buy a petabyte (10^15 bytes) of rotating disk storage for $200. Assume actual dollars (no inflation). Figure 6.15(c) Metric $/GB Min. seek time (ms) Typical size
Imagine a hypothetical cache that uses the high-order s bits of an address as the set index. For such a cache, contiguous chunks of memory blocks are mapped to the same cache set. A. How many blocks are in each of these contiguous array chunks? B. Consider the following code that runs on a system
The problems that follow will help reinforce your understanding of how caches work. Assume the following: The memory is byte addressable. Memory accesses are to 1-byte words (not to 4-byte words). Addresses are 13 bits wide. The cache is two-way set associative (E = 2), with a 4-byte block size
Transposing the rows and columns of a matrix is an important problem in signal processing and scientific computing applications. It is also interesting from a locality point of view because its reference pattern is both row-wise and column-wise. For example, consider the following transpose
The heart of the recent hit game SimAquarium is a tight loop that calculates the average position of 512 algae. You are evaluating its cache performance on a machine with a 2,048-byte direct-mapped data cache with 32-byte blocks (B = 32). You are given the following definitions: You should also
Given the assumptions of Practice Problem 6.18, determine the cache performance of the following code: A. What is the total number of reads? B. What is the total number of reads that hit in the cache? C. What is the hit rate? D. What would the miss rate be if the cache were twice as big? Problem 6.18: The
Given the assumptions of Practice Problem 6.18, determine the cache performance of the following code: A. What is the total number of reads? B. What is the total number of reads that hit in the cache? C. What is the hit rate? D. What would the hit rate be if the cache were twice as big? Problem 6.18: The
Use the memory mountain in Figure 6.41 to estimate the time, in CPU cycles, to read a 16-byte word from the L1 d-cache. [Figure 6.41: memory mountain; read throughput (MB/s) plotted against stride (x8 bytes) and size]
This problem concerns the m.o and swap.o modules from Figure 7.5. For each symbol that is defined or referenced in swap.o, indicate whether or not it will have a symbol table entry in the .symtab section in module swap.o. If so, indicate the module that defines the symbol (swap.o or m.o), the
In this problem, let REF(x.i)→DEF(x.k) denote that the linker will associate an arbitrary reference to symbol x in module i to the definition of x in module k. For each example that follows, use this notation to indicate how the linker would resolve references to the multiply-defined symbol in
Let a and b denote object modules or static libraries in the current directory, and let a→b denote that a depends on b, in the sense that b defines a symbol that is referenced by a. For each of the following scenarios, show the minimal command line (i.e., one with the least number of object file
Consider the call to function swap in object file m.o (Figure 7.5). Now suppose that the linker relocates .text in m.o to address 0x4004d0 and swap to address 0x4004e8. Then what is the value of the relocated reference to swap in the callq instruction? Figure 7.5 9: e8 00 00 00 00 with the following
This problem concerns the relocated program in Figure 7.12(a). A. What is the hex address of the relocated reference to sum in line 5? B. What is the hex value of the relocated reference to sum in line 5? Figure 7.12(a) (a) Relocated .text section 1 00000000004004d0 : 4004d0: 48 83 ec 08 2 3 be 02 00
Write a wrapper function for sleep, called wakeup, with the following interface: unsigned int wakeup(unsigned int secs); The wakeup function behaves exactly as the sleep function, except that it prints a message describing when the process actually woke up: Woke up at 4 secs.
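A minimal sketch of such a wrapper on a POSIX system. It uses the fact that sleep returns the number of seconds left unslept if a signal handler interrupts it, so the elapsed time is the requested time minus the return value.

```c
#include <stdio.h>
#include <unistd.h>

/* Sketch: behaves like sleep(), but also reports how long the process
 * actually slept. sleep() returns the unslept remainder, which is
 * nonzero only when the call was interrupted by a signal. */
unsigned int wakeup(unsigned int secs)
{
    unsigned int remaining = sleep(secs);
    printf("Woke up at %u secs.\n", secs - remaining);
    return remaining;
}
```

Like sleep, the wrapper returns the remaining seconds, so callers that check the return value of sleep can switch to wakeup without other changes.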
Write a program called myecho that prints its command-line arguments and environment variables. For example: linux> ./myecho arg1 arg2 Command-line arguments: argv[ 0]: myecho argv[ 1]: arg1 argv[ 2]: arg2 Environment variables: envp[ 0]: PWD=/usr0/droh/ics/code/ecf envp[ 1]: TERM=emacs : envp[25]
Write a program called snooze that takes a single command-line argument, calls the snooze function from Problem 8.5 with this argument, and then terminates. Write your program so that the user can interrupt the snooze function by typing Ctrl+C at the keyboard. For example: Problem 8.5: Write a wrapper
Complete the following table, filling in the missing entries and replacing each question mark with the appropriate integer. Use the following units: K = 2^10 (kilo), M = 2^20 (mega), G = 2^30 (giga), T = 2^40 (tera), P = 2^50 (peta), or E = 2^60 (exa). Number of virtual address bits (n) 54 Number
Determine the number of page table entries (PTEs) that are needed for the following combinations of virtual address size (n) and page size (P = 2^p): (n = 12, P = 1K), (n = 16, P = 16K), (n = 24, P = 2M), (n = 36, P = 1G). For each combination, give the number of PTEs.
Given a 64-bit virtual address space and a 32-bit physical address, determine the number of bits in the VPN, VPO, PPN, and PPO for the following page sizes P: 1 KB, 2 KB, 4 KB, 16 KB. For each page size, give the number of VPN bits, VPO bits, PPN bits, and PPO bits.
Translates a virtual address into a physical address and accesses the cache. For the given virtual address, indicate the TLB entry accessed, physical address, and cache byte value returned. Indicate whether the TLB misses, whether a page fault occurs, and whether a cache miss occurs. If there is a
Determine the block sizes and header values that would result from the following sequence of malloc requests. Assumptions: (1) The allocator maintains double-word alignment and uses an implicit free list with the block format from Figure 9.35. (2) Block sizes are rounded up to the
Write a C program mmapcopy.c that uses mmap to copy an arbitrary-size disk file to stdout. The name of the input file should be passed as a command-line argument.
Determine the minimum block size for each of the following combinations of alignment requirements and block formats. Assumptions: Implicit free list, zero size payloads are not allowed, and headers and footers are stored in 4-byte words. Alignment Single word Single word Double word Double
Implement a find_fit function for the simple allocator. static void *find_fit(size_t asize) Your solution should perform a first-fit search of the implicit free list.
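A first-fit search could be sketched as follows. The helper macros and the heap_listp variable are assumptions modeled on the textbook's simple allocator (4-byte headers holding size and an allocated bit, a prologue block, and a zero-size epilogue header); they are reproduced here only so the fragment compiles on its own.

```c
#include <stddef.h>

/* Assumed helpers in the style of the simple implicit-list allocator. */
#define WSIZE 4                                   /* header size in bytes */
#define GET(p)        (*(unsigned int *)(p))      /* read a header word */
#define GET_SIZE(p)   (GET(p) & ~0x7)             /* block size from header */
#define GET_ALLOC(p)  (GET(p) & 0x1)              /* allocated bit */
#define HDRP(bp)      ((char *)(bp) - WSIZE)      /* header of block bp */
#define NEXT_BLKP(bp) ((char *)(bp) + GET_SIZE(HDRP(bp)))

static char *heap_listp;  /* assumed to point at the prologue block */

/* Sketch: walk the implicit list from the prologue until the zero-size
 * epilogue header, returning the first free block of at least asize bytes. */
static void *find_fit(size_t asize)
{
    void *bp;

    for (bp = heap_listp; GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp)) {
        if (!GET_ALLOC(HDRP(bp)) && asize <= GET_SIZE(HDRP(bp)))
            return bp;
    }
    return NULL;  /* no fit found */
}
```

First fit keeps the search simple and tends to leave large free blocks toward the end of the heap, at the cost of accumulating small fragments near the start of the list.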