Question:
The switched interconnect increases the performance of a snooping cache-coherent multiprocessor by allowing multiple requests to be overlapped. Because the controllers and the networks are pipelined, there is a difference between an operation's latency (i.e., cycles to complete the operation) and overhead (i.e., cycles until the next operation can begin). For the multiprocessor illustrated in Figure 4.39, assume the following latencies and overheads:
• CPU read and write hits generate no stall cycles.
• A CPU read or write that generates a replacement event issues the corresponding Get Shared or Get Modified message before the Put Modified message (e.g., using a writeback buffer).
• A cache controller event that sends a request message (e.g., Get Shared) has latency Lsend_req and blocks the controller from processing other events for Osend_req cycles.
• A cache controller event that reads the cache and sends a data message has latency Lsend_data and overhead Osend_data cycles.
• A cache controller event that receives a data message and updates the cache has latency Lrcv_data and overhead Orcv_data cycles.
• A memory controller has latency Lread_memory and overhead Oread_memory cycles to read memory and send a data message.
• A memory controller has latency Lwrite_memory and overhead Owrite_memory cycles to write a data message to memory.
• In the absence of contention, a request message has network latency Lreq_msg and overhead Oreq_msg cycles.
• In the absence of contention, a data message has network latency Ldata_msg and overhead Odata_msg cycles.
Consider an implementation with the performance characteristics summarized in Figure 4.41. For the following sequences of operations, the cache contents from Figure 4.37, and the implementation parameters in Figure 4.41, how many stall cycles does each processor incur for each memory request? Similarly, for how many cycles are the different controllers occupied? For simplicity, assume (1) each processor can have only one memory operation outstanding at a time, (2) if two nodes make requests in the same cycle, the one listed first "wins" and the later node must stall for the request message overhead, and (3) all requests map to the same memory controller.

Figure 4.41 Switched snooping coherence latencies and overheads.
a. P0: read 120
b. P0: write 120
c. P15: write 120
d. P1: read 110
e. P0: read 120; P15: read 128
f. P0: read 100; P1: write 110
g. P0: write 100
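The general pattern for these parts can be sketched in code. The snippet below is a minimal sketch, not the book's solution: it assumes the processor's stall time for a read miss serviced by memory is the sum of the latencies on the critical path (request send → request network → memory read and data send → data network → data receive), while each controller's occupancy is the sum of its overhead terms. The numeric parameter values are hypothetical placeholders; the actual values come from Figure 4.41, which is not reproduced here.

```python
# Hypothetical parameter values (stand-ins for Figure 4.41; the real
# figure's numbers may differ).
L_send_req, O_send_req = 4, 1        # cache controller sends a request
L_req_msg, O_req_msg = 8, 1          # request message network latency/overhead
L_read_memory, O_read_memory = 100, 20   # memory read + data send
L_data_msg, O_data_msg = 30, 15      # data message network latency/overhead
L_rcv_data, O_rcv_data = 3, 2        # cache controller receives data

def read_miss_stall_cycles():
    """Stall cycles the CPU sees: latencies summed along the critical path
    (assumed composition; overheads do not add to the critical path here)."""
    return L_send_req + L_req_msg + L_read_memory + L_data_msg + L_rcv_data

def controller_occupancy():
    """Cycles each controller is blocked from processing other events:
    the sum of that controller's overhead terms for this transaction."""
    return {
        "cache controller": O_send_req + O_rcv_data,  # send request, receive data
        "memory controller": O_read_memory,           # read memory, send data
    }

print(read_miss_stall_cycles())   # 145 with the placeholder values above
print(controller_occupancy())
```

For the sequences in parts (e) and (f), assumption (2) would add the request message overhead (O_req_msg) to the later node's stall count before this critical-path sum begins.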

Related Book:

Computer Architecture: A Quantitative Approach, 4th edition, by John L. Hennessy and David A. Patterson. ISBN: 978-0123704900.
