Question:
The switched interconnect increases the performance of a snooping cache-coherent multiprocessor by allowing multiple requests to be overlapped. Because the controllers and the networks are pipelined, there is a difference between an operation's latency (i.e., cycles to complete the operation) and overhead (i.e., cycles until the next operation can begin). For the multiprocessor illustrated in Figure 4.39, assume the following latencies and overheads:
• CPU read and write hits generate no stall cycles.
• A CPU read or write that generates a replacement event issues the corresponding Get Shared or Get Modified message before the Put Modified message (e.g., using a writeback buffer).
• A cache controller event that sends a request message (e.g., Get Shared) has latency Lsend_req and blocks the controller from processing other events for Osend_req cycles.
• A cache controller event that reads the cache and sends a data message has latency Lsend_data and overhead Osend_data cycles.
• A cache controller event that receives a data message and updates the cache has latency Lrcv_data and overhead Orcv_data cycles.
• A memory controller has latency Lread_memory and overhead Oread_memory cycles to read memory and send a data message.
• A memory controller has latency Lwrite_memory and overhead Owrite_memory cycles to write a data message to memory.
• In the absence of contention, a request message has network latency Lreq_msg and overhead Oreq_msg cycles.
• In the absence of contention, a data message has network latency Ldata_msg and overhead Odata_msg cycles.
Consider an implementation with the performance characteristics summarized in Figure 4.41. For the following sequences of operations, the cache contents from Figure 4.37, and the implementation parameters in Figure 4.41, how many stall cycles does each processor incur for each memory request? Similarly, for how many cycles are the different controllers occupied? For simplicity, assume (1) each processor can have only one memory operation outstanding at a time, (2) if two nodes make requests in the same cycle, the one listed first "wins" and the later node must stall for the request message overhead, and (3) all requests map to the same memory controller.

Figure 4.41 Switched snooping coherence latencies and overheads.
a. P0: read 120
b. P0: write 120
c. P15: write 120
d. P1: read 110
e. P0: read 120; P15: read 128
f. P0: read 100; P1: write 110
g. P0: write 100
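The general pattern for these parts can be sketched in code. The snippet below is a minimal sketch, not the book's solution: it assumes the processor's stall time for a read miss serviced by memory is the sum of the latencies on the critical path (request send → request network → memory read and data send → data network → data receive), while each controller's occupancy is the sum of its overhead terms. The numeric parameter values are hypothetical placeholders; the actual values come from Figure 4.41, which is not reproduced here.

```python
# Hypothetical parameter values (stand-ins for Figure 4.41; the real
# figure's numbers may differ).
L_send_req, O_send_req = 4, 1        # cache controller sends a request
L_req_msg, O_req_msg = 8, 1          # request message network latency/overhead
L_read_memory, O_read_memory = 100, 20   # memory read + data send
L_data_msg, O_data_msg = 30, 15      # data message network latency/overhead
L_rcv_data, O_rcv_data = 3, 2        # cache controller receives data

def read_miss_stall_cycles():
    """Stall cycles the CPU sees: latencies summed along the critical path
    (assumed composition; overheads do not add to the critical path here)."""
    return L_send_req + L_req_msg + L_read_memory + L_data_msg + L_rcv_data

def controller_occupancy():
    """Cycles each controller is blocked from processing other events:
    the sum of that controller's overhead terms for this transaction."""
    return {
        "cache controller": O_send_req + O_rcv_data,  # send request, receive data
        "memory controller": O_read_memory,           # read memory, send data
    }

print(read_miss_stall_cycles())   # 145 with the placeholder values above
print(controller_occupancy())
```

For the sequences in parts (e) and (f), assumption (2) would add the request message overhead (O_req_msg) to the later node's stall count before this critical-path sum begins.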

Related Book:

Computer Architecture: A Quantitative Approach, 4th edition, by John L. Hennessy and David A. Patterson. ISBN: 978-0123704900.
