
Question:

Chip multiprocessors (CMPs) have multiple cores and their caches on a single chip. CMP on-chip L2 cache design has interesting trade-offs. The following table shows the miss rates and hit latencies for two benchmarks with private vs. shared L2 cache designs. Assume L1 cache misses once every 32 instructions.

Misses per instruction    Private    Shared
Benchmark A               0.30%      0.12%
Benchmark B               0.03%      0.06%

Assume the following hit latencies:

                        Private cache    Shared cache    Memory
Hit latency (cycles)    5                20              180
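Before the questions, here is a minimal sketch of how these numbers are typically combined, assuming the standard memory-stall model: stall cycles per instruction = (L1 misses per instruction × L2 hit latency) + (L2 misses per instruction × memory latency). The Python below is illustrative only, and the variable names are not part of the exercise.

```python
# A minimal sketch of the stall-cycle arithmetic, assuming the usual model:
#   stall cycles per instruction =
#       (L1 misses per instruction) * (L2 hit latency)
#     + (L2 misses per instruction) * (memory latency)
# Values are taken from the tables above; the names are illustrative.

L1_MISSES_PER_INSTR = 1 / 32   # L1 misses once every 32 instructions
MEM_LATENCY = 180              # cycles

L2_HIT_LATENCY = {"private": 5, "shared": 20}   # cycles, from the table above

L2_MISSES_PER_INSTR = {
    "Benchmark A": {"private": 0.0030, "shared": 0.0012},
    "Benchmark B": {"private": 0.0003, "shared": 0.0006},
}

for bench, rates in L2_MISSES_PER_INSTR.items():
    for design, hit_latency in L2_HIT_LATENCY.items():
        stall = (L1_MISSES_PER_INSTR * hit_latency
                 + rates[design] * MEM_LATENCY)
        print(f"{bench}, {design} L2: {stall:.3f} stall cycles per instruction")
```

Re-running the same loop with the shared hit latency doubled, or with MEM_LATENCY doubled, covers the two scenarios in question 2.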

1. Which cache design is better for each of these benchmarks? Use data to support your conclusion.

2. Shared cache latency increases with the CMP size. Choose the best design if the shared cache latency doubles. Off-chip bandwidth becomes the bottleneck as the number of CMP cores increases. Choose the best design if off-chip memory latency doubles.

3. Discuss the pros and cons of shared vs. private L2 caches for single-threaded, multi-threaded, and multiprogrammed workloads, and reconsider them assuming there is an on-chip L3 cache.

4. Assume both benchmarks have a base CPI of 1 (with an ideal L2 cache). If a non-blocking cache improves the average number of concurrent L2 misses from 1 to 2, how much performance improvement does this provide with a shared L2 cache? How much with a private L2 cache? (A calculation sketch follows these questions.)

5. Assume new generations of processors double the number of cores every 18 months. To maintain the same level of per-core performance, how much more off-chip memory bandwidth is needed for a processor released in three years? (This is also worked in the sketch below.)

6. Consider the entire memory hierarchy. What kinds of optimizations can improve the number of concurrent misses?
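For questions 4 and 5, the sketch below shows one way to set up the arithmetic. Two assumptions are made here that the exercise does not state explicitly: only the off-chip (memory-latency) portion of the stall time overlaps when the cache is non-blocking, and per-core off-chip traffic stays constant as cores are added. The helper name cpi and the choice of Benchmark A with the shared design for the worked example are illustrative.

```python
# Sketch for questions 4 and 5; modelling assumptions are noted inline.

L1_MISSES_PER_INSTR = 1 / 32
MEM_LATENCY = 180   # cycles
BASE_CPI = 1.0      # ideal-L2 CPI, as stated in question 4


def cpi(l2_hit_latency, l2_misses_per_instr, concurrent_misses=1):
    """CPI assuming only the memory-latency part of the stalls overlaps,
    so doubling the concurrent L2 misses halves that component."""
    hit_stall = L1_MISSES_PER_INSTR * l2_hit_latency
    miss_stall = l2_misses_per_instr * MEM_LATENCY / concurrent_misses
    return BASE_CPI + hit_stall + miss_stall


# Question 4, illustrated with Benchmark A on the shared design
# (0.12% L2 misses per instruction, 20-cycle hit latency).
blocking = cpi(20, 0.0012, concurrent_misses=1)
non_blocking = cpi(20, 0.0012, concurrent_misses=2)
print(f"non-blocking speedup, shared L2: {blocking / non_blocking:.3f}x")

# Question 5: cores double every 18 months, so a part released in 3 years
# has 2 ** (36 / 18) = 4x the cores.  If per-core miss traffic is unchanged,
# off-chip bandwidth must grow by roughly the same factor.
growth = 2 ** (36 / 18)
print(f"off-chip bandwidth needed in 3 years: about {growth:.0f}x today's")
```

Substituting the private-design numbers (5-cycle hits and the private miss rates) gives the private-L2 case for question 4.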
