Question:

Code running on a single core and not sharing any variables with other cores can suffer some performance degradation because of the snooping coherence protocol.
Consider the following two iterative loops. They are NOT functionally equivalent, but they appear similar in complexity, so one could be led to conclude that they would take a comparable number of cycles when executed on the same processor core.

[Image: code for Loop1 and Loop2; not transcribed]

Assume that

■ Every cache line can hold exactly one element of A or B;

■ Arrays A and B do not interfere in the cache;

■ All the elements of A or B are in the cache before either loop is executed.

Compare their performance when run on a core whose cache uses the MESI coherence protocol. Use the stall cycles data for Implementation 1 in Figure 5.38.

Now assume that a cache line can hold multiple elements of A and B (A and B still go to separate cache lines). How will this affect the relative performance of Loop1 and Loop2?

Suggest hardware and/or software mechanisms that would improve the performance of Loop1 on a single core.
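
As background for the questions above, the following is a minimal sketch of how a local write hit is handled under a textbook MESI protocol. It is not the exercise's solution. It illustrates only that a write to a line held in the Shared state needs a bus transaction (an upgrade/invalidate) even if no other core ever touches the data, whereas a write to a line held in Exclusive or Modified does not. The function names, the `elements_per_line` parameter, and the choice of initial state are assumptions made for illustration; the text here does not say which state the lines of A and B start in, and the stall-cycle values of Figure 5.38 are not reproduced, so the sketch counts bus transactions rather than cycles.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def write_hit(state):
    """Return (next_state, needs_bus_transaction) for a local write hit."""
    if state is State.S:
        # Other copies might exist, so the core must broadcast an invalidate.
        return State.M, True
    if state in (State.E, State.M):
        # The core already owns the only copy: silent transition to Modified.
        return State.M, False
    raise ValueError("line is Invalid: this is a write miss, not a write hit")


def count_bus_upgrades(n_elements, elements_per_line, initial_state):
    """Count bus upgrades when writing n_elements consecutive array elements.

    Hypothetical model: every line starts in initial_state and the loop
    writes each element exactly once, in order.
    """
    n_lines = -(-n_elements // elements_per_line)  # ceiling division
    lines = [initial_state] * n_lines
    upgrades = 0
    for i in range(n_elements):
        line = i // elements_per_line
        lines[line], on_bus = write_hit(lines[line])
        upgrades += on_bus
    return upgrades
```

Under these assumptions, `count_bus_upgrades(8, 1, State.S)` charges one upgrade per element, while `count_bus_upgrades(8, 4, State.S)` charges one per line, because only the first write to each line finds it in Shared. Multiplying the count by the relevant stall-cycle figure from the book's table would give a rough cost estimate.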


Related Book

Computer Architecture A Quantitative Approach

ISBN: 9780128119051

6th Edition

Authors: John L. Hennessy, David A. Patterson
