Question: Consider the following code, which sums the elements of a product of two matrices:
register int i, k; /* i, k are in the processor registers */
register float sum, a b;
for (i = ...; i < ...; i++)
    for (k = ...; k < ...; k++)
        sum += a[i][k] * b[k][i];
Assume the following:
There is a perfect instruction cache; i.e., do not worry about the time for any instruction accesses.
Both int and float are of size bytes.
Assume that only the accesses to the arrays a and b generate accesses to the data cache. The rest of
the variables are all allocated in registers.
Assume a fully associative, LRU data cache with lines, where each line is bytes.
Initially, the data cache is empty.
The arrays a and b are stored in row order.
To keep things simple, we will assume that statements in the above code are executed sequentially.
Lines and take cycles for each invocation. Line takes cycles plus an additional
cycles per data cache miss to wait for the data. That is, if both array accesses in line miss, it takes
a total of cycles.
Assume that the arrays a and b both start at cache line boundaries.
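The cache model described above (fully associative, LRU replacement, row-order arrays that start on line boundaries) can be explored with a small simulator. The following is a minimal sketch in C, not the intended solution: `LruCache`, `lru_access`, and `count_misses` are hypothetical names, and the matrix dimension, element size, line size, and number of lines are passed as arguments, so any concrete values are the caller's assumption rather than the problem's parameters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound on the number of cache lines the sketch supports. */
enum { MAX_LINES = 64 };

typedef struct {
    long tags[MAX_LINES]; /* line tags, most recently used first */
    int used;             /* number of valid entries */
    int capacity;         /* number of lines in the simulated cache */
} LruCache;

/* Touch one line tag. Returns 1 on a miss, 0 on a hit, and moves the
 * tag to the most-recently-used position. */
static int lru_access(LruCache *c, long tag)
{
    int pos = -1;
    for (int j = 0; j < c->used; j++) {
        if (c->tags[j] == tag) { pos = j; break; }
    }
    int miss = (pos < 0);
    if (miss) {
        if (c->used < c->capacity) c->used++;
        pos = c->used - 1; /* new slot, or the LRU slot when the cache is full */
    }
    /* Shift entries [0, pos) back by one and place the tag at the front. */
    memmove(&c->tags[1], &c->tags[0], (size_t)pos * sizeof c->tags[0]);
    c->tags[0] = tag;
    return miss;
}

/* Count data cache misses for the access pattern
 *     for i in [0, n): for k in [0, n): a[i][k], then b[k][i]
 * with both n-by-n arrays stored in row order.
 * Returns -1 if num_lines is outside the range the sketch supports. */
long count_misses(int n, int elem_bytes, int line_bytes, int num_lines)
{
    if (num_lines < 1 || num_lines > MAX_LINES) return -1;

    LruCache c = { .used = 0, .capacity = num_lines };
    long a_base = 0;
    /* Place b after a, rounded up so that b also starts on a line boundary. */
    long a_bytes = (long)n * n * elem_bytes;
    long b_base = (a_bytes + line_bytes - 1) / line_bytes * line_bytes;
    long misses = 0;

    for (int i = 0; i < n; i++) {
        for (int k = 0; k < n; k++) {
            long a_addr = a_base + ((long)i * n + k) * elem_bytes;
            long b_addr = b_base + ((long)k * n + i) * elem_bytes;
            misses += lru_access(&c, a_addr / line_bytes);
            misses += lru_access(&c, b_addr / line_bytes);
        }
    }
    return misses;
}
```

With purely illustrative values (8x8 matrices, 4-byte elements, 16-byte lines), `count_misses(8, 4, 16, 64)` returns 32, which is only the compulsory misses, whereas `count_misses(8, 4, 16, 4)` returns 80, because the column-wise walk over b evicts each of its lines before the line is reused.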
(a) How many accesses to arrays a and b will result in cache misses? Explain your answer.
(b) Now assume there is a data prefetch instruction with the format prefetch array[index][index]. This
prefetches the entire block containing the word array[index][index] into the data cache.
It takes cycle for the processor to execute this instruction and send it to the data cache. The processor
can then go ahead and execute subsequent instructions. If the prefetched data is not in the cache, it takes
cycles for the data to get loaded into the cache. Add prefetch instructions to minimize the execution
time. Do not transform the code in any other way. How many cache misses for accessing a and b at line
in your modified code?
Hint: since line and each takes cycles when there is no cache miss, you can consider using
them to hide the cycle latency for the data to get loaded into the cache when prefetching. If you insert
the prefetch instructions appropriately, the cache misses can be totally eliminated.
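As a rough illustration of the idea in the hint, the sketch below issues prefetches a fixed number of inner-loop iterations ahead of the use, so that the load latency overlaps with the cycles spent on the loop statements. It is one possible placement under stated assumptions, not the expected answer: `N` and `DIST` are hypothetical values, `PREFETCH` and `ahead` are hypothetical helpers, and `__builtin_prefetch` (a GCC/Clang builtin) stands in for the problem's prefetch instruction. The index clamp in `ahead` is an addition that keeps the C code in bounds. Whether this placement removes every miss depends on the cycle counts and cache parameters of the actual problem.

```c
#include <stddef.h>

/* Map the hypothetical PREFETCH macro onto the GCC/Clang builtin when
 * available; otherwise it only evaluates its argument. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(p) __builtin_prefetch((p))
#else
#define PREFETCH(p) ((void)(p))
#endif

/* Hypothetical values for illustration only:
 * N    = matrix dimension
 * DIST = how many inner-loop iterations ahead to prefetch, roughly
 *        ceil(load latency / cycles per inner-loop iteration). */
enum { N = 8, DIST = 2 };

/* Index DIST iterations ahead of k, clamped to the last valid index. */
static size_t ahead(size_t k)
{
    return (k + DIST < N) ? k + DIST : (size_t)N - 1;
}

/* Original computation with prefetches inserted; the arithmetic and the
 * loop structure are unchanged. */
float sum_with_prefetch(float a[N][N], float b[N][N])
{
    float sum = 0.0f;
    for (size_t i = 0; i < N; i++) {
        for (size_t k = 0; k < N; k++) {
            size_t kp = ahead(k);
            PREFETCH(&a[i][kp]); /* same row of a, a few elements ahead */
            PREFETCH(&b[kp][i]); /* same column of b, a few rows ahead */
            sum += a[i][k] * b[k][i];
        }
    }
    return sum;
}

/* Reference version without prefetches, used to check that the
 * prefetches do not change the result. */
float sum_plain(float a[N][N], float b[N][N])
{
    float sum = 0.0f;
    for (size_t i = 0; i < N; i++)
        for (size_t k = 0; k < N; k++)
            sum += a[i][k] * b[k][i];
    return sum;
}
```

Because a prefetch only affects where data resides, not its value, both functions return the same sum for the same inputs. The accesses to a[i][k] walk along a row and touch a new line only once per line's worth of elements, while the accesses to b[k][i] walk down a column and can touch a new line on every iteration, so the prefetch for b is the one that matters most.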
