Question: Could use some help, thank you. You experiment with an embedded device having a one-level data cache (128 Bytes) and a main memory (1K Bytes).
You experiment with an embedded device having a one-level data cache (128 Bytes) and a main memory (1K Bytes). You exclusively focus on data accesses (instead of instruction accesses). The latencies (in CPU cycles) of the different kinds of accesses are as follows:

Cache hit: 1 cycle
Cache miss: 110 cycles
Main memory access with cache disabled: 80 cycles

Now, considering the following matrix multiplication C = A x B, please answer the following questions (please show detailed steps).

\[
A = \begin{pmatrix} x_{0,0} & \cdots & x_{0,l} \\ \vdots & \ddots & \vdots \\ x_{m,0} & \cdots & x_{m,l} \end{pmatrix}
\quad \text{and} \quad
B = \begin{pmatrix} y_{0,0} & \cdots & y_{0,n} \\ \vdots & \ddots & \vdots \\ y_{l,0} & \cdots & y_{l,n} \end{pmatrix}
\]

1) What is the dimension of the matrix C?

2) For multiplication using the ALU, assume you pre-load both A and B into the main memory from the storage and reserve space for C in the main memory to speed up the performance. Pre-loading A and B takes 50% of the main memory space, and the reserved space for C takes 12.5% of the main memory. Assuming we know the value of m, what is the maximum value of l?

3) Following the result from 2), if x is a 16-bit integer, what are the dimensions of A, B, and C?

4) Following the result from 3), assume that the cache is a fully associative cache with the least recently used (LRU) cache replacement policy (ask me or Wikipedia if you forgot) and that the result can be written directly back to the main memory without sacrificing the memory read. What is the total memory access time (total cycles) for the matrix multiplication if we strictly follow the instructions as follows?

for (int i=0; i
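As a starting point for parts 2) to 4), the arithmetic can be sketched in Python. This is only a rough sketch under stated assumptions, not the expected solution: the function names `byte_budget` and `lru_read_cycles` are made up for illustration, the cache line size (`block_bytes`) is not given in the question and is left as a parameter, and the access trace has to be supplied by the caller because the loop in the question is cut off. Following the statement in part 4) that results are written back without affecting reads, only reads are counted.

```python
from collections import OrderedDict

MAIN_MEMORY_BYTES = 1024   # 1K Bytes of main memory, from the question
CACHE_BYTES = 128          # data cache size, from the question
HIT_CYCLES = 1             # cache hit latency, from the question
MISS_CYCLES = 110          # cache miss latency, from the question


def byte_budget(fraction, total_bytes=MAIN_MEMORY_BYTES):
    """Number of bytes covered by a given fraction of main memory."""
    return int(total_bytes * fraction)


def lru_read_cycles(addresses, block_bytes, cache_bytes=CACHE_BYTES):
    """Total read cycles for a byte-address trace on a fully associative LRU cache.

    block_bytes is an ASSUMPTION: the question does not state the line size.
    """
    num_lines = cache_bytes // block_bytes
    lines = OrderedDict()  # keys are block numbers, least recently used first
    cycles = 0
    for addr in addresses:
        block = addr // block_bytes
        if block in lines:
            lines.move_to_end(block)       # hit: mark as most recently used
            cycles += HIT_CYCLES
        else:
            if len(lines) >= num_lines:
                lines.popitem(last=False)  # miss on a full cache: evict the LRU line
            lines[block] = None
            cycles += MISS_CYCLES
    return cycles


ab_bytes = byte_budget(0.5)    # space for A and B together
c_bytes = byte_budget(0.125)   # space reserved for C
print(ab_bytes, c_bytes)       # 512 128
print(ab_bytes // 2, c_bytes // 2)  # element counts for 16-bit values: 256 64
```

With 16-bit elements, the budgets work out to 256 elements for A and B combined and 64 elements for C. For part 4), one would generate the sequence of byte addresses read by the loop nest and pass it to `lru_read_cycles` with whatever line size the course assumes.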
