2. (40 points) (Assume that an integer is a word and that it takes 4 bytes.) Consider a memory system with a single-cycle cache and 100-cycle-latency DRAM, with the processor operating at 1 GHz. The processor has two multiply-add units and is capable of executing four instructions in each cycle. Each memory access can fetch four words. Now consider the problem of multiplying a dense matrix with a vector using a two-loop dot-product formulation. The matrix is of dimension 2K x 2K. Thus, each row of the matrix takes 8 KB of storage. Assuming the vector is cached, what is the sustained performance in the best case of this technique using a two-loop dot-product based matrix-vector product? (State your assumptions.)

/* matrix-vector product loop */
for (i = 0; i
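One way to approach the estimate, offered as a rough sketch and not as the expert solution this page refers to: assume each matrix element is used exactly once and so must be streamed from DRAM, each DRAM access returns four words after 100 cycles with no overlap between accesses, each matrix element contributes one multiply and one add (2 FLOPs), and the single-cycle cache hits on the vector are hidden behind the DRAM latency. Under those assumptions the processor completes 8 FLOPs per 100 cycles. The function names `peak_mflops` and `sustained_mflops` below are hypothetical helpers introduced only for this illustration.

```c
/* Back-of-the-envelope model; every assumption is listed in the text above.
 * peak_mflops and sustained_mflops are hypothetical helper names. */

/* Peak rate in MFLOPS, assuming each multiply-add unit
 * completes one multiply and one add (2 FLOPs) per cycle. */
double peak_mflops(double clock_mhz, int madd_units) {
    return clock_mhz * madd_units * 2.0;
}

/* Sustained rate in MFLOPS, assuming the loop is bound by DRAM latency:
 * each access delivers words_per_access matrix words, each word feeds
 * flops_per_word floating-point operations, and accesses do not overlap. */
double sustained_mflops(double clock_mhz, int dram_latency_cycles,
                        int words_per_access, int flops_per_word) {
    double flops_per_access = (double)words_per_access * flops_per_word;
    double flops_per_cycle = flops_per_access / dram_latency_cycles;
    /* FLOPs per cycle times cycles per microsecond gives MFLOPS. */
    return flops_per_cycle * clock_mhz;
}
```

Calling `sustained_mflops(1000.0, 100, 4, 2)` with the figures from the question gives roughly 80 MFLOPS, against `peak_mflops(1000.0, 2)` of 4000 MFLOPS (4 GFLOPS). Different assumptions, such as overlapped or prefetched memory accesses, would change the result.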
