Question:

Consider a memory system with a single-cycle cache and DRAM with a 100-cycle latency, with
the processor and DRAM both operating at 1 GHz. The processor has two multiply-add
units and is capable of executing four instructions in each cycle. In each memory cycle, the
processor fetches four words. Now consider the problem of multiplying a dense matrix with
a vector using a two-loop dot-product formulation. The matrix is of dimension 2K × 2K,
so each row of the matrix takes 8 KB of storage (that is, each element is a 4-byte word).
Assuming the vector is already cached, what are the sustained performance and arithmetic
intensity in the best case for a two-loop dot-product based matrix-vector product?
(State your assumptions.)
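One way to reason about the question is sketched below. It is an illustration under stated assumptions, not the expert answer from this page. The function names matvec_two_loop and best_case_estimate are hypothetical. The assumptions are: words are 4 bytes (8 KB per row divided by 2K elements), each 100-cycle DRAM access delivers four consecutive matrix words, memory requests are not overlapped (no prefetching or pipelining), the compute time is negligible next to the DRAM latency, and one multiply-add counts as 2 FLOPs.

```python
def matvec_two_loop(a, x):
    """Compute y = A x row by row: one dot product per matrix row."""
    y = []
    for row in a:                      # outer loop over the rows of A
        acc = 0.0
        for a_ij, x_j in zip(row, x):  # inner loop: one multiply-add per element
            acc += a_ij * x_j
        y.append(acc)
    return y


def best_case_estimate(clock_hz=1_000_000_000, dram_latency_cycles=100,
                       words_per_fetch=4, n=2 * 1024, row_bytes=8 * 1024,
                       multiply_add_units=2):
    """Rough best-case figures under the assumptions listed in the lead-in."""
    bytes_per_word = row_bytes // n    # 8 KB / 2K elements = 4 bytes
    flops_per_word = 2                 # one multiply and one add per matrix element
    flops_per_fetch = flops_per_word * words_per_fetch

    # Assumption: the vector is cached, so only matrix words come from DRAM,
    # and each group of four words costs one full DRAM latency.
    sustained_flops = flops_per_fetch * clock_hz / dram_latency_cycles

    # Peak rate for comparison: each multiply-add unit retires 2 FLOPs per cycle.
    peak_flops = multiply_add_units * 2 * clock_hz

    return {
        "bytes_per_word": bytes_per_word,
        "sustained_flops": sustained_flops,
        "peak_flops": peak_flops,
        "flops_per_word": flops_per_word,
        "flops_per_byte": flops_per_word / bytes_per_word,
    }


if __name__ == "__main__":
    print(matvec_two_loop([[1, 2], [3, 4]], [1, 1]))
    for name, value in best_case_estimate().items():
        print(name, value)
```

Under these assumptions the estimate works out to 8 FLOPs per 100 ns, or roughly 80 MFLOPS, against a peak of 4 GFLOPS, with an arithmetic intensity of 2 FLOPs per matrix word fetched (0.5 FLOP per byte). Different assumptions, such as overlapped memory requests or counting the compute cycles, would change these numbers.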
