Question: 3. [Matrix Multiplication]
Matrix multiplication is a key operation supported in hardware by AI/DNN accelerator DSAs such as Google TPU and Tesla Dojo, so it's worth analyzing the matrix multiplication calculation itself. One common way to depict matrix multiplication is with the following triply nested loop:
float a[M][K], b[K][N], c[M][N];
// M, N, and K are constants.
for (int i = 0; i < M; ++i)
    for (int j = 0; j < N; ++j)
        for (int k = 0; k < K; ++k)
            c[i][j] += a[i][k] * b[k][j];
(a) Suppose that M, N, and K are chosen so that the dimensions are relatively prime. Write out the order of accesses to memory locations in each of the three matrices A, B, and C (you might start with two-dimensional indices, then translate those to memory addresses or offsets from the start of each matrix). For which matrices are the elements accessed sequentially? Which are not? Assume row-major (C-language) memory ordering.
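One way to explore part (a) is to record the row-major offset each matrix touches on every inner-loop iteration. The following is a minimal sketch, not a reference solution: the sizes M = 3, N = 4, K = 5 are illustrative assumptions, and the helper names `offset` and `trace_accesses` are hypothetical.

```c
/* Illustrative sizes only: assumed for this sketch, not given in the text. */
enum { M = 3, N = 4, K = 5 };

/* Row-major offset of element [row][col] in a matrix with `cols` columns. */
int offset(int row, int col, int cols) { return row * cols + col; }

/* Record the offset touched in A, B, and C on every inner-loop iteration.
   Each output array must hold M * N * K entries.
   Returns the number of iterations recorded. */
int trace_accesses(int *a_off, int *b_off, int *c_off) {
    int n = 0;
    for (int i = 0; i < M; ++i)
        for (int j = 0; j < N; ++j)
            for (int k = 0; k < K; ++k) {
                a_off[n] = offset(i, k, K);  /* a[i][k]: stride 1 in k      */
                b_off[n] = offset(k, j, N);  /* b[k][j]: stride N in k      */
                c_off[n] = offset(i, j, N);  /* c[i][j]: fixed while k runs */
                ++n;
            }
    return n;
}
```

Under these assumed sizes, the trace shows A read with stride 1 along a row (the same row is re-read for each j), B read with stride N down a column, and C held at one element for K consecutive iterations before moving to the next element.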
(b) Suppose that you transpose matrix B, swapping its indices so that they are B[N][K] instead. So now the innermost statement of the loop looks like:
    c[i][j] += a[i][k] * b[j][k];
Now, for which matrices are the elements accessed sequentially?
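The effect of the transpose on B's access pattern can be checked by comparing the offset stride between consecutive k iterations in the two layouts. This is a small sketch under assumed sizes (M = 3, N = 4, K = 5 are illustrative, and the function names are hypothetical).

```c
/* Illustrative sizes only: assumed for this sketch, not given in the text. */
enum { M = 3, N = 4, K = 5 };

/* Offset of b[k][j] in the original K x N row-major layout. */
int b_offset_original(int k, int j) { return k * N + j; }

/* Offset of b[j][k] in the transposed N x K row-major layout. */
int b_offset_transposed(int j, int k) { return j * K + k; }

/* Distance in elements between the accesses for k = 0 and k = 1. */
int inner_stride_original(int j) {
    return b_offset_original(1, j) - b_offset_original(0, j);
}

int inner_stride_transposed(int j) {
    return b_offset_transposed(j, 1) - b_offset_transposed(j, 0);
}
```

In the original layout the inner loop jumps N elements per step through B; in the transposed layout it moves one element per step, and the first offset for j + 1 follows directly after the last offset for j, so B is walked front to back once per value of i.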
(c) The innermost (k-indexed) loop of our original routine performs a dot-product operation. Suppose that you are given a hardware unit that can perform a fixed-length dot product more efficiently than the raw C code, behaving effectively like this C function:
void hardware_dot(float *accumulator,
                  const float *a_slice, const float *b_slice) {
    float total = 0.;
    for (int k = 0; k < …; ++k) {
        total += a_slice[k] * b_slice[k];
    }
    *accumulator += total;
}
How would you rewrite the routine with the transposed B matrix from part (b) to use this function?
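A possible shape for part (c) is sketched below. It rests on several assumptions that are not in the text: the unit's width is taken to be a hypothetical constant `DOT_WIDTH` (8 here), K is assumed to be a multiple of that width so no padding is needed, and `hardware_dot` is a software stand-in for the hardware unit. The name `matmul_dot` is also hypothetical.

```c
/* Assumed width of the dot-product unit; the text does not state it here. */
enum { DOT_WIDTH = 8 };

/* Illustrative sizes; K is assumed to be a multiple of DOT_WIDTH. */
enum { M = 2, N = 3, K = 16 };

/* Software stand-in for the hardware unit described above. */
void hardware_dot(float *accumulator,
                  const float *a_slice, const float *b_slice) {
    float total = 0.0f;
    for (int k = 0; k < DOT_WIDTH; ++k)
        total += a_slice[k] * b_slice[k];
    *accumulator += total;
}

/* c[i][j] += sum over k of a[i][k] * bt[j][k], where bt is B stored
   transposed (N x K). Each call consumes DOT_WIDTH contiguous elements
   from row i of A and row j of the transposed B. */
void matmul_dot(float c[M][N], float a[M][K], float bt[N][K]) {
    for (int i = 0; i < M; ++i)
        for (int j = 0; j < N; ++j)
            for (int k = 0; k < K; k += DOT_WIDTH)
                hardware_dot(&c[i][j], &a[i][k], &bt[j][k]);
}
```

The transposed layout matters here because both slices passed to the unit must be contiguous in memory; with the original B[K][N] layout, a column of B is strided and could not be handed over as a slice. If K were not a multiple of the width, the final slice would need zero padding or a scalar cleanup loop.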
(d) Suppose that instead, you are given a hardware unit that performs a fixed-length saxpy operation, which behaves like this C function:
void hardware_saxpy(float *accumulator,
                    float a, const float *input) {
    for (int k = 0; k < …; ++k) {
        accumulator[k] += a * input[k];
    }
}
Write another routine that uses the saxpy primitive to deliver equivalent results to the original loop, without the transposed memory ordering for the B matrix.
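One way to approach part (d) is to reorder the loops so that j is innermost: for a fixed i and k, the scalar a[i][k] scales a contiguous run of row k of B and accumulates into a contiguous run of row i of C. The sketch below assumes a hypothetical width constant `SAXPY_WIDTH` (8 here) and assumes N is a multiple of it; `hardware_saxpy` is a software stand-in and `matmul_saxpy` is a hypothetical name.

```c
/* Assumed width of the saxpy unit; the text does not state it here. */
enum { SAXPY_WIDTH = 8 };

/* Illustrative sizes; N is assumed to be a multiple of SAXPY_WIDTH. */
enum { M = 2, N = 16, K = 3 };

/* Software stand-in for the hardware unit described above. */
void hardware_saxpy(float *accumulator, float a, const float *input) {
    for (int k = 0; k < SAXPY_WIDTH; ++k)
        accumulator[k] += a * input[k];
}

/* c[i][j] += sum over k of a[i][k] * b[k][j], with B in its original
   K x N layout. The k loop moves outside the j loop so that each call
   updates SAXPY_WIDTH adjacent elements of row i of C using
   SAXPY_WIDTH adjacent elements of row k of B. */
void matmul_saxpy(float c[M][N], float a[M][K], float b[K][N]) {
    for (int i = 0; i < M; ++i)
        for (int k = 0; k < K; ++k)
            for (int j = 0; j < N; j += SAXPY_WIDTH)
                hardware_saxpy(&c[i][j], a[i][k], &b[k][j]);
}
```

With this ordering both B and C are read along rows, so no transpose is required. Each c[i][j] receives the same set of products as in the original loop, though the order of floating-point additions is the same only because k still advances in increasing order for each element.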
