Question: Consider the following tiled matrix multiplication code:
#define TILE_WIDTH
__global__ void MatrixMulKernel(float* d_M, float* d_N, float* d_P, int Width)
{
    __shared__ float ds_M[TILE_WIDTH][TILE_WIDTH];
    __shared__ float ds_N[TILE_WIDTH][TILE_WIDTH];
    int bx = blockIdx.x; int by = blockIdx.y;
    int tx = threadIdx.x; int ty = threadIdx.y;
    // Identify the row and column of the Pd element to work on
    int Row = by * TILE_WIDTH + ty;
    int Col = bx * TILE_WIDTH + tx;
    float Pvalue = 0;
    // Loop over the Md and Nd tiles required to compute the Pd element
    for (int m = 0; m ...

Question: Consider the matrix multiplication with input matrices M and N. M and N are both square matrices of size width × width, where width is a power of 2, i.e. width = 2^i. How many times is each element of the input matrices requested from global memory when:
There is no tiling.
Tiles of size T × T are used. Suppose T is a power of 2, i.e. T = 2^j.
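
The request counts asked about above can be checked by brute force with a small host-side C++ sketch. It does not run on a GPU; it only replays which global-memory elements each thread would read. The body of the tiled loop is cut off in the listing, so the load pattern used here (each thread copies one element of M and one element of N into shared memory per tile phase) is an assumption based on the usual form of this kernel. The names LoadCounts, countUntiled, countTiled and uniformCount are hypothetical helpers, not part of the original code.

```cpp
#include <cassert>
#include <vector>

// Simulated global-memory read counts for every element of M and N.
struct LoadCounts {
    std::vector<int> m;  // reads of M[row * width + col]
    std::vector<int> n;  // reads of N[row * width + col]
};

// No tiling: the thread computing P[row][col] reads M[row][k] and
// N[k][col] directly from global memory for every k.
LoadCounts countUntiled(int width) {
    LoadCounts c{std::vector<int>(width * width, 0),
                 std::vector<int>(width * width, 0)};
    for (int row = 0; row < width; ++row)
        for (int col = 0; col < width; ++col)
            for (int k = 0; k < width; ++k) {
                ++c.m[row * width + k];
                ++c.n[k * width + col];
            }
    return c;
}

// Tiling (assumed loop body): in phase m, thread (ty, tx) of block
// (by, bx) loads M[Row][m*tile + tx] and N[m*tile + ty][Col] once;
// all later reads of those values hit shared memory and are not counted.
// Assumes tile divides width, as it does when both are powers of 2.
LoadCounts countTiled(int width, int tile) {
    LoadCounts c{std::vector<int>(width * width, 0),
                 std::vector<int>(width * width, 0)};
    const int blocks = width / tile;
    for (int by = 0; by < blocks; ++by)
        for (int bx = 0; bx < blocks; ++bx)
            for (int m = 0; m < blocks; ++m)
                for (int ty = 0; ty < tile; ++ty)
                    for (int tx = 0; tx < tile; ++tx) {
                        int row = by * tile + ty;
                        int col = bx * tile + tx;
                        ++c.m[row * width + (m * tile + tx)];
                        ++c.n[(m * tile + ty) * width + col];
                    }
    return c;
}

// Returns the shared count if every element was read equally often,
// otherwise -1.
int uniformCount(const std::vector<int>& counts) {
    if (counts.empty()) return -1;
    for (int x : counts)
        if (x != counts[0]) return -1;
    return counts[0];
}
```

Under these assumptions, countUntiled(16) reports 16 reads for every element of M and N, and countTiled(16, 4) reports 4. That is consistent with width = 2^i requests per element without tiling and width / T = 2^(i-j) requests per element with T × T tiles.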
