Question: Optimize matrix multiplication (matmul) code to run fast on a single processor core of XSEDE's Bridges cluster

Optimize matrix multiplication (matmul) code to run fast on a single processor core of XSEDE's Bridges cluster. We consider a special case of matmul: C := C + A*B, where A, B, and C are n x n matrices. This can be performed using 2n^3 floating-point operations (n^3 adds and n^3 multiplies), as in the following pseudocode:

for i = 1 to n
  for j = 1 to n
    for k = 1 to n
      C(i,j) = C(i,j) + A(i,k) * B(k,j)
    end
  end
end
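As a baseline for any later optimization, the triple loop can be written directly in C. This is a minimal sketch under two assumptions the question does not state: the matrices are stored row-major in flat arrays of double, and indices run from 0 to n-1 instead of 1 to n. The function name matmul_naive is hypothetical.

```c
#include <stddef.h>

/* C := C + A*B for n x n matrices.
   Assumption: row-major storage, so element (i,j) lives at index i*n + j. */
void matmul_naive(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            /* Accumulate in a local so C(i,j) is loaded and stored once. */
            double cij = C[i * n + j];
            for (size_t k = 0; k < n; ++k) {
                cij += A[i * n + k] * B[k * n + j];
            }
            C[i * n + j] = cij;
        }
    }
}
```

With this layout the inner loop walks B with a stride of n elements, which is the main reason the naive order tends to run slowly once the matrices no longer fit in cache.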

The task is to optimize this triple loop in C.
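One common starting point is cache blocking (tiling) combined with an i-k-j loop order, so that the innermost loop reads B and updates C with unit stride. The sketch below is not a tuned solution for Bridges. It assumes row-major storage and non-overlapping arrays (hence restrict), and the tile size BLOCK = 32 and the names matmul_blocked and min_sz are hypothetical choices that would need tuning on the target core.

```c
#include <stddef.h>

/* Hypothetical tile size; the best value depends on the cache sizes
   of the target machine and should be found by measurement. */
#define BLOCK 32

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* C := C + A*B for n x n row-major matrices, processed tile by tile.
   Assumption: A, B, and C do not overlap in memory. */
void matmul_blocked(size_t n, const double *restrict A,
                    const double *restrict B, double *restrict C)
{
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_sz(ii + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = min_sz(kk + BLOCK, n);
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = min_sz(jj + BLOCK, n);
                /* Multiply one tile of A by one tile of B. */
                for (size_t i = ii; i < i_end; ++i) {
                    for (size_t k = kk; k < k_end; ++k) {
                        double aik = A[i * n + k]; /* reused across j */
                        for (size_t j = jj; j < j_end; ++j) {
                            C[i * n + j] += aik * B[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

The min_sz bounds handle matrix sizes that are not a multiple of BLOCK. The operation count is still 2n^3; only the memory access pattern changes. Further gains usually come from compiler optimization flags, register blocking, and SIMD, and each step should be checked against the naive triple loop for correctness and timed on the target core.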
