Question: Write a kernel program for matrix multiplication (C=A*B). Assume that the matrices are square. Each thread in the kernel should calculate two elements of matrix C.

Write a kernel program for matrix multiplication (C=A*B). Assume that the matrices are square. Each thread in the kernel should calculate two elements of matrix C. For example, if the dimension of the matrices is 10×10, then 50 threads are launched. Thread zero should calculate C00 and C01, thread one should calculate C02 and C03, and so on. Assume that only one work-group is launched and that the threads within the work-group are organized in one dimension.

__kernel void matrix_mult(const int Mdim, __global float* A, __global float* B, __global float* C) { }
Step by Step Solution
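One possible way to fill in the kernel body is sketched below; it is not a verified reference solution. It assumes the matrices are stored in row-major order, so thread t handles the elements at linear indices 2t and 2t+1 of C (thread 0 gets C00 and C01, thread 1 gets C02 and C03). The bounds check for an odd number of elements is an added assumption, since the question only gives the 10×10 case. The `emu_tid` variable, the `get_global_id` shim, and `run_emulated` are hypothetical host-side helpers that let the same file compile as plain C for testing; an OpenCL compiler skips them.

```c
/* Host-side shims so this file also compiles as plain C for testing.
   They are hypothetical helpers, not part of OpenCL; an OpenCL compiler
   defines __OPENCL_VERSION__ and skips this section. */
#ifndef __OPENCL_VERSION__
#include <assert.h>
#include <stddef.h>
#define __kernel
#define __global
static size_t emu_tid;  /* id of the work-item currently being emulated */
static size_t get_global_id(unsigned int dim) { (void)dim; return emu_tid; }
#endif

/* Each work-item computes two consecutive elements of C.
   Assumes row-major storage: element (row, col) is at row * Mdim + col. */
__kernel void matrix_mult(const int Mdim,
                          __global float* A,
                          __global float* B,
                          __global float* C)
{
    /* One 1-D work-group, so the global id is also the local id. */
    int tid = (int)get_global_id(0);
    int total = Mdim * Mdim;

    for (int e = 0; e < 2; ++e) {
        int idx = 2 * tid + e;          /* linear index into C */
        if (idx < total) {              /* guards the last thread when total is odd */
            int row = idx / Mdim;
            int col = idx % Mdim;
            float sum = 0.0f;
            for (int k = 0; k < Mdim; ++k)
                sum += A[row * Mdim + k] * B[k * Mdim + col];
            C[idx] = sum;
        }
    }
}

#ifndef __OPENCL_VERSION__
/* Serial stand-in for launching one 1-D work-group of ceil(Mdim*Mdim / 2)
   work-items: it runs the kernel body once per emulated thread id. */
static void run_emulated(int Mdim, float *A, float *B, float *C)
{
    assert(Mdim > 0);
    size_t threads = ((size_t)Mdim * (size_t)Mdim + 1) / 2;
    for (emu_tid = 0; emu_tid < threads; ++emu_tid)
        matrix_mult(Mdim, A, B, C);
}
#endif
```

On a real device the kernel would be launched with the global and local work sizes both set to the number of threads (50 for the 10×10 example), for instance through `clEnqueueNDRangeKernel`. That only works while the count fits within the device's maximum work-group size.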
