Question: You are given a skeleton sequential program for matrix multiplication below:
#include <stdio.h>   // printf
#include <stdlib.h>  // malloc
#include <omp.h>     // omp_set_num_threads, omp_get_wtime
#define M 500
#define N 500
int main(int argc, char *argv[]) {
    // set number of threads here
    omp_set_num_threads(8);
    int i, j, k;
    double sum;
    double **A, **B, **C;

    // allocate each matrix as M row pointers, each row holding N doubles
    A = malloc(M*sizeof(double *));
    B = malloc(M*sizeof(double *));
    C = malloc(M*sizeof(double *));
    for (i = 0; i < M; i++) {
        A[i] = malloc(N*sizeof(double));
        B[i] = malloc(N*sizeof(double));
        C[i] = malloc(N*sizeof(double));
    }
    double start, end;

    // initialize the matrices
    for (i = 0; i < M; i++) {
        for (j = 0; j < N; j++) {
            A[i][j] = j*1;
            B[i][j] = i*j+2;
            C[i][j] = j-i*2;
        }
    }

    // timed region: C = A * B (the k < M bound is valid because M == N)
    start = omp_get_wtime();
    for (i = 0; i < M; i++) {
        for (j = 0; j < N; j++) {
            sum = 0;
            for (k = 0; k < M; k++) {
                sum += A[i][k]*B[k][j];
            }
            C[i][j] = sum;
        }
    }
    end = omp_get_wtime();
    printf("Time of computation: %f\n", end-start);
    return 0;
}
You are to parallelize this algorithm in three different ways:
1. Add the necessary pragma to parallelize the outer for loop
2. Remove the pragma for the outer for loop and create a pragma for the middle for loop
3. Add the necessary pragmas to parallelize both the outer and middle for loops
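The three variants asked for above can be sketched as follows. This is one possible reading of the task, not an official solution. The function names (alloc_matrix, matmul_seq, matmul_outer, matmul_middle, matmul_both, same_matrix) and the size parameter n are hypothetical and do not appear in the skeleton. They are used so that each variant can be checked against the sequential loop on a small input. The key point in all three is that j, k and sum are declared outside the loops, so they must be made private wherever a thread writes them. The asserts on allocation are only there to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an n x n matrix as an array of row pointers, like the skeleton. */
double **alloc_matrix(int n) {
    double **m = malloc(n * sizeof(double *));
    assert(m != NULL);
    for (int i = 0; i < n; i++) {
        m[i] = malloc(n * sizeof(double));
        assert(m[i] != NULL);
    }
    return m;
}

/* Sequential reference: the same loop nest as the skeleton. */
void matmul_seq(double **A, double **B, double **C, int n) {
    int i, j, k;
    double sum;
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            sum = 0;
            for (k = 0; k < n; k++) sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    }
}

/* Variant 1: pragma on the outer loop. Rows of C are split across threads. */
void matmul_outer(double **A, double **B, double **C, int n) {
    int i, j, k;
    double sum;
    #pragma omp parallel for private(j, k, sum)
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            sum = 0;
            for (k = 0; k < n; k++) sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    }
}

/* Variant 2: pragma on the middle loop only. A thread team is started
   for every row, and the columns of that row are split across threads. */
void matmul_middle(double **A, double **B, double **C, int n) {
    int i, j, k;
    double sum;
    for (i = 0; i < n; i++) {
        #pragma omp parallel for private(k, sum)
        for (j = 0; j < n; j++) {
            sum = 0;
            for (k = 0; k < n; k++) sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    }
}

/* Variant 3: pragmas on both loops (nested parallel regions). */
void matmul_both(double **A, double **B, double **C, int n) {
    int i, j, k;
    double sum;
    #pragma omp parallel for private(j, k, sum)
    for (i = 0; i < n; i++) {
        #pragma omp parallel for private(k, sum)
        for (j = 0; j < n; j++) {
            sum = 0;
            for (k = 0; k < n; k++) sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    }
}

/* Return 1 if X and Y hold identical values, 0 otherwise. */
int same_matrix(double **X, double **Y, int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (X[i][j] != Y[i][j]) return 0;
    return 1;
}
```

To try it, call each variant on the same A and B and compare its output with matmul_seq using same_matrix. Build with an OpenMP flag such as gcc -fopenmp, since without it the pragmas are ignored and every variant runs sequentially. In variant 3, many OpenMP runtimes keep nested parallelism disabled by default, so the inner pragma may run with a single thread unless nesting is enabled, for example with omp_set_max_active_levels. A single pragma with collapse(2) on the outer loop is a common alternative way to spread both loops across threads.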
