Question: In this lab you will use the Performance API (PAPI) to study the performance of the matrix multiplication of two matrices B and C of type double.

A = B * C

A, B and C are all M x M matrices, where M = 1024.
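As a sanity check on the counter readings, the product and its operation count can be written out. This is a rough estimate that assumes one multiply and one add per inner-loop iteration, as in the Lab 3 loop. Hardware that fuses the two operations or counts vector instructions differently may report another figure for PAPI_FP_OPS.

```latex
A_{ij} = \sum_{k=1}^{M} B_{ik}\, C_{kj}, \qquad 1 \le i, j \le M

\text{flops} \approx M^2 \cdot 2M = 2M^3 = 2 \cdot 1024^3 = 2{,}147{,}483{,}648
```

Each matrix occupies 1024 x 1024 x 8 bytes = 8 MiB, far more than a typical L1 data cache holds, so the order in which the elements are visited can be expected to affect PAPI_L1_DCM.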

You may use the program you wrote in Lab 3 and add calls to PAPI to collect data such as floating point operations (PAPI_FP_OPS) and data cache misses (PAPI_L1_DCM).

The PAPI installation is at /opt/sw/papi.

Look at the attached makefile for the changes to the FLAGS and LD_FLAGS arguments.

The program will NOT use OMP; in other words, you will not parallelize the matrix multiplication. You will just collect statistics and the contents of some counters, ideally including data cache misses. Next, modify the algorithm so that it is not optimizing the cache behavior, and run the program again. Collect the statistics one more time and compare the statistics from the two runs. Write a report explaining the differences in the statistics, if any, and the reasons for the differences.
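One possible way to obtain two runs with different cache behavior is to change only the loop order. The sketch below is an assumption about how this could be done, not the lab's prescribed method. It stores the matrices as flat row-major arrays, and the function names matmul_ikj and matmul_jki are hypothetical. Both functions compute the same product A = B * C, adding the terms for each element in the same order, so any difference in PAPI_L1_DCM comes from the memory access pattern alone.

```c
#include <stddef.h>

/* i-k-j order: the inner loop walks one row of C and one row of A,
   so consecutive iterations touch adjacent doubles (unit stride). */
void matmul_ikj(size_t n, const double *b, const double *c, double *a)
{
    size_t i, j, k;

    for (i = 0; i < n * n; i++)
        a[i] = 0.0;

    for (i = 0; i < n; i++)
        for (k = 0; k < n; k++) {
            const double bik = b[i * n + k];
            for (j = 0; j < n; j++)
                a[i * n + j] += bik * c[k * n + j];
        }
}

/* j-k-i order: the inner loop walks one column of A and one column of B,
   so consecutive iterations are n doubles apart in memory. */
void matmul_jki(size_t n, const double *b, const double *c, double *a)
{
    size_t i, j, k;

    for (i = 0; i < n * n; i++)
        a[i] = 0.0;

    for (j = 0; j < n; j++)
        for (k = 0; k < n; k++) {
            const double ckj = c[k * n + j];
            for (i = 0; i < n; i++)
                a[i * n + j] += b[i * n + k] * ckj;
        }
}
```

With M = 1024 the stride in matmul_jki is 8 KiB, so each inner-loop access is likely to land on a different cache line. Measuring each variant separately gives the two sets of statistics to compare in the report.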

LAB 3:

#include <stdio.h>
#include <stdlib.h>
#include "omp.h"

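/* Allocate a size x size matrix as one contiguous block of doubles
   plus a table of row pointers into that block. */
double** allocate_matrix(int size) {
    double * vals = (double *)malloc(size * size * sizeof(double));
    double ** ptrs = (double **)malloc(size * sizeof(double*));
    int i;
    for(i = 0; i < size; i++) {
        ptrs[i] = &vals[i * size];
    }
    return ptrs;
}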

/* Set every element of the matrix to 2.0. */
void assignMat(double **matrix, int size) {
    int i, j;
    for(i = 0; i < size; i++)
        for(j = 0; j < size; j++)
            matrix[i][j] = 2.0;
}

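/* Release a matrix created by allocate_matrix. All elements live in one
   contiguous block whose address is matrix[0], so only that block and the
   row-pointer table are freed; size is kept for interface compatibility. */
void freeMat(double **matrix, int size) {
    (void)size;
    free(matrix[0]);
    free(matrix);
}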

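/* Print the matrix one row per line, elements separated by spaces. */
void printMat(double **matrix, int size) {
    int i, j;
    for(i = 0; i < size; i++) {
        for(j = 0; j < size - 1; j++) {
            printf("%lf ", matrix[i][j]);
        }
        printf("%lf", matrix[i][j]); /* last element of the row */
        putchar('\n');
    }
}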

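/* Return a newly allocated matrix holding matrix1 * matrix2. */
double** matMult(double **matrix1, double **matrix2, int size) {
    int i, j, k;
    double **matrix3 = allocate_matrix(size);
    double sum = 0;
    /* Rows handed to each thread at a time; main requires size to be
       a multiple of the number of threads. */
    int chunksize = size / omp_get_max_threads();
    if(chunksize < 1)
        chunksize = 1;
    #pragma omp parallel for shared(matrix1, matrix2, matrix3, chunksize) private(i, j, k, sum) schedule(static, chunksize)
    for(i = 0; i < size; i++) {
        for(j = 0; j < size; j++) {
            sum = 0;
            for(k = 0; k < size; k++) {
                sum += matrix1[i][k] * matrix2[k][j];
            }
            matrix3[i][j] = sum;
        }
    }
    return matrix3;
}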

int main(int argc, char * argv[]) {
    double** matrix1;
    double** matrix2;
    double** matrix3;
    int size; // Matrix dimension. Get from command line
    int numthreads; // Number of threads. Get from command line
    // Assumed value: the original threshold is missing from the source.
    const int print_limit = 10;

    if(argc != 3) {
        fprintf(stderr, "Usage: %s size numthreads\n", argv[0]);
        return -1;
    }
    size = atoi(argv[1]);
    numthreads = atoi(argv[2]);
    if(size <= 0 || numthreads <= 0) {
        fprintf(stderr, "matrix size and number of threads must be positive!\n");
        return -1;
    }
    if(size % numthreads != 0) {
        fprintf(stderr, "matrix size %d must be a multiple of number of threads %d!\n", size, numthreads);
        return -1;
    }
    omp_set_num_threads(numthreads);
    matrix1 = allocate_matrix(size);
    matrix2 = allocate_matrix(size);
    assignMat(matrix1, size);
    assignMat(matrix2, size);
    if(size <= print_limit) {
        printMat(matrix1, size);
        printMat(matrix2, size);
    }
    // matMult allocates and returns the result matrix.
    matrix3 = matMult(matrix1, matrix2, size);
    printMat(matrix3, size);
    freeMat(matrix1, size);
    freeMat(matrix2, size);
    freeMat(matrix3, size);
    return 0;
}
