Question: 4.3 Part B: Optimizing Matrix Transpose
In Part B you will write a transpose function in trans.c that causes as few cache misses as possible.
Let $A$ denote a matrix, and $A_{ij}$ denote the component in the $i$th row and $j$th column. The transpose of $A$,
denoted $A^T$, is a matrix such that $A_{ij} = A^T_{ji}$.
To help you get started, we have given you an example transpose function in trans.c that computes the
transpose of the N × M matrix A and stores the result in the M × N matrix B:
    char trans_desc[] = "Simple rowwise scan transpose";
    void trans(int M, int N, int A[N][M], int B[M][N])
The example transpose function is correct, but it is inefficient because the access pattern results in relatively
many cache misses.
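The body of the example function is not reproduced in this text. As a minimal sketch, assuming the signature shown above, a row-wise scan transpose could look like the following; the name `trans_rowwise` is hypothetical and used only for illustration.

```c
/* Sketch of a row-wise scan transpose (hypothetical name trans_rowwise).
 * A is N x M, B is M x N, passed as C99 variable-length array parameters. */
void trans_rowwise(int M, int N, int A[N][M], int B[M][N])
{
    int i, j, tmp;

    for (i = 0; i < N; i++) {
        for (j = 0; j < M; j++) {
            tmp = A[i][j];   /* sequential read along row i of A */
            B[j][i] = tmp;   /* strided write down column i of B */
        }
    }
}
```

Reads of A walk through memory sequentially, but consecutive writes to B are N ints apart, so each write tends to touch a different cache block. That strided access is the likely source of the extra misses.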
Your job in Part B is to write a similar function, called transpose_submit, that minimizes the number
of cache misses across different-sized matrices:
    char transpose_submit_desc[] = "Transpose submission";
    void transpose_submit(int M, int N, int A[N][M], int B[M][N]);
Do not change the description string ("Transpose submission") for your transpose_submit
function. The autograder searches for this string to determine which transpose function to evaluate for
credit.
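One common way to reduce misses in a transpose is blocking (tiling): the matrix is processed in small square tiles so that the cache blocks touched in A and B are reused before they are evicted. The sketch below is one possible starting point, not the expected solution. The tile size `BLOCK = 8` and the name `transpose_blocked` are assumptions for illustration; a suitable tile size depends on the cache parameters and the matrix dimensions.

```c
#define BLOCK 8   /* assumed tile size; tune for the actual cache and matrix */

/* Blocked (tiled) transpose sketch. A is N x M, B is M x N.
 * Uses five int locals and no arrays; handles sizes that are not
 * multiples of BLOCK by clamping the inner loop bounds. */
void transpose_blocked(int M, int N, int A[N][M], int B[M][N])
{
    int ii, jj, i, j, tmp;

    for (ii = 0; ii < N; ii += BLOCK) {          /* tile row origin in A */
        for (jj = 0; jj < M; jj += BLOCK) {      /* tile column origin in A */
            for (i = ii; i < ii + BLOCK && i < N; i++) {
                for (j = jj; j < jj + BLOCK && j < M; j++) {
                    tmp = A[i][j];
                    B[j][i] = tmp;
                }
            }
        }
    }
}
```

This version only changes the traversal order. It does not address conflict misses between A and B, which may need separate handling (for example, treating diagonal tiles specially).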
Programming Rules for Part B
Include your name and email in the header comment for trans.c
Your code in trans.c must compile without warnings to receive credit.
You are allowed to define at most local variables of type int per transpose function
You are not allowed to sidestep the previous rule by using any variables of type long or by using
any bit tricks to store more than one value to a single variable.
The reason for this restriction is that our testing code is not able to count references to the stack. We want you to limit your
references to the stack and focus on the access patterns of the source and destination arrays.
Your transpose function may not use recursion.
If you choose to use helper functions, you may not have more than local variables on the stack
at a time between your helper functions and your top level transpose function. For example, if your
transpose declares variables, and then you call a function which uses variables, which calls another
function which uses you will have variables on the stack, and you will be in violation of the rule.
Your transpose function may not modify array A. You may, however, do whatever you want with the
contents of array B.
You are NOT allowed to define any arrays in your code or to use any variant of malloc.
Evaluation for Part B
For Part B, we will evaluate the correctness and performance of your transpose_submit function on
three different-sized output matrices:
times M N
times M N
times M N
We have provided you with an autograding program, called test-trans.c, that tests the correctness and
performance of each of the transpose functions that you have registered with the autograder.
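The correctness half of that evaluation amounts to checking that every element of B matches the mirrored element of A. The following is a minimal sketch of such a check for local testing. It is not the autograder's actual code, and the name `is_transpose` is an assumption.

```c
/* Returns 1 if B (M x N) is the transpose of A (N x M), 0 otherwise.
 * Hypothetical helper for local testing only. */
int is_transpose(int M, int N, int A[N][M], int B[M][N])
{
    int i, j;

    for (i = 0; i < N; i++) {
        for (j = 0; j < M; j++) {
            if (A[i][j] != B[j][i])
                return 0;   /* first mismatch: not a transpose */
        }
    }
    return 1;
}
```

A helper like this belongs in a separate test file rather than in trans.c, so it does not count against the restrictions on the transpose functions themselves.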
You can register up to versions of the transpose function in your trans.c file. Each transpose version
has the following form:
    /* Header comment */
    char trans_simple_desc[] = "A simple transpose";
    void trans_simple(int M, int N, int A[N][M], int B[M][N])
    {
        /* your transpose code here */
    }
Register a particular transpose function with the autograder by making a call of the form:
    registerTransFunction(trans_simple, trans_simple_desc);
in the registerFunctions routine in trans.c. At runtime, the autograder will evaluate each
registered transpose function and print the results. Of course, one of the registered functions must be the
transpose_submit function that you are submitting for credit:
    registerTransFunction(transpose_submit, transpose_submit_desc);
See the default trans.c function for an example of how this works.
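To make the registration pattern concrete without the lab's headers, the sketch below mocks it in a self-contained way. Everything here is an assumption for illustration: `mock_register` stands in for the lab's registerTransFunction, and `trans_fn`, `registry`, `MAX_TRANS_FUNCS`, and `register_functions_sketch` are hypothetical names. The real routine's internals may differ.

```c
#define MAX_TRANS_FUNCS 8   /* hypothetical capacity for this mock */

/* Pointer to a transpose function with the signature used in trans.c. */
typedef void (*trans_fn)(int M, int N, int A[N][M], int B[M][N]);

struct trans_entry {
    trans_fn func;
    const char *desc;
};

static struct trans_entry registry[MAX_TRANS_FUNCS];
static int registry_count = 0;

/* Mock stand-in for registerTransFunction: 0 on success, -1 if full. */
static int mock_register(trans_fn func, const char *desc)
{
    if (registry_count >= MAX_TRANS_FUNCS)
        return -1;
    registry[registry_count].func = func;
    registry[registry_count].desc = desc;
    registry_count++;
    return 0;
}

char transpose_submit_desc[] = "Transpose submission";
void transpose_submit(int M, int N, int A[N][M], int B[M][N])
{
    int i, j;
    for (i = 0; i < N; i++)
        for (j = 0; j < M; j++)
            B[j][i] = A[i][j];   /* placeholder body */
}

char trans_simple_desc[] = "A simple transpose";
void trans_simple(int M, int N, int A[N][M], int B[M][N])
{
    int i, j;
    for (j = 0; j < M; j++)
        for (i = 0; i < N; i++)
            B[j][i] = A[i][j];   /* column-wise variant */
}

/* Mirrors the shape of a registerFunctions routine. */
void register_functions_sketch(void)
{
    mock_register(transpose_submit, transpose_submit_desc);
    mock_register(trans_simple, trans_simple_desc);
}
```

Pairing each function pointer with its description string is what lets a harness look up the submission by the "Transpose submission" text.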
The autograder takes the matrix size as input. It uses valgrind to generate a trace of each registered
transpose function. It then evaluates each trace by running the reference simulator on a cache with parameters
(s, E, b).
For example, to test your registered transpose functions on a times matrix, rebuild testtrans, and
then run it with the appropriate values for M and N:
linux make
linuxtesttrans M N
Step : Evaluating registered transpose funcs for correctness:
func Transpose submission: correctness:
func Simple rowwise scan transpose: correctness:
func columnwise scan transpose: correctness:
func using a zigzag access pattern: correctness:
Step : Generating memory traces for registered transpose funcs.
Step : Evaluating performance of registered transpose funcs s E b
func Transpose submission: hits: misses: evictions:
func Simple rowwise scan transpose: hits: misses: evictions:
func columnwise scan transpose: hits: misses: evictions:
func using a zigzag access pattern: hits: misses: evictions:
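When reasoning about the hit and miss counts reported above, it helps to see how an address maps to a cache set. The sketch below assumes the conventional meaning of the parameters: 2^s sets, E lines per set, and 2^b bytes per block. The names `cache_addr` and `split_address` are hypothetical, and any concrete values of s and b passed in are for illustration only.

```c
#include <stdint.h>

/* Fields of an address as seen by a cache with 2^s sets and 2^b-byte blocks. */
typedef struct {
    uint64_t tag;      /* identifies the block within its set */
    uint64_t set;      /* set index */
    uint64_t offset;   /* byte offset within the block */
} cache_addr;

/* Split addr into tag / set index / block offset.
 * Assumes 0 <= s, 0 <= b, and s + b < 64. */
cache_addr split_address(uint64_t addr, int s, int b)
{
    cache_addr r;

    r.offset = addr & ((1ULL << b) - 1);
    r.set    = (addr >> b) & ((1ULL << s) - 1);
    r.tag    = addr >> (s + b);
    return r;
}
```

Two addresses that differ by a multiple of 2^(s+b) bytes share a set index but have different tags. With E = 1 they would evict each other, which is the kind of conflict a transpose access pattern has to avoid.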
