Question: The transpose of a matrix interchanges its rows and columns

The transpose of a matrix interchanges its rows and columns; this is illustrated below:

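\[
\left[\begin{matrix}
A_{11} & A_{12} & A_{13} & A_{14} \\
A_{21} & A_{22} & A_{23} & A_{24} \\
A_{31} & A_{32} & A_{33} & A_{34} \\
A_{41} & A_{42} & A_{43} & A_{44}
\end{matrix}\right]
\Longrightarrow
\left[\begin{matrix}
A_{11} & A_{21} & A_{31} & A_{41} \\
A_{12} & A_{22} & A_{32} & A_{42} \\
A_{13} & A_{23} & A_{33} & A_{43} \\
A_{14} & A_{24} & A_{34} & A_{44}
\end{matrix}\right]
\]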

Here is a simple C loop to show the transpose:

    for (i = 0; i < 3; i++) {
        for (j = 0; j < 3; j++) {
            output[j][i] = input[i][j];
        }
    }

Assume that both the input and output matrices are stored in row-major order (row-major order means that the row index changes fastest).

Assume that you are executing a 256 × 256 double-precision transpose on a processor with a 16 KB fully associative (don't worry about cache conflicts; the cache has just 1 set), least recently used (LRU) replacement L1 data cache with 64-byte blocks.
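The sizes stated above imply a few derived quantities that are useful for the rest of the problem. The sketch below works them out. The constant and function names are illustrative and not part of the problem statement, and an 8-byte `double` is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Parameters taken from the problem statement. */
enum {
    MATRIX_DIM  = 256,        /* 256 x 256 matrix                     */
    ELEM_BYTES  = 8,          /* assumption: a double occupies 8 bytes */
    CACHE_BYTES = 16 * 1024,  /* 16 KB L1 data cache                  */
    BLOCK_BYTES = 64          /* 64-byte cache blocks                 */
};

/* Hypothetical helper: matrix elements held by one cache block (64 / 8 = 8). */
size_t elems_per_block(void)
{
    assert(BLOCK_BYTES % ELEM_BYTES == 0);
    return BLOCK_BYTES / ELEM_BYTES;
}

/* Hypothetical helper: cache blocks in the L1 cache (16384 / 64 = 256). */
size_t blocks_in_cache(void)
{
    assert(CACHE_BYTES % BLOCK_BYTES == 0);
    return CACHE_BYTES / BLOCK_BYTES;
}

/* Hypothetical helper: bytes occupied by one matrix (256 * 256 * 8 = 524288). */
size_t matrix_bytes(void)
{
    return (size_t)MATRIX_DIM * MATRIX_DIM * ELEM_BYTES;
}
```

Under these assumptions each matrix occupies 512 KB, 32 times the capacity of the 16 KB cache, so neither matrix fits in L1.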

Assume that the L1 cache misses or prefetches require 16 cycles and always hit in the L2 cache, and that the L2 cache can process a request every two processor cycles.

Assume that each iteration of the inner loop above requires four cycles if the data are present in the L1 cache.

Assume that the cache has a write-allocate fetch-on-write policy for write misses.

Unrealistically, assume that writing back dirty cache blocks requires 0 cycles.

For the simple implementation given above, this execution order would be nonideal for the input matrix; however, applying a loop interchange optimization would create a nonideal order for the output matrix. Because loop interchange is not sufficient to improve its performance, it must be blocked instead.
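A minimal sketch of what a blocked (tiled) transpose could look like is given here. It is not taken from the problem statement: the function name `transpose_blocked` and the tile edge `B` are assumptions, with `B` set to 8 so that one tile row of 8-byte doubles matches one 64-byte cache block.

```c
#include <assert.h>
#include <stddef.h>

#define N 256  /* matrix dimension from the problem statement */
#define B 8    /* assumed tile edge: 8 doubles = one 64-byte cache block */

/* Transpose `input` into `output` one B x B tile at a time, so that the
 * cache blocks touched by a tile are reused before they can be evicted. */
void transpose_blocked(double output[N][N], double input[N][N])
{
    assert(N % B == 0);  /* this sketch handles only whole tiles */

    for (size_t ii = 0; ii < N; ii += B) {          /* tile row origin    */
        for (size_t jj = 0; jj < N; jj += B) {      /* tile column origin */
            for (size_t i = ii; i < ii + B; i++) {
                for (size_t j = jj; j < jj + B; j++) {
                    output[j][i] = input[i][j];
                }
            }
        }
    }
}
```

The two outer loops walk over tiles and the two inner loops are the original transpose restricted to one tile. Whichever matrix is traversed in the strided direction, all accesses within a tile fall into the same small set of cache blocks.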

What should be the minimum size of the cache to take advantage of blocked execution?
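One way to reason about this, offered as a sketch under stated assumptions rather than as the reference solution: with 8-byte doubles, a 64-byte cache block holds 8 elements, so the smallest tile that uses every fetched element is 8 × 8. Assuming tiles are aligned to cache-block boundaries, such a tile spans 8 cache blocks in the input matrix and 8 in the output matrix, and all 16 must stay resident while the tile is processed. The helper name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest cache (in bytes) that holds one input tile
 * and one output tile at once, assuming aligned square tiles whose edge
 * equals the number of elements in a cache block. */
size_t min_cache_bytes_for_blocking(size_t block_bytes, size_t elem_bytes)
{
    assert(elem_bytes > 0 && block_bytes % elem_bytes == 0);

    size_t tile_edge = block_bytes / elem_bytes;  /* elements per cache block */
    size_t blocks_per_tile = tile_edge;           /* one cache block per tile row */

    /* One tile from the input matrix plus one tile from the output matrix. */
    return 2 * blocks_per_tile * block_bytes;
}
```

For the parameters in this problem, `min_cache_bytes_for_blocking(64, 8)` evaluates to 2 × 8 × 64 = 1024 bytes, that is, 1 KB, well below the 16 KB L1 cache described above.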

