Question: [ 1 0 / 1 5 ] < 4 . 4 > Assume a GPU architecture that contains 1 0 SIMD processors. Each SIMD instruction

[10 / 15] < 4.4 >

Assume a GPU architecture that contains

10

SIMD processors. Each SIMD instruction has a width of

32

and each SIMD processor contains

8

lanes for single

-

precision arithmetic and load

/

store instructions, meaning that each nondiverged SIMD instruction can produce

32

results every

4

cycles. Assume a kernel that has divergent branches that causes, on average,

80 %

of threads to be active. Assume that

70 %

of all SIMD instructions executed are single

-

precision arithmetic and

20 %

are load

/

store

.

Because not all memory latencies are covered, assume an average SIMD instruction issue rate of

0.85 .

Assume that the GPU has a clock speed of

1.5

GHz

.

. [10] < 4.4 >

Compute the throughput, in GFLOP

/

,

for this kernel on this GPU.

. [15] < 4.4 >

Assume that you have the following choices:

(1)

Increasing the number of single

-

precision lanes to

16

(2)

Increasing the number of SIMD processors to

15 (

assume this change doesn

t affect any other performance metrics and that the code scales to the additional processors

)

(3)

Adding a cache that will effectively reduce memory latency by

40 %,

which will increase instruction issue rate to

0.95

What is speedup in throughput for each of these improvements?

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!

Problem #7: Assume a GPU architecture that contains 10 SIMD processors. Each SIMD instruction has a width of 32 and each SIMD processor contains 8 lanes for single-precision arithmetic and load/store...

Assume a GPU architecture that contains 10 SIMD processors. Each SIMD instruction has a width of 32 and each SIMD processor contains 8 lanes for single-precision arithmetic and load/store...

Assume a GPU architecture that contains 1 0 SIMD processors. Each SIMD instruction has a width of 3 2 and each SIMD processor contains 8 lanes for single - precision arithmetic and load / store...

You will write a program that will take an input file that describes the land, the starting point, potential lodges we can relax in, and some other characteristics for our skier. For each lodge...

Which solution is adopted by Ethernet and what measures are taken to ensure stability in circumstances of high load? [4 marks] 1 [TURN OVER CST.93.5.2 4 Graphics I A certain image contains a number Q...

can someone solve this Modern workstations typically have memory systems that incorporate two or three levels of caching. Explain why they are designed like this. [4 marks] In order to investigate...

MUST BE CORRECT ANSWERS A small software company has the following simplified cashflow, funded by shareholders' equity of 20,000 and a bank overdraft of 5000: Invoiced money received 2 months after...

Question: Check you are in charge of the design of both hardware and software for a new (but fairly conventional) workstation which will have its peripherals (for example a disc drive and a printer)...

Data gistrun sending inside a pack fills in as conventional allowing subordinate rules to execute on progressive clock cycles. Correspondence between gatherings conventionally causes a surprising...

THE REQUIREMENTS DOCUMENTS ARE AS FOLLOWS: PLEASE PROVIDE OP CODE TABLE & MACHINE CYCLE DIAGRAM IN EXCEL OR PDF Either Word, Visio, PDF, or any acceptable format that clearly shows a block diagram of...

If 2 million worth of products are sold in year 1 the products carry a 2-year warranty and management estimates warranty costs will be 3 of sales in year 1 and 4 of sales in year 2 then warranty...

In Problem 12.10, you were asked to develop the equation of a regression model to predict the number of business bankruptcies by the number of firm births. For this regression model, solve for the...

The current tax rate for social security is 6 . 2 % for the employer and 6 . 2 % for the employee, or 1 2 . 4 % total. The current rate for Medicare is 1 . 4 5 % for the employer and 1 . 4 5 % for...

Mazzi Corp. shares are currently trading at RM14.00 each. A three-month put option on this share is priced RM0.50 at a strike price of RM15.00. The risk free interest is 10% per annum for all...