

Question: Consider the following CUDA kernel and the corresponding host function that calls it:

01  __global__ void foo_kernel(int* a, int* b) {
02      unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
03      if (threadIdx.x ... 40 || threadIdx.x ... 104) {
04          b[i] = a[i] + 1;
05      }
06      if (...) {
07          a[i] = b[i] * 2;
08      }
09      for (unsigned int ...) {
10          b[i] += j;
11      }
12  }
13  void foo(int* a_d, int* b_d) {
14      unsigned int N = 1024;
15      foo_kernel<<<...>>>(a_d, b_d);
16  }

a. What is the number of warps per block?
b. What is the number of warps in the grid?
c. For the statement on line 04:
   i. How many warps in the grid are active?
   ii. How many warps in the grid are divergent?
   iii. What is the SIMD efficiency (in %) of warp 0 of block 0?
   iv. What is the SIMD efficiency (in %) of warp 1 of block 0?
   v. What is the SIMD efficiency (in %) of warp 3 of block 0?
d. For the statement on line 07:
   i. How many warps in the grid are active?
   ii. How many warps in the grid are divergent?
   iii. What is the SIMD efficiency (in %) of warp 0 of block 0?
e. For the loop on line 09:
   i. How many iterations have no divergence?
   ii. How many iterations have divergence?
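The comparison operators on line 03, the condition on line 06, the loop bound on line 09, and the launch configuration are not legible in the question as posted, so the parts cannot be answered from the text alone. The host-only sketch below shows one way to count active and divergent warps once those pieces are known. Everything marked ASSUMED is a guess at the missing code (128 threads per block, `threadIdx.x < 40 || threadIdx.x >= 104`, `i % 2 == 0`, and a loop that runs `5 - (i % 3)` times), not something the question confirms; the helper names (`lanesTaken`, `gridStats`, `simdEfficiency`, `loopStats`) are hypothetical.

```cpp
#include <cassert>

constexpr unsigned kWarpSize = 32;   // warp size on current NVIDIA GPUs
constexpr unsigned kN        = 1024; // from the question (line 14)
constexpr unsigned kBlockDim = 128;  // ASSUMED: block size is not legible in the question
constexpr unsigned kGridDim  = (kN + kBlockDim - 1) / kBlockDim; // ASSUMED ceil-div launch
constexpr unsigned kWarpsPerBlock = kBlockDim / kWarpSize;

struct WarpStats {
    unsigned active = 0;     // warps in which at least one thread executes the statement
    unsigned divergent = 0;  // warps in which some, but not all, threads execute it
};

// Counts the threads of one warp for which pred(threadIdx.x, i) is true.
template <typename Pred>
unsigned lanesTaken(unsigned block, unsigned warp, Pred pred) {
    assert(block < kGridDim && warp < kWarpsPerBlock);
    unsigned count = 0;
    for (unsigned lane = 0; lane < kWarpSize; ++lane) {
        unsigned tid = warp * kWarpSize + lane;  // threadIdx.x
        unsigned i   = block * kBlockDim + tid;  // global index, as on line 02
        if (pred(tid, i)) ++count;
    }
    return count;
}

// Tallies active and divergent warps over the whole grid for one guarded statement.
template <typename Pred>
WarpStats gridStats(Pred pred) {
    WarpStats s;
    for (unsigned block = 0; block < kGridDim; ++block) {
        for (unsigned warp = 0; warp < kWarpsPerBlock; ++warp) {
            unsigned n = lanesTaken(block, warp, pred);
            if (n > 0) ++s.active;
            if (n > 0 && n < kWarpSize) ++s.divergent;
        }
    }
    return s;
}

// SIMD efficiency (in %) of one warp: executing threads / warp size.
template <typename Pred>
double simdEfficiency(unsigned block, unsigned warp, Pred pred) {
    return 100.0 * lanesTaken(block, warp, pred) / kWarpSize;
}

// ASSUMED guard for line 04 (operators are missing in the question).
inline bool line03(unsigned tid, unsigned /*i*/) { return tid < 40 || tid >= 104; }
// ASSUMED guard for line 07 (condition is missing in the question).
inline bool line06(unsigned /*tid*/, unsigned i) { return i % 2 == 0; }
// ASSUMED trip count of the loop on line 09 (bound is missing in the question).
inline unsigned tripCount(unsigned i) { return 5 - (i % 3); }
constexpr unsigned kMaxTrips = 5;  // largest value tripCount can return

struct LoopStats {
    unsigned uniform = 0;    // iterations executed by every thread of the warp
    unsigned divergent = 0;  // iterations executed by only part of the warp
};

// Classifies each loop iteration of one warp as uniform or divergent.
inline LoopStats loopStats(unsigned block, unsigned warp) {
    LoopStats s;
    for (unsigned j = 0; j < kMaxTrips; ++j) {
        unsigned n = lanesTaken(block, warp,
            [j](unsigned, unsigned i) { return j < tripCount(i); });
        if (n == kWarpSize) ++s.uniform;
        else if (n > 0)     ++s.divergent;
    }
    return s;
}
```

Under these assumptions only, `kWarpsPerBlock` is 4 and the grid holds 32 warps. `gridStats(line03)` reports 24 active and 16 divergent warps, with SIMD efficiencies of 100%, 25%, and 75% for warps 0, 1, and 3 of block 0. `gridStats(line06)` reports 32 active and 32 divergent warps at 50% efficiency. `loopStats(0, 0)` reports 3 iterations without divergence and 2 with divergence. If the real kernel uses different operators or a different block size, change the ASSUMED lines and re-run.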

