Question: Add an algorithm to this kernel code that finds the maximum value in the vector it will reduce.

__global__ void reduction(float *out, float *in, unsigned size) {

    // INSERT KERNEL CODE HERE

#ifdef SIMPLE
    __shared__ float in_s[2 * BLOCK_SIZE];
    int idx = 2 * blockIdx.x * blockDim.x + threadIdx.x;

    in_s[threadIdx.x] = ((idx < size) ? in[idx] : 0.0f);
    in_s[threadIdx.x + BLOCK_SIZE] = ((idx + BLOCK_SIZE < size) ? in[idx + BLOCK_SIZE] : 0.0f);

    for (int stride = 1; stride < BLOCK_SIZE << 1; stride <<= 1) {
        __syncthreads();
        if (threadIdx.x % stride == 0)
            in_s[2 * threadIdx.x] += in_s[2 * threadIdx.x + stride];
    }

#else
    __shared__ float in_s[BLOCK_SIZE];
    int idx = 2 * blockIdx.x * blockDim.x + threadIdx.x;

    in_s[threadIdx.x] = ((idx < size) ? in[idx] : 0.0f)
                      + ((idx + BLOCK_SIZE < size) ? in[idx + BLOCK_SIZE] : 0.0f);

    for (int stride = BLOCK_SIZE >> 1; stride > 0; stride >>= 1) {
        __syncthreads();
        if (threadIdx.x < stride)
            in_s[threadIdx.x] += in_s[threadIdx.x + stride];
    }
#endif

    if (threadIdx.x == 0)
        out[blockIdx.x] = in_s[0];
}
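One possible way to turn the sum reduction above into a maximum reduction is sketched here. It is not a verified answer to the assignment. It follows the non-SIMPLE branch and changes two things: every `+` / `+=` becomes `fmaxf`, and out-of-range elements are padded with `-FLT_MAX` instead of `0.0f`, since zero would win incorrectly when all inputs are negative. The kernel name `reduction_max`, the host helper `max_on_gpu`, the value `BLOCK_SIZE 512`, and the choice to combine the per-block results on the CPU are all assumptions made for illustration. The sketch also assumes the kernel is launched with `blockDim.x == BLOCK_SIZE`, that `BLOCK_SIZE` is a power of two, and that `size > 0`. Error checking on the CUDA calls is omitted for brevity.

```cuda
#include <float.h>   /* FLT_MAX */
#include <math.h>    /* fmaxf */
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define BLOCK_SIZE 512  /* assumed value; must be a power of two and equal blockDim.x */

/* Each block writes the maximum of its 2 * BLOCK_SIZE input elements
   to out[blockIdx.x]. The name reduction_max is hypothetical. */
__global__ void reduction_max(float *out, const float *in, unsigned size) {
    __shared__ float in_s[BLOCK_SIZE];
    unsigned idx = 2 * blockIdx.x * blockDim.x + threadIdx.x;

    /* Pad out-of-range slots with -FLT_MAX, the identity element for max. */
    float a = (idx < size) ? in[idx] : -FLT_MAX;
    float b = (idx + BLOCK_SIZE < size) ? in[idx + BLOCK_SIZE] : -FLT_MAX;
    in_s[threadIdx.x] = fmaxf(a, b);

    /* Tree reduction in shared memory: halve the active range each step. */
    for (unsigned stride = BLOCK_SIZE >> 1; stride > 0; stride >>= 1) {
        __syncthreads();
        if (threadIdx.x < stride)
            in_s[threadIdx.x] = fmaxf(in_s[threadIdx.x],
                                      in_s[threadIdx.x + stride]);
    }

    if (threadIdx.x == 0)
        out[blockIdx.x] = in_s[0];
}

/* Hypothetical host helper: one kernel launch, then the per-block maxima
   are combined on the CPU. No error checking, for brevity. */
float max_on_gpu(const float *h_in, unsigned size) {
    unsigned blocks = (size + 2 * BLOCK_SIZE - 1) / (2 * BLOCK_SIZE);
    float *d_in = NULL, *d_out = NULL;
    float *h_out = (float *)malloc(blocks * sizeof(float));

    cudaMalloc((void **)&d_in, size * sizeof(float));
    cudaMalloc((void **)&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in, size * sizeof(float), cudaMemcpyHostToDevice);

    reduction_max<<<blocks, BLOCK_SIZE>>>(d_out, d_in, size);

    /* This blocking copy also waits for the kernel to finish. */
    cudaMemcpy(h_out, d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    float result = -FLT_MAX;
    for (unsigned i = 0; i < blocks; ++i)
        result = fmaxf(result, h_out[i]);

    cudaFree(d_in);
    cudaFree(d_out);
    free(h_out);
    return result;
}

int main(void) {
    /* All-negative input: padding with 0.0f would give the wrong answer here. */
    float data[] = {-3.0f, -7.5f, -1.25f, -9.0f, -2.0f};
    printf("max = %f\n", max_on_gpu(data, 5));
    return 0;
}
```

The SIMPLE branch could be adapted the same way, replacing `+=` with an `fmaxf` assignment and the `0.0f` padding with `-FLT_MAX`. For very large inputs, the partial maxima in `out` could instead be reduced by launching the kernel again on `out` rather than looping on the host.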
