Question:

I am requesting the following code in CUDA C++. The instructions come first, followed by the provided skeleton code.

Your program must be a CUDA program. When executed, it should launch a grid of 256 blocks, with 256 threads in each block. However, you cannot assume this knowledge in your kernel; that is, you must calculate the total number of threads by using gridDim and blockDim.

You MUST use constant memory to store the convolution mask. The declaration and allocation of the constant memory have been provided (and you are not allowed to change them). Use of shared memory to cut down global memory bandwidth consumption is encouraged but not required.

Both the input and output files have the same format. The first line contains a single integer n indicating how many data values are in the file. This is followed by n lines, each containing a single floating-point value (in %.3f format). You do not have to worry about file handling, as the main function handles all of these tasks. You can further assume that the input file format is correct, so there is no need to perform any error checking.

To give you an idea of how the program should work, a sequential version of the program (sAverage.cu) is provided. You can compile this program using the following command:

    nvcc -o sAverage sAverage.cu

You should name your program pAverage.cu, and you should be able to compile it using the following command:

    nvcc -o pAverage pAverage.cu

For your convenience, a skeleton file has been provided with all supporting code in place. You can run the sequential and parallel programs as follows:

    ./sAverage input.txt sOutput.txt
    ./pAverage input.txt pOutput.txt

Note that, due to a slight difference in the way the CPU and the GPU handle floating-point values, the output of the GPU program will likely not be identical to the CPU output. However, discrepancies should be only occasional, with a difference of no more than 0.001.

Submission Requirement:

#include <stdio.h>
#include <stdlib.h>

#define MAX_MASK_SIZE 10

__constant__ float MASK [MAX_MASK_SIZE];

__global__ void average_kernel (float *output, float *input, int input_size, int mask_size) {
    /******************/
    /* Your code here */
    /******************/

    /* 1. calculate thread id, and use it as index to output */
    /* 2. calculate number of threads */
    /* 3. while index < input_size */
    /* 4. initialize a running total */
    /* 5. calculate start index */
    /* 6. perform convolution with a for loop */
    /* 7. write running total to output */
    /* 8. increment index appropriately */
}

void process_data (float *output, float *input, float *mask, int input_size, int mask_size) {
    /******************/
    /* Your code here */
    /******************/

    /* 1. declare device memory */
    /* 2. allocate device memory */
    /* 3. copy input data into device memory */
    /* 4. copy mask into constant memory */
    /* 5. invoke kernel */
    /* 6. copy output from device memory */
    /* 7. deallocate device memory */
}

int main (int argc, char **argv) {
    FILE *infile;
    FILE *outfile;
    float *input;
    float *output;
    float mask [] = {0.1, 0.2, 0.3, 0.4};
    int i;
    int n;

    if (argc < 3) {
        fprintf (stderr, "Usage: %s \n", argv [0]);
        exit (1);
    }

    /* read n, then n floating-point values, one per line */
    infile = fopen (argv [1], "r");
    if (infile == NULL) {
        fprintf (stderr, "Error: cannot open input file [%s].\n", argv [1]);
        exit (1);
    }
    fscanf (infile, "%d", &n);
    input = (float *) malloc (n * sizeof (float));
    for (i = 0; i < n; i++) {
        fscanf (infile, "%f", &(input [i]));
    }
    fclose (infile);

    output = (float *) malloc (n * sizeof (float));
    process_data (output, input, mask, n, 4);

    /* write the result in the same format as the input file */
    outfile = fopen (argv [2], "w");
    fprintf (outfile, "%d\n", n);
    for (i = 0; i < n; i++) {
        fprintf (outfile, "%.3f\n", output [i]);
    }
    fclose (outfile);

    free (input);
    free (output);
    return 0;
}
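The numbered comments in average_kernel and process_data can be filled in roughly as follows. This is only a sketch under stated assumptions, not a verified solution. The sequential reference (sAverage.cu) is not shown in the question, so the window placement (start index = index - mask_size / 2) and the treatment of out-of-range elements as contributing zero are guesses that should be checked against it. Return codes of the CUDA calls are not checked, the optional shared-memory optimization is not attempted, and the assert on mask_size is an added guard that is not part of the skeleton. The main function from the skeleton is unchanged and is not repeated here.

```cuda
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define MAX_MASK_SIZE 10

/* Constant-memory mask, declared exactly as in the provided skeleton. */
__constant__ float MASK [MAX_MASK_SIZE];

__global__ void average_kernel (float *output, float *input, int input_size, int mask_size) {
    /* 1. global thread id, used as the first output index */
    int index = blockIdx.x * blockDim.x + threadIdx.x;
    /* 2. total number of threads, derived from gridDim and blockDim */
    int num_threads = gridDim.x * blockDim.x;

    /* 3. grid-stride loop: each thread handles index, index + num_threads, ... */
    while (index < input_size) {
        /* 4. running total for this output element */
        float total = 0.0f;
        /* 5. ASSUMPTION: the window is centered, as in common 1D convolution examples */
        int start = index - mask_size / 2;
        /* 6. ASSUMPTION: elements outside the input are skipped (treated as 0) */
        for (int j = 0; j < mask_size; j++) {
            int k = start + j;
            if (k >= 0 && k < input_size) {
                total += input [k] * MASK [j];
            }
        }
        /* 7. write the result */
        output [index] = total;
        /* 8. advance by the total thread count */
        index += num_threads;
    }
}

void process_data (float *output, float *input, float *mask, int input_size, int mask_size) {
    /* Added guard (not in the skeleton): the mask must fit in constant memory. */
    assert (mask_size <= MAX_MASK_SIZE);

    /* 1. declare device memory */
    float *d_input = NULL;
    float *d_output = NULL;
    size_t bytes = (size_t) input_size * sizeof (float);

    /* 2. allocate device memory */
    cudaMalloc ((void **) &d_input, bytes);
    cudaMalloc ((void **) &d_output, bytes);

    /* 3. copy input data into device memory */
    cudaMemcpy (d_input, input, bytes, cudaMemcpyHostToDevice);

    /* 4. copy mask into constant memory */
    cudaMemcpyToSymbol (MASK, mask, mask_size * sizeof (float));

    /* 5. invoke kernel: 256 blocks of 256 threads, as the instructions require */
    average_kernel<<<256, 256>>> (d_output, d_input, input_size, mask_size);

    /* 6. copy output from device memory (this call waits for the kernel to finish) */
    cudaMemcpy (output, d_output, bytes, cudaMemcpyDeviceToHost);

    /* 7. deallocate device memory */
    cudaFree (d_input);
    cudaFree (d_output);
}
```

Because the kernel steps by gridDim.x * blockDim.x rather than a hard-coded 65536, the same code should still cover every element if the launch configuration is changed or if n exceeds the number of threads.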

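To check a parallel result against the 0.001 tolerance mentioned in the instructions, a small host-only helper along these lines may be useful. It is a sketch, not the provided sAverage.cu: average_reference and count_mismatches are hypothetical names, and the reference assumes a centered window (start index = i - mask size / 2) with out-of-range elements contributing zero, which may not match what the actual sequential program does.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the weighted average.
// ASSUMPTION: centered window, out-of-range inputs contribute 0.
std::vector<float> average_reference(const std::vector<float> &input,
                                     const std::vector<float> &mask) {
    assert(!mask.empty());
    const int n = static_cast<int>(input.size());
    const int m = static_cast<int>(mask.size());
    std::vector<float> output(input.size(), 0.0f);
    for (int i = 0; i < n; i++) {
        float total = 0.0f;
        const int start = i - m / 2;
        for (int j = 0; j < m; j++) {
            const int k = start + j;
            if (k >= 0 && k < n) {
                total += input[k] * mask[j];
            }
        }
        output[i] = total;
    }
    return output;
}

// Counts positions where the two outputs differ by more than `tolerance`.
// Returns -1 if the two vectors have different lengths.
int count_mismatches(const std::vector<float> &a, const std::vector<float> &b,
                     float tolerance) {
    if (a.size() != b.size()) {
        return -1;
    }
    int mismatches = 0;
    for (std::size_t i = 0; i < a.size(); i++) {
        if (std::fabs(a[i] - b[i]) > tolerance) {
            mismatches++;
        }
    }
    return mismatches;
}
```

When the compared values are read back from files written with %.3f, two numbers that differ by exactly one unit in the last printed digit may compare as slightly more than 0.001 in single precision, so a marginally larger tolerance (for example 0.0011) may be more practical.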