Question:

Task Description:
In this assignment, you are tasked with developing a complete CUDA C/C++ program for an image
blur application, also known as image smoothing, which we covered in Module 3 "Multidimensional
Grids and Data".
Convolution serves as the fundamental operation for implementing the image blur process.
Specifically, one of the convolution kernels requested in this assignment should be optimized using
the tiled convolution technique that we learned in Module 7 "Convolution". This optimization
involves leveraging GPU shared memory and constant memory to enhance performance.
Below are the specific requirements:
Develop a single CUDA program file named "convolution.cu" containing all the
necessary code to blur an input image and generate its blurred image. For simplicity, we
will use the average filter of size 5×5 in this assignment, where each filter element holds
the floating-point value 1/25.
It is highly recommended to utilize the NCSA Delta GPUs for this assignment. However,
if you're experienced in successfully building and installing OpenCV for C++ from sources,
you may proceed with your own computers.
Below are the command-lines for compiling and executing your program using NCSA
Delta GPUs:
Log in to the NCSA supercomputer using your own NCSA account.
Enter an interactive session with GPUs, for example:
srun --account=bchn-delta-gpu --partition=gpuA40x4-interactive --nodes=1 --gpus-per-node=1 --tasks=1 --tasks-per-node=16 --cpus-per-task=1 --mem=20g --pty bash
Load the OpenCV module:
module load opencv/4.9.0.x86_64
To compile:
nvcc -o convolution convolution.cu -I $OPENCV_HOME/include/opencv4/ -L $OPENCV_HOME/lib64 -lopencv_core -lopencv_imgcodecs -lopencv_imgproc
To execute:
./convolution inputImg.jpg
When running your program:
Make sure to replace "inputImg.jpg" with the name of your input image file,
which should be located in the same directory as your program. For example, if
your input image is named "santa-grayscale.jpg", the execution command would be
"./convolution santa-grayscale.jpg".
No command-line argument for the filter radius is necessary in this assignment, as we use
the constant average filter of size 5×5, where the filter radius is 2.
Within "convolution.cu", implement one host function and two CUDA kernels that each
perform the convolution. Specifically,
a. A host function that performs the convolution operation on the CPU only.
b. A CUDA kernel that performs the convolution operation on the GPU but without
tiling. You may refer to the example provided on Slide 40 in Module 3, but please
modify it to incorporate the average filter matrix. It is acceptable to load the
average filter from either GPU global memory or constant memory.
c. An optimized CUDA kernel that presents a "tiled" version of the convolution
operation using GPU shared memory. Specifically, in this optimized kernel,
please load the average filter from GPU constant memory.
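Requirement (c) could be approached along the following lines. This is only a sketch under stated assumptions, not the course's reference solution: the names `F_c`, `setAverageFilter`, `blurImage_tiled_Kernel`, `IN_TILE_DIM`, and `OUT_TILE_DIM` are hypothetical, the 32×32 block size is an arbitrary choice, and pixels outside the image are assumed to count as zero. Each block loads an input tile (output tile plus a halo of 2 pixels per side) into shared memory, and only the inner threads compute an output pixel.

```cuda
#include <cuda_runtime.h>

#define FILTER_RADIUS 2
#define FILTER_SIZE (2 * FILTER_RADIUS + 1)             // 5
#define IN_TILE_DIM 32                                  // block side length (assumed)
#define OUT_TILE_DIM (IN_TILE_DIM - 2 * FILTER_RADIUS)  // 28 output pixels per side

// 5x5 average filter held in constant memory (hypothetical name).
__constant__ float F_c[FILTER_SIZE][FILTER_SIZE];

// Host helper (hypothetical): fill the filter with 1/25 and copy it to constant memory.
void setAverageFilter() {
    float F_h[FILTER_SIZE][FILTER_SIZE];
    for (int r = 0; r < FILTER_SIZE; ++r)
        for (int c = 0; c < FILTER_SIZE; ++c)
            F_h[r][c] = 1.0f / (FILTER_SIZE * FILTER_SIZE);
    cudaMemcpyToSymbol(F_c, F_h, sizeof(F_h));
}

__global__ void blurImage_tiled_Kernel(unsigned char* Pout, const unsigned char* Pin,
                                       unsigned int width, unsigned int height) {
    // Pixel this thread loads; the first and last FILTER_RADIUS threads load halo pixels.
    int col = blockIdx.x * OUT_TILE_DIM + threadIdx.x - FILTER_RADIUS;
    int row = blockIdx.y * OUT_TILE_DIM + threadIdx.y - FILTER_RADIUS;
    bool inImage = row >= 0 && row < (int)height && col >= 0 && col < (int)width;

    __shared__ float tile[IN_TILE_DIM][IN_TILE_DIM];
    // Assumption: out-of-image (ghost) pixels contribute zero.
    tile[threadIdx.y][threadIdx.x] = inImage ? (float)Pin[row * width + col] : 0.0f;
    __syncthreads();

    // Only threads mapped to the inner OUT_TILE_DIM x OUT_TILE_DIM region write output.
    int tileCol = threadIdx.x - FILTER_RADIUS;
    int tileRow = threadIdx.y - FILTER_RADIUS;
    if (inImage && tileCol >= 0 && tileCol < OUT_TILE_DIM &&
        tileRow >= 0 && tileRow < OUT_TILE_DIM) {
        float sum = 0.0f;
        for (int fr = 0; fr < FILTER_SIZE; ++fr)
            for (int fc = 0; fc < FILTER_SIZE; ++fc)
                sum += F_c[fr][fc] * tile[tileRow + fr][tileCol + fc];
        Pout[row * width + col] = (unsigned char)(sum + 0.5f);  // round to nearest
    }
}
```

Under these assumptions the kernel would be launched with `dim3 block(IN_TILE_DIM, IN_TILE_DIM)` and a grid of `ceil(width / OUT_TILE_DIM)` by `ceil(height / OUT_TILE_DIM)` blocks, after calling `setAverageFilter()` once.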
Ensure that your code can handle images with varying image sizes. Please also consider
boundary conditions to ensure proper handling in such cases.
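One way to handle the boundary conditions is sketched below for the CPU-only case. It is host-only C++ that should also compile unchanged under nvcc. The helper name `blurBuffer_h` is hypothetical and it works on a raw row-major buffer rather than on `cv::Mat`; a continuous single-channel 8-bit `cv::Mat` exposes such a buffer through its `data` member. The sketch assumes that neighbours falling outside the image contribute zero, which darkens the border slightly; averaging only the valid neighbours is another possible policy.

```cpp
#include <vector>

// Hypothetical helper: 5x5 average blur of an 8-bit grayscale image.
// Assumption: pixels outside the image are treated as 0 (zero padding).
void blurBuffer_h(unsigned char* Pout, const unsigned char* Pin,
                  unsigned int nRows, unsigned int nCols) {
    const int R = 2;               // filter radius
    const float w = 1.0f / 25.0f;  // every filter element holds 1/25
    for (int row = 0; row < (int)nRows; ++row) {
        for (int col = 0; col < (int)nCols; ++col) {
            float sum = 0.0f;
            for (int dr = -R; dr <= R; ++dr) {
                for (int dc = -R; dc <= R; ++dc) {
                    int r = row + dr, c = col + dc;
                    // Skip neighbours that fall outside the image.
                    if (r >= 0 && r < (int)nRows && c >= 0 && c < (int)nCols)
                        sum += w * Pin[r * nCols + c];
                }
            }
            Pout[row * nCols + col] = (unsigned char)(sum + 0.5f);  // round to nearest
        }
    }
}
```

Because the loops take `nRows` and `nCols` as parameters, the same function covers both the 1,000×1,000 and the 345×346 test images, and its output can serve as a reference when checking the GPU kernels.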
For testing purposes, two input images of different sizes are provided in the zipped folder
accompanying this assignment.
"santa-grayscale.jpg": a grayscale image of size 1,000×1,000
"tree-grayscale.jpg": a grayscale image of size 345×346
You may also choose to test your program with additional grayscale images if desired.
Use timing techniques, such as CPU timers or CUDA events, to measure the
performance of your implementation of the host function and the two CUDA kernels
specified above.
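A CPU timer for these measurements might look like the sketch below. It uses `std::chrono::steady_clock`, which is an assumption; the assignment does not say which clock to use. The code is host-only C++ and should also compile under nvcc.

```cpp
#include <chrono>

// Returns the current time in seconds from a monotonic clock.
// Only differences between two calls are meaningful.
double myCPUTimer() {
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
}
```

Typical use is `double t0 = myCPUTimer(); /* work */ double t1 = myCPUTimer();` with `t1 - t0` as the elapsed seconds. Kernel launches are asynchronous, so when timing a kernel this way, call `cudaDeviceSynchronize()` before reading the second timestamp. With CUDA events, the corresponding calls are `cudaEventCreate`, `cudaEventRecord`, `cudaEventSynchronize`, and `cudaEventElapsedTime`, which reports milliseconds.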
We also suggest structuring "convolution.cu" by implementing the following macros, host
functions, and CUDA kernels. At the end of this assignment, we will provide screenshots of an
example of the program's structure.
#define CHECK(call)
A macro for error checking.
double myCPUTimer()
A timer for measuring execution time.
void blurImage_h(cv::Mat Pout_Mat_h, cv::Mat Pin_Mat_h, unsigned int nRows,
unsigned int nCols)
A host function for CPU-only convolution.
__global__ void blurImage_Kernel(unsigned char * Pout, unsigned char * Pin,
unsigned int width, unsigned int height)
A CUDA kernel that performs a simple convolution without using tiling.
void blurImage_d(cv::Mat Pout_Mat_h, cv::Mat Pin_Mat_h, unsigned int
nRows, unsigned int nCols)
A host function that handles device memory allocation and deallocation, data copies, and
the launch of the CUDA kernel blurImage_Kernel().
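The suggested pieces could fit together roughly as follows. This is a sketch under assumptions rather than the expected solution: the body of `CHECK`, the 32×32 block size, and the zero-padding boundary policy are choices made here, and both `cv::Mat` arguments are assumed to be continuous, single-channel 8-bit images of size `nRows` × `nCols`. For brevity the weight 1/25 is hard-coded; a filter array in global or constant memory would take its place in a full solution.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <opencv2/opencv.hpp>

// Abort with file and line information if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        const cudaError_t err = (call);                               \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err), __FILE__, __LINE__);     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// One thread per output pixel; every neighbour is read from global memory.
__global__ void blurImage_Kernel(unsigned char* Pout, unsigned char* Pin,
                                 unsigned int width, unsigned int height) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (col >= (int)width || row >= (int)height) return;  // thread outside the image

    float sum = 0.0f;
    for (int dr = -2; dr <= 2; ++dr)
        for (int dc = -2; dc <= 2; ++dc) {
            int r = row + dr, c = col + dc;
            // Assumption: out-of-image neighbours contribute zero.
            if (r >= 0 && r < (int)height && c >= 0 && c < (int)width)
                sum += Pin[r * width + c] * (1.0f / 25.0f);
        }
    Pout[row * width + col] = (unsigned char)(sum + 0.5f);  // round to nearest
}

void blurImage_d(cv::Mat Pout_Mat_h, cv::Mat Pin_Mat_h,
                 unsigned int nRows, unsigned int nCols) {
    size_t bytes = (size_t)nRows * nCols * sizeof(unsigned char);
    unsigned char *Pin_d = nullptr, *Pout_d = nullptr;
    CHECK(cudaMalloc(&Pin_d, bytes));
    CHECK(cudaMalloc(&Pout_d, bytes));
    CHECK(cudaMemcpy(Pin_d, Pin_Mat_h.data, bytes, cudaMemcpyHostToDevice));

    // Round the grid up so that partial blocks cover the image edges.
    dim3 block(32, 32);
    dim3 grid((nCols + block.x - 1) / block.x, (nRows + block.y - 1) / block.y);
    blurImage_Kernel<<<grid, block>>>(Pout_d, Pin_d, nCols, nRows);
    CHECK(cudaGetLastError());       // launch errors
    CHECK(cudaDeviceSynchronize());  // execution errors

    CHECK(cudaMemcpy(Pout_Mat_h.data, Pout_d, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaFree(Pin_d));
    CHECK(cudaFree(Pout_d));
}
```

Passing `cv::Mat` by value copies only the header, so the result written through `Pout_Mat_h.data` is visible to the caller as long as the caller allocated the output image beforehand.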