Question: computer science question
Task Description:

In this assignment, you are tasked with developing a complete CUDA C/C++ program for an image blur application, also known as image smoothing, that we learned in Module "Multidimensional and …". Convolution serves as the fundamental operation for implementing the image blur process. Specifically, one of the convolution kernels requested in this assignment should be optimized using the tiled convolution technique that we learned in Module "Convolution". This optimization involves leveraging GPU shared memory and constant memory to enhance performance.

Below are the specific requirements:

- Develop a single CUDA program file named "convolution.cu" containing all the necessary code to blur an input image and generate its blurred image. For simplicity, we will use the average filter of size … x … in this assignment, where each filter element holds the floating-point value ….

- It is highly recommended to utilize the NCSA Delta GPUs for this assignment. However, if you're experienced in successfully building and installing OpenCV for C++ from sources, you may proceed with your own computers. Below are the command lines for compiling and executing your program using NCSA Delta GPUs:

  - Log in to the NCSA supercomputer using your own NCSA account.
  - Enter an interactive session with GPUs, for example:
      srun --account=bchn-delta-gpu --partition=gpuA…x…-interactive --nodes=… --…node=… --tasks=… --tasks-per-node=… --cpus-per-task=… --mem=…g
  - Load the OpenCV module:
      module load opencv…
  - To compile:
      nvcc -o convolution convolution.cu -I$OPENCV_HOME/include/opencv… -L$OPENCV_HOME/lib… -lopencv_imgcodecs -lopencv_imgproc -lopencv_core
  - To run:
      ./convolution inputImg.jpg

  When running your program, please make sure to replace "inputImg.jpg" with the name of your input image file, which should be located in the same directory as your program. For example, if your input image is named santagrayscale.jpg, the execution command would be:
      ./convolution santagrayscale.jpg

- No command argument for filter radius is necessary in this assignment, as we utilize the constant average filter of size … x …, where the filter radius is ….

- Within "convolution.cu", implement one host function and two CUDA kernels to perform convolution, respectively. Specifically:
  a. A host function that performs the convolution operation using CPU only.
  b. A CUDA kernel that performs the convolution operation using GPU but without tiling. You may refer to the example provided on Slide … in Module …, but please modify it to use the average filter matrix. It is acceptable to load the …
  c. An optimized CUDA kernel that presents a "tiled" version of the convolution operation using GPU shared memory. Specifically, in this optimized kernel, please load the average filter from GPU constant memory.

- Ensure that your code can handle images with varying image sizes. Please also consider boundary conditions to ensure proper handling in such cases.

- For testing purposes, two input images of different sizes are provided in the zipped folder accompanying this assignment:
  - "santagrayscale.jpg": a grayscale image of size … x …
  - "treegrayscale.jpg": a grayscale image of size … x …
  You may also choose to test your program with additional grayscale images if desired.

- Utilize CUDA events or timing techniques such as CPU timers to measure the execution time of your implementation of the host function and two CUDA kernels specified above.

We also suggest structuring "convolution.cu" by implementing the following macros, host functions, and CUDA kernels. At the end of this assignment, we will provide screenshots of an example of the program's structure.

- #define CHECK(call)
  A macro for error checking.
- double myCPUTimer()
  A timer for measuring execution time.
- void blurImage_h(cv::Mat Pout_Mat_h, cv::Mat Pin_Mat_h, unsigned int nRows, unsigned int nCols)
  A host function for CPU-only convolution.
- __global__ void blurImage_Kernel(unsigned char *Pout, unsigned char *Pin, unsigned int width, unsigned int height)
  A CUDA kernel that performs a simple convolution without using tiling.
- void blurImage_d(cv::Mat Pout_Mat_h, cv::Mat Pin_Mat_h, unsigned int nRows, unsigned int nCols)
  A host function handling device memory allocation and free, data copy, and calling the specific CUDA kernel, blurImage_Kernel().

Given a string s, find the length of the longest substring without repeating characters.
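For the CPU-only host function in the image blur assignment, a minimal sketch might look like the following. It works on a raw row-major grayscale buffer instead of cv::Mat so that it compiles without OpenCV. The function name blurImageCPU, the 5 x 5 filter size (radius 2), and the choice to treat out-of-image pixels as zero are all assumptions, since the filter size is not legible in the assignment text and the boundary policy is not specified.

```cpp
#include <vector>

// Assumption: 5 x 5 average filter (radius 2); the real size is not legible in the source.
constexpr int FILTER_RADIUS = 2;
constexpr int FILTER_SIZE = 2 * FILTER_RADIUS + 1;

// Hypothetical CPU reference: convolves a row-major 8-bit grayscale image
// with a constant average filter. Pixels outside the image ("ghost cells")
// contribute 0, which is one possible boundary policy.
void blurImageCPU(std::vector<unsigned char>& out,
                  const std::vector<unsigned char>& in,
                  int nRows, int nCols) {
    const float weight = 1.0f / (FILTER_SIZE * FILTER_SIZE);
    for (int r = 0; r < nRows; ++r) {
        for (int c = 0; c < nCols; ++c) {
            float sum = 0.0f;
            for (int fr = -FILTER_RADIUS; fr <= FILTER_RADIUS; ++fr) {
                for (int fc = -FILTER_RADIUS; fc <= FILTER_RADIUS; ++fc) {
                    const int ir = r + fr;
                    const int ic = c + fc;
                    // Skip neighbours that fall outside the image.
                    if (ir >= 0 && ir < nRows && ic >= 0 && ic < nCols) {
                        sum += weight * in[ir * nCols + ic];
                    }
                }
            }
            // Round to nearest so float error does not drop a gray level.
            out[r * nCols + c] = static_cast<unsigned char>(sum + 0.5f);
        }
    }
}
```

With OpenCV, the same loop could be fed from a cv::Mat loaded as grayscale by passing its pixel data and its rows and cols. The output of this function is also a convenient reference for checking the two GPU kernels.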
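The tiled kernel that the image blur assignment asks for (shared-memory tile, filter in constant memory) could be sketched as follows. This is one common formulation, in which each block loads an input tile including its halo and only the inner threads compute outputs. It is not necessarily the version from the course slides. The names blurImage_tiled_Kernel and F_c, the tile size of 32, the 5 x 5 filter, and the zero-valued ghost cells are assumptions for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Assumption: 5 x 5 average filter (radius 2); the real size is not legible in the source.
#define FILTER_RADIUS 2
#define FILTER_SIZE (2 * FILTER_RADIUS + 1)
#define IN_TILE_DIM 32                                  // block side = input tile side (with halo)
#define OUT_TILE_DIM (IN_TILE_DIM - 2 * FILTER_RADIUS)  // output pixels per block side

#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err), __FILE__, __LINE__);     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Average filter kept in constant memory.
__constant__ float F_c[FILTER_SIZE][FILTER_SIZE];

// Hypothetical tiled kernel: every thread loads one input pixel into shared
// memory; only threads inside the output tile compute a result.
__global__ void blurImage_tiled_Kernel(unsigned char *Pout, const unsigned char *Pin,
                                       unsigned int width, unsigned int height) {
    __shared__ float tile[IN_TILE_DIM][IN_TILE_DIM];
    int col = (int)(blockIdx.x * OUT_TILE_DIM + threadIdx.x) - FILTER_RADIUS;
    int row = (int)(blockIdx.y * OUT_TILE_DIM + threadIdx.y) - FILTER_RADIUS;
    bool inImage = row >= 0 && row < (int)height && col >= 0 && col < (int)width;

    // Ghost cells outside the image are loaded as 0.
    tile[threadIdx.y][threadIdx.x] = inImage ? (float)Pin[row * (int)width + col] : 0.0f;
    __syncthreads();

    int tileCol = (int)threadIdx.x - FILTER_RADIUS;
    int tileRow = (int)threadIdx.y - FILTER_RADIUS;
    if (inImage && tileCol >= 0 && tileCol < OUT_TILE_DIM &&
        tileRow >= 0 && tileRow < OUT_TILE_DIM) {
        float sum = 0.0f;
        for (int fRow = 0; fRow < FILTER_SIZE; ++fRow)
            for (int fCol = 0; fCol < FILTER_SIZE; ++fCol)
                sum += F_c[fRow][fCol] * tile[tileRow + fRow][tileCol + fCol];
        Pout[row * (int)width + col] = (unsigned char)(sum + 0.5f);
    }
}

int main() {
    const unsigned int width = 100, height = 70;  // not multiples of the tile size on purpose
    const size_t nBytes = (size_t)width * height;

    float h_F[FILTER_SIZE][FILTER_SIZE];
    for (int i = 0; i < FILTER_SIZE; ++i)
        for (int j = 0; j < FILTER_SIZE; ++j)
            h_F[i][j] = 1.0f / (FILTER_SIZE * FILTER_SIZE);
    CHECK(cudaMemcpyToSymbol(F_c, h_F, sizeof(h_F)));

    std::vector<unsigned char> h_in(nBytes, 100), h_out(nBytes, 0);
    unsigned char *d_in = nullptr, *d_out = nullptr;
    CHECK(cudaMalloc((void **)&d_in, nBytes));
    CHECK(cudaMalloc((void **)&d_out, nBytes));
    CHECK(cudaMemcpy(d_in, h_in.data(), nBytes, cudaMemcpyHostToDevice));

    dim3 block(IN_TILE_DIM, IN_TILE_DIM);
    dim3 grid((width + OUT_TILE_DIM - 1) / OUT_TILE_DIM,
              (height + OUT_TILE_DIM - 1) / OUT_TILE_DIM);
    blurImage_tiled_Kernel<<<grid, block>>>(d_out, d_in, width, height);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(h_out.data(), d_out, nBytes, cudaMemcpyDeviceToHost));

    printf("corner=%d interior=%d\n", h_out[0], h_out[10 * width + 10]);
    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));
    return 0;
}
```

The grid is sized by OUT_TILE_DIM rather than by the block size, because each block of 32 x 32 threads produces only 28 x 28 output pixels. In the actual assignment the uniform test image would be replaced by pixel data read through OpenCV.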
