Question: This is a High Performance Computing question: 1. Puff is a student working on a GPU research project. In this project, he is implementing an

This is a High Performance Computing question:

1. Puff is a student working on a GPU research project. In this project, he is implementing an algorithm for Finite Element modeling using CUDA. His program is slow so he uses a profiler to analyze performance. Help Puff figure out how to improve his code by answering the questions below. Assume that the size of the matrices is in the millions of elements and is not multiples of powers of 2, and the device used has compute capability 3.7.

The first thing Puff has noticed is low occupancy. Without knowing details of his implementation, the kernel, indicate whether each of the following factors could affect this metric(answer yes or not) and briefly explain why:

a.Using as many registers per thread as possible

b. Explicitly making use of GPU L2 cache

c. Having a high rate of execution/memory dependencies

d. Trying to coalesce memory accesses

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!