Question: Compare the performance of a vector processor with a hybrid system that contains a scalar processor and a GPU-based coprocessor. In the hybrid system, the host processor has superior scalar performance to the GPU, so in this case all scalar code is executed on the host processor while all vector code is executed on the GPU. We will refer to the first system as the vector computer and the second system as the hybrid computer.

Assume that your target application contains a vector kernel with an arithmetic intensity of ___ FLOPS per DRAM byte accessed; however, the application also has a scalar component that must be performed before and after the kernel in order to prepare the input vectors and output vectors, respectively. For a sample dataset, the scalar portion of the code requires ___ ms of execution time on both the vector processor and the host processor in the hybrid system. The kernel reads input vectors consisting of ___ MB of data and has output data consisting of ___ MB of data. The vector processor has a peak memory bandwidth of ___ GB/sec and the GPU has a peak memory bandwidth of ___ GB/sec.

The hybrid system has an additional overhead that requires all input vectors to be transferred between the host memory and GPU local memory before and after the kernel is invoked. The hybrid system has a direct memory access (DMA) bandwidth of ___ GB/sec and an average latency of ___ ms. Assume that both the vector processor and the GPU are performance bound by memory bandwidth. Compute the execution time required by both computers for this application.
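The numeric values are missing from the question as reproduced here, so the following is only a parameterized sketch of how the two execution times could be set up, not the expert solution. It assumes that, because both machines are memory-bandwidth bound, kernel time is simply bytes moved divided by peak bandwidth (so the arithmetic intensity does not enter the result). It also assumes the hybrid system makes one DMA transfer for the inputs and one for the outputs, each paying the average latency once. The function names and the sample numbers in the usage lines are hypothetical and are not the values from the original problem.

```python
def move_ms(megabytes, gb_per_sec):
    """Time in ms to move `megabytes` at `gb_per_sec`.

    With 1 GB = 1000 MB, MB / (GB/sec) comes out directly in milliseconds.
    """
    return megabytes / gb_per_sec


def vector_time_ms(scalar_ms, input_mb, output_mb, vector_bw):
    """Vector computer: scalar code plus a bandwidth-bound kernel."""
    kernel_ms = move_ms(input_mb + output_mb, vector_bw)
    return scalar_ms + kernel_ms


def hybrid_time_ms(scalar_ms, input_mb, output_mb, gpu_bw,
                   dma_bw, dma_latency_ms, dma_transfers=2):
    """Hybrid computer: scalar code on the host, kernel on the GPU,
    plus DMA copies between host memory and GPU local memory.

    Assumption: inputs are copied in before the kernel and outputs are
    copied back after it, so `dma_transfers` defaults to 2.
    """
    kernel_ms = move_ms(input_mb + output_mb, gpu_bw)
    dma_ms = (move_ms(input_mb + output_mb, dma_bw)
              + dma_transfers * dma_latency_ms)
    return scalar_ms + kernel_ms + dma_ms


# Hypothetical sample values, chosen only to exercise the formulas.
t_vector = vector_time_ms(scalar_ms=100, input_mb=50, output_mb=50,
                          vector_bw=20)
t_hybrid = hybrid_time_ms(scalar_ms=100, input_mb=50, output_mb=50,
                          gpu_bw=100, dma_bw=5, dma_latency_ms=2)
print(f"vector: {t_vector} ms, hybrid: {t_hybrid} ms")
```

Under these assumptions the GPU's higher bandwidth shortens the kernel, but the DMA copies can cost more than that saving, which is the trade-off the question asks you to quantify. If the intended reading charges the DMA latency a different number of times, adjust `dma_transfers` accordingly.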
