Question: Problem 3 Setup Computational science is replete with algorithms that require the entries of an array to be filled in with values that depend on

Problem 3
Setup
Computational science is replete with algorithms that require the entries of an array to be filled in with values that
depend on the values of certain already computed neighboring entries, along with other information that does not
change over the course of the computation. The pattern of neighboring entries does not change during the computation
and is called a stencil. An example is the longest-common-subsequence algorithm from a previous module, where the
value in entry c[i,j] depends only on the values in c[i-1,j], c[i,j-1], and c[i-1,j-1], as well as the elements
$x_i$ and $y_j$ within the two sequences given as inputs. The input sequences are fixed, but the algorithm fills in the
two-dimensional array c so that it computes entry c[i,j] after computing all three entries c[i-1,j], c[i,j-1], and
c[i-1,j-1].
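To make the stencil concrete, here is a minimal Python sketch of the longest-common-subsequence table fill. The function name `lcs_length` and the 0-based string indexing are choices made for this illustration; the point is only that each entry reads its three already-computed neighbors plus the fixed inputs.

```python
def lcs_length(x: str, y: str) -> int:
    """Fill the LCS table c row by row and return the LCS length."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stand for empty prefixes and stay at 0.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                # Depends on the upper-left neighbor c[i-1][j-1].
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                # Depends on the neighbors above and to the left.
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]


print(lcs_length("ABCBDAB", "BDCABA"))  # prints 4
```

Row-major order is just one valid schedule: any order that computes the three neighbors before c[i,j] gives the same table.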
This problem examines how to use recursive spawning to parallelize a simple stencil calculation on an $n \times n$ array A
in which the value placed into entry A[i,j] depends only on values in A[i',j'], where $i' \le i$ and $j' \le j$ and, of
course, $i' \ne i$ or $j' \ne j$. In other words, the value in an entry depends only on values in entries that are above it
and/or to its left, along with static information outside of the array. Furthermore, we assume throughout this problem
that once the entries upon which A[i,j] depends have been filled in, the entry A[i,j] can be computed in $\Theta(1)$ time.
Partition the $n \times n$ array A into four $n/2 \times n/2$ subarrays as follows:
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$
We can immediately fill in subarray $A_{11}$ recursively, since it does not depend on the entries in the other three
subarrays. Once the computation of $A_{11}$ finishes, we can fill in $A_{12}$ and $A_{21}$ recursively in parallel, because although
they both depend on $A_{11}$, they do not depend on each other. Finally, once both $A_{12}$ and $A_{21}$ are complete, we can fill in $A_{22}$ recursively.
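The recursion order described above can be sketched in Python as follows. This is a serial emulation, not a parallel implementation: comments mark where a fork-join runtime could spawn and sync. The stencil in `base_case` (counting monotone lattice paths) is a hypothetical stand-in, the names `simple_stencil` and `fill` are chosen for illustration, and n is assumed to be a power of 2.

```python
def base_case(A, i, j):
    """Hypothetical stencil: count monotone lattice paths into (i, j)."""
    if i == 0 and j == 0:
        A[i][j] = 1
    else:
        up = A[i - 1][j] if i > 0 else 0
        left = A[i][j - 1] if j > 0 else 0
        A[i][j] = up + left


def simple_stencil(A, row, col, size):
    """Fill the size x size block whose top-left corner is (row, col)."""
    if size == 1:
        base_case(A, row, col)
        return
    h = size // 2
    simple_stencil(A, row, col, h)          # A11 first
    # A12 and A21 are independent; a fork-join runtime could spawn one here.
    simple_stencil(A, row, col + h, h)      # A12
    simple_stencil(A, row + h, col, h)      # A21
    # A sync would go here, before A22 starts.
    simple_stencil(A, row + h, col + h, h)  # A22 last


def fill(n):
    """Return an n x n array filled by the recursion (n a power of 2)."""
    A = [[None] * n for _ in range(n)]
    simple_stencil(A, 0, 0, n)
    return A
```

Entries start as `None`, so reading an entry before it has been computed would raise a `TypeError` rather than silently produce a wrong value, which gives a cheap check that the order respects the dependencies.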
Part A
Give parallel pseudocode that performs this simple stencil calculation using a divide-and-conquer algorithm
SIMPLE-STENCIL based on the setup and the decomposition of A. (Don't worry about the details of the base case,
which depends on the specific stencil.) Give and solve recurrences for the work and span of this algorithm in terms of
n. What is the parallelism?
Part B
Modify your solution to part A to divide an $n \times n$ array into nine $n/3 \times n/3$ subarrays, again recursing with as
much parallelism as possible. Analyze this algorithm. How much more or less parallelism does this algorithm have
compared with the algorithm from part A?
Part C
Generalize your solutions to parts A and B as follows. Choose an integer $b \ge 2$. Divide an $n \times n$ array into $b^2$
subarrays, each of size $n/b \times n/b$, recursing with as much parallelism as possible. In terms of n and b, what are the
work, span, and parallelism of this algorithm? Argue that, using this approach, the parallelism must be $o(n)$ for any
choice of $b \ge 2$.
TIP
For this argument, show that the exponent of n in the parallelism is strictly less than 1 for any choice of
$b \ge 2$.
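One possible way to set up the recurrences, offered as a sketch rather than the official solution: assume the $b^2$ subarrays are processed one anti-diagonal of blocks at a time, which takes $2b-1$ serial phases, and treat b as a constant so that the spawning overhead per level is $\Theta(1)$ in n.

```latex
\begin{aligned}
T_1(n) &= b^2\, T_1(n/b) + \Theta(1) = \Theta(n^2), \\
T_\infty(n) &= (2b-1)\, T_\infty(n/b) + \Theta(1)
             = \Theta\!\left(n^{\log_b (2b-1)}\right), \\
\frac{T_1(n)}{T_\infty(n)} &= \Theta\!\left(n^{\,2 - \log_b (2b-1)}\right).
\end{aligned}
```

Under these assumptions, b = 2 gives an exponent of $2 - \log_2 3 \approx 0.415$ and b = 3 gives $2 - \log_3 5 \approx 0.535$. Because $2b - 1 > b$ whenever $b \ge 2$, we have $\log_b(2b-1) > 1$, so the exponent stays strictly below 1.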
Part D
Give pseudocode for a parallel algorithm for this simple stencil calculation that achieves $\Theta(n/\log n)$ parallelism.
Argue using notions of work and span that the problem has $\Theta(n)$ inherent parallelism. Unfortunately, simple fork-join
parallelism does not let you achieve this maximal parallelism.
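One candidate approach, sketched here in Python as a serial emulation, is to sweep the array one anti-diagonal at a time. Entries on the same anti-diagonal do not depend on each other, so the inner loop is the one a parallel loop would replace. Assuming a parallel loop over up to n iterations has $\Theta(\log n)$ span (as when it is implemented by recursive spawning), the $2n-1$ diagonals give $\Theta(n \log n)$ span against $\Theta(n^2)$ work, that is, $\Theta(n/\log n)$ parallelism. The lattice-path stencil and the name `wavefront_fill` are hypothetical stand-ins for illustration.

```python
def wavefront_fill(n):
    """Fill an n x n array one anti-diagonal at a time (serial emulation)."""
    A = [[None] * n for _ in range(n)]
    for d in range(2 * n - 1):  # the diagonals themselves run serially
        lo, hi = max(0, d - n + 1), min(d, n - 1)
        # Entries on one anti-diagonal are mutually independent, so this
        # inner loop is the one a "parallel for" would replace.
        for i in range(lo, hi + 1):
            j = d - i
            if d == 0:
                A[i][j] = 1  # hypothetical stencil: lattice-path counts
            else:
                up = A[i - 1][j] if i > 0 else 0
                left = A[i][j - 1] if j > 0 else 0
                A[i][j] = up + left
    return A
```

The longest dependency chain, from A[1,1] to A[n,n], contains at most $2n-1$ entries, which is why a span of $\Theta(n)$, and hence $\Theta(n)$ parallelism, is the most one could hope for.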