Question: Parallel programming lab: In this lab, you will be writing a program to approximate the value of the mathematical constant Pi, typically known by its

Parallel programming lab:

In this lab, you will be writing a program to approximate the value of the mathematical constant Pi, typically known by its mathematical symbol . By definition, Pi is the ratio of the circumference of a circle to its diameter. This ratio remains the same regardless of the size of the circle. Moreover, Pi is an irrational number, which means it is a real number with a nonrepeating decimal expansion. In other words, it is not representable as an integer ratio and the decimal point goes forever.

One of the simplest methods to calculate Pi accurately to a great number of decimal places is the Gregory-Leibniz series. Each time a new term is added the result gets closer and closer to Pi. The series is represented as follow. PI = 4 [ 1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + ...] With the help of a computer, a large number of terms can be used in estimating Pi. The actual value of Pi up to 20 digits is 3.14159265358979323846. The table below is generated from a serial implementation of the Gregory-Leibniz series. Notice that as more terms are added to the series, the approximation of Pi approaches the actual value of Pi, but the computation time increases noticeably. Using Gregory-Leibniz series, you will construct a data parallel solution. In other words, you will divide the input data between the processors, and each processor will compute its part or the answer. Fortunately, once the data is partitioned among the processors, the processors can compute their terms independently without the need to share any part of their answers until the very end of the computation. Such computations are known as embarrassingly parallel and are typically much easier to program than computations that need to share data (like the sums we have done in class).

To accomplish this we need a line of OpenMP which we have not learned yet. This line is is:

#pragma omp parallel num_threads(thread_count)

With this line, thread_count is equal to the number of cores you wish to run the program on. Note that data decomposition is needed to so that every thread can get equal number of terms. This can be done by dividing the number of terms by the number of threads. In the case when the number of terms in the series are not evenly distributed, then either first or last thread is assigned some additional terms. Consider the following code block.

int num_parts = total_terms/thread_count; int my_first_i = num_parts * my_rank; int my_last_i; if (my_rank == thread_count-1) my_last_i = total_terms; else my_last_i = my_first_i + num_parts;

The my_rank variable can be found by using the OMP function omp_get_thread_num(). At this point, all the threads know their range of terms (num_parts) to sum in the Gregory-Leibniz series. The variable my_first_i is the index of the first term and my_last_i 1 is the index of the last term assigned to a thread. The last thread gets few additional terms in case when the number terms (total_terms) are not evenly divided by the number of threads (thread_count). In the Gregory-Leibniz series, terms with an odd index are negative. Therefore, the first term of each thread is checked whether it lies in the even or odd position. The odd position terms are multiplied by negative one. An example is shown as below.

double factor; if (my_first_i%2 == 0.0) factor = 1.0; else factor = -1.0;

Each thread now calculates a partial sum using their assigned range of terms by the below code block. Each thread maintains a local copy of my_sum variable containing its partial sum of the series. Consider the following example.

double my_sum = 0.0; for(i = my_first_i ; i my_last_i ; i++, factor = -factor){ my_sum += factor/(2.0 * i + 1.0); }

After calculating the partial sum, each thread will want to update the global sum with their local sum. If all the threads add their local sum simultaneously to the global sum then the resultant value will be erroneous. Therefore, a synchronization method needs to be used to ensure there is only one update of the global sum by a thread at a time. OpenMP critical/atomic directive can be used to do that.

#pragma omp critical sum += my_sum;

(Actual Assignment)

Code a program to calculate the value of pi using the Gregory-Leibniz method. I want you to #include the files below which contain functions allowing you to time the execution of your program. Run the program 5 times with the thread_count variable set to 1. This is equivalent to running the program in sequential mode. Record the value of Pi you observe and the time it took to execute. Run the program 5 times with the thread_count variable set to 2. Record the value of Pi you observe and the time it took to execute. Run the program 5 times with the thread_count variable set to 4. Record the value of Pi you observe and the time it took to execute.

(Provided Header File)

#if !defined TIMER_H #define TIMER_H

#include #include //Returns the current time //Pre: None //Post: Returns the current time. time_t getTime ();

//Returns the difference between two variables of type time_h //Pre: Start is a time variable that occurs before end //Post: returns the amount of elapsed time between start and end. double totalTime (time_t start, time_t end);

#endif

(Provided .cpp File, I assume the starting point with appropriate function prototypes and definitions)

#include "Timer.h" time_t getTime () { return time (NULL); } double totalTime (time_t start, time_t end) { return difftime (end, start); }

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

MATHEMATICS FOR MACHINE LEARNING Marc Peter Deisenroth A. Aldo Faisal Cheng Soon Ong Contents Foreword 1 Part I Mathematical Foundations 9 1 Introduction and Motivation 11 1.1 Finding Words for...

re Regular Languages and Finite Automata (a) Let L be the set of all strings over the alphabet {a, b} that end in a and do not contain the substring bb. Describe a deterministic finite automaton...

Briefly describe ASCII and Unicode and draw attention to any relationship between them. [3 marks] (b) Briefly explain what a Reader is in the context of reading characters from data. [3 marks] A...

Microkernel operating systems aim to address perceived modularity and reliability issues in traditional "monolithic" operating systems. (i) Describe the typical architecture of a microkernel...

(a) In SystemVerilog, what is the difference between: (i) The ternary operator ? and if...then...else statements? [2 marks] (ii) always_ff and always_comb? [2 marks] (iii) Blocking, non-blocking and...

A creative engineer suggests structuring the TLB so that not all the bits of the presented address need match to result in a hit. Suggest how this might be achieved, and what might be the costs and...

(i) Write down the linear program relaxation for the vertex cover problem and solve the linear program. [6 marks] (ii) Based on the solution of the linear program in (b)(i), derive an integer...

Let A, B be sets. Define: (a) the Cartesian product (A B) (b) the set of relations R between A and B (c) the identity relation A on the set A [3 marks] Suppose S, T are relations between A and B, and...

What is Exercise 32-45? I don't quite understand it, so I'm not sure if my work or answers are correct. Linear Algebra Lab Exercises Linear algebra deals with vectors and linear functions that act on...

Why the Monte Carlo Method is so important today Article ID Dirk P. Kroese The University of Queensland Tim Brereton Ulm University Thomas Taimre The University of Queensland Zdravko I. Botev The...

What is the significant difference between a bondholder and a preferred shareholder, both of whom are entitled to a specified payment each year?

A cantilever beam of length L = 2m supports a load P = 8.0kN (see figure). The beam is made of wood with cross-sectional dimensions 120mm x 200mm. Calculate the shear stresses due to the load at...

Consider the following loan. Complete parts (a)-(c) below. An individual borrowed $74 comma 000 at an APR of 7%, which will be paid off with monthly payments of $603 for 18 years. Question content...

The parents of a twelve-year-old boy want to make quarterly payments into a college savings fund. If they want to have $39,866 saved in six years and are paying into an account that returns 3.5%...

What is inflation targeting, and how does it differ from artful management? What are the main benefits of inflation targeting, according to its supporters? Why do many economists think it is...

KEY QUESTION What is the basic determinant of ( a ) the transactions demand and ( b ) the asset demand for money? Explain how these two demands can be combined graphically to determine total money...

KEY QUESTION Assume that the following data characterize a hypothetical economy: money supply $200 billion; quantity of money demanded for transactions $150 billion; quantity of money demanded as...