Question: In this task, you will expand your implementation from Matrix-Vector multiplication (Question 1) to Matrix-Matrix multiplication using MPI. Each

In this task, you will expand your implementation from Matrix-Vector multiplication (Question 1) to Matrix-Matrix multiplication using MPI. Each process will be responsible for computing a portion of the product of two large matrices \( A \) and \( B \), resulting in matrix \( C = A \times B \).
Requirements:
1. Block Matrix Partitioning:
- Divide matrices \( A \) and \( B \) into blocks that can be distributed across multiple processes. This method should allow the matrices to be processed in parallel, where each process receives a block of rows and columns to compute part of the resulting matrix \( C \).
- Ensure that blocks of both matrices are distributed dynamically to prevent idle processes and to ensure balanced load distribution.
2. Dynamic Load Balancing:
- Similar to Question 1, processes should request additional blocks to process as they complete their assigned work. The master process should manage the assignment of new blocks to ensure efficient load balancing across all processes.
- Unlike static partitioning, where matrix blocks are predetermined, this dynamic load balancing approach should ensure that faster processes can continue working without waiting for others to finish.
3. Gathering the Result:
- Each process should compute its portion of the matrix product and send the results back to the master process. The master process will be responsible for gathering the partial results and assembling the full result matrix \( C \).
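The three requirements above can be sketched end to end without MPI. The following is only an illustration under stated assumptions, not the required implementation: threads and `queue.Queue` stand in for MPI ranks and messages, and the names `make_block_tasks`, `compute_block`, `worker`, and `multiply_dynamic`, as well as the tile size, are hypothetical.

```python
import queue
import threading


def make_block_tasks(n_rows, n_cols, block):
    """Split the n_rows x n_cols result C into (row0, row1, col0, col1) tiles."""
    return [(r, min(r + block, n_rows), c, min(c + block, n_cols))
            for r in range(0, n_rows, block)
            for c in range(0, n_cols, block)]


def compute_block(A, B, task):
    """Compute the tile C[row0:row1][col0:col1].

    Only rows row0..row1 of A and columns col0..col1 of B are read, which is
    the data a master would have to ship to a worker for this tile.
    """
    r0, r1, c0, c1 = task
    inner = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(c0, c1)]
            for i in range(r0, r1)]


def worker(A, B, tasks, results):
    """Pull the next tile as soon as the previous one is finished."""
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return  # no work left; plays the role of a "stop" message
        results.put((task, compute_block(A, B, task)))


def multiply_dynamic(A, B, block=2, n_workers=3):
    """Multiply A and B tile by tile with dynamically assigned tiles."""
    n_rows, n_cols = len(A), len(B[0])
    tasks, results = queue.Queue(), queue.Queue()
    for task in make_block_tasks(n_rows, n_cols, block):
        tasks.put(task)
    threads = [threading.Thread(target=worker, args=(A, B, tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # "Master" side: place each returned tile at its offset in C.
    C = [[0] * n_cols for _ in range(n_rows)]
    while not results.empty():
        (r0, r1, c0, c1), tile = results.get()
        for i in range(r0, r1):
            C[i][c0:c1] = tile[i - r0]
    return C
```

In an MPI version, the queue pull would typically become a request message to the master, which could receive from any worker using `MPI_ANY_SOURCE` and reply with either the next tile or a stop tag. Because of CPython's global interpreter lock, this thread-based sketch demonstrates only the assignment and assembly logic, not an actual speedup.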
Performance Testing:
As with the Matrix-Vector multiplication, run your Matrix-Matrix multiplication program on the Rushmore cluster (4 virtual machines) and compare the performance for different matrix sizes and process counts.
Use 3 different matrix sizes: 1000 × 1000, 5000 × 5000, and 10000 × 10000.
Compare the performance using 4, 8, and 16 processes, using the same setup:
- 4 processes: Run one process per virtual machine.
- 8 processes: Run two processes per virtual machine.
- 16 processes: Run four processes per virtual machine.
Analyze the computation time and reflect on the performance differences between Matrix-Vector and Matrix-Matrix multiplication, particularly focusing on how running multiple processes on a single machine impacts the overall efficiency.
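When comparing the two problems, it may help to estimate how much arithmetic and memory each matrix size involves before looking at timings. The sketch below uses the standard operation counts for dense multiplication (about \( 2n^2 \) for matrix-vector and \( 2n^3 \) for matrix-matrix) and assumes 8-byte double-precision elements; the function names are hypothetical.

```python
def matvec_flops(n):
    """Approximate floating-point operations for an n x n matrix times a vector."""
    return 2 * n * n


def matmul_flops(n):
    """Approximate floating-point operations for two dense n x n matrices."""
    return 2 * n ** 3


def matrix_bytes(n, bytes_per_element=8):
    """Memory for one dense n x n matrix (assumes 8-byte doubles by default)."""
    return n * n * bytes_per_element


for n in (1000, 5000, 10000):
    print(f"n={n}: matvec ~{matvec_flops(n):.1e} flops, "
          f"matmul ~{matmul_flops(n):.1e} flops, "
          f"{matrix_bytes(n) / 1e6:.0f} MB per matrix")
```

The matrix-matrix operation count is \( n \) times the matrix-vector count for the same \( n \), while the input data grows only by a second matrix, so each transferred element supports far more computation in the matrix-matrix case.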
Deliverables:
1. Source Code: Submit your source code with detailed comments explaining your block matrix partitioning strategy, how you implement dynamic load balancing, and the communication strategy used to distribute and gather blocks. Include the README.txt file.
2. Report:
- Comparison: Provide a detailed comparison between the performance of your Matrix-Vector multiplication implementation (from Question 1) and your Matrix-Matrix multiplication implementation.
- Performance Analysis: Analyze the performance of Matrix-Matrix multiplication for different numbers of processes (4, 8, 16) and varying matrix sizes. Present your results (table and graph), highlighting any speedups or bottlenecks you observe.
- Reflection: Reflect on the challenges you faced when implementing dynamic load balancing in Matrix-Matrix multiplication, particularly in comparison to Matrix-Vector multiplication. Discuss any performance trade-offs, challenges of block partitioning, and how the complexity of communication patterns has increased.
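For the results table, speedup and efficiency can be derived from the measured wall-clock times. The helper below is one possible way to do this, assuming the 4-process run is used as the baseline because the task does not specify a serial run; the names `speedup`, `efficiency`, and `report` are hypothetical, and the timings must come from your own measurements.

```python
def speedup(t_base, t_p):
    """How many times faster the run with time t_p is than the baseline run."""
    return t_base / t_p


def efficiency(t_base, t_p, p_base, p):
    """Speedup divided by the factor by which the process count grew."""
    return speedup(t_base, t_p) / (p / p_base)


def report(timings, p_base=4):
    """Build table rows from {process_count: seconds}.

    Each row is (processes, seconds, speedup, efficiency), with both ratios
    measured relative to the p_base run (an assumption of this sketch).
    """
    t_base = timings[p_base]
    return [(p, t, speedup(t_base, t), efficiency(t_base, t, p_base, p))
            for p, t in sorted(timings.items())]
```

Calling `report` once per matrix size with the measured times for 4, 8, and 16 processes yields the rows for that size. An efficiency well below 1 at 8 or 16 processes would point to contention from running several processes on the same virtual machine.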
