Commit

Merge pull request #99 from uvarc/staging
Added back a file lost from parallel tutorial
kah3f authored Mar 13, 2024
2 parents fd7f983 + e96acee commit 2b4f01a
Showing 3 changed files with 121 additions and 16 deletions.
@@ -1,5 +1,5 @@
---
title: "Getting Started with MPI"
title: "Basics of MPI Programming"
toc: true
type: docs
weight: 25
@@ -0,0 +1,109 @@
---
title: "Message Buffers"
toc: true
type: docs
weight: 23
menu:
parallel_programming:
parent: Distributed-Memory Programming
---

MPI documentation refers to "send buffers" and "receive buffers." These are _variables_ in the program whose contents are to be sent or received, and they must be set up by the programmer. The send and receive buffers cannot be the same variable unless the special value `MPI_IN_PLACE` is specified.

When a buffer is specified, the MPI library will look at the starting point in memory (the pointer to the variable). From other information in the call, it will compute the number of bytes to be sent or received. It will then set up a separate location in memory; this is the actual buffer. Often the buffer is not the same size as the original data since it is just used for streaming within the network. In any case, the application programmer need not be concerned about the details of the buffers and should just regard them as _variables_.

For the send buffer, MPI will copy the sequence of bytes into the buffer and send them over the appropriate network interface to the receiver. The receiver will acquire the stream of data into its receive buffer and copy it into the variable specified in the program.

### Buffer Datatypes

MPI supports most of the _primitive_ datatypes available in the target programming language, as well as a few others.

In every language, it is _imperative_ that the data types in the send and receive buffers match. If they do not, the result can be anything from garbage to a segmentation violation.
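
A minimal C sketch illustrating this: rank 0 sends four `double` values to rank 1 (assuming the program is run with at least two processes; the variable names are only illustrative). The ordinary arrays `send_values` and `values` serve as the send and receive buffers, and the count and datatype (`MPI_DOUBLE`, from the table below) match on both sides.

```
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double values[4] = {0.0, 0.0, 0.0, 0.0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double send_values[4] = {1.0, 2.0, 3.0, 4.0};
        /* send buffer: 4 elements described to MPI as MPI_DOUBLE */
        MPI_Send(send_values, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* receive buffer: the count and datatype match the send exactly */
        MPI_Recv(values, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %f %f %f %f\n",
               values[0], values[1], values[2], values[3]);
    }

    MPI_Finalize();
    return 0;
}
```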

#### C/C++

MPI supports most C/C++ datatypes as well as some extensions. The most commonly used are listed below.

{{< table >}}
| C/C++ type | MPI_Datatype |
|----------------|----------------|
| int | MPI_INT |
| short | MPI_SHORT |
| long | MPI_LONG |
| long long | MPI_LONG_LONG_INT |
| unsigned int | MPI_UNSIGNED |
| unsigned short | MPI_UNSIGNED_SHORT |
| unsigned long | MPI_UNSIGNED_LONG |
| unsigned long long | MPI_UNSIGNED_LONG_LONG |
| float | MPI_FLOAT |
| double | MPI_DOUBLE |
| long double | MPI_LONG_DOUBLE |
| char | MPI_CHAR |
| wchar_t | MPI_WCHAR |
{{< /table >}}

Specific to C:

{{< table >}}
| C type | MPI_Datatype |
|----------------|----------------|
| bool (\_Bool) | MPI_C_BOOL |
| float complex | MPI_C_COMPLEX |
| double complex | MPI_C_DOUBLE_COMPLEX |
{{< /table >}}

Specific to C++:

{{< table >}}
| C++ type | MPI_Datatype |
|----------------|----------------|
| bool | MPI_CXX_BOOL |
| std::complex\<float\> | MPI_CXX_FLOAT_COMPLEX |
| std::complex\<double\> | MPI_CXX_DOUBLE_COMPLEX |
{{< /table >}}

Extensions:

{{< table >}}
| C/C++ type | MPI_Datatype |
|----------------|----------------|
| none | MPI_BYTE |
| none | MPI_PACKED |
{{< /table >}}
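
The last two have no corresponding language type. `MPI_BYTE` describes raw, untyped bytes, while `MPI_PACKED` is used together with `MPI_Pack` and `MPI_Unpack` to bundle values of mixed types into a single message. A brief sketch of packing (the buffer size and names are arbitrary choices for illustration):

```
#include <mpi.h>

/* Fragment: pack an int and a double into one buffer and send it as MPI_PACKED. */
void send_packed(int n, double x, int dest)
{
    char buffer[64];   /* staging area large enough for both values */
    int position = 0;  /* running offset, advanced by each MPI_Pack call */

    MPI_Pack(&n, 1, MPI_INT,    buffer, sizeof(buffer), &position, MPI_COMM_WORLD);
    MPI_Pack(&x, 1, MPI_DOUBLE, buffer, sizeof(buffer), &position, MPI_COMM_WORLD);

    /* position now holds the number of packed bytes to send */
    MPI_Send(buffer, position, MPI_PACKED, dest, 0, MPI_COMM_WORLD);
}
```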

#### Fortran

{{< table >}}
| Fortran type | MPI_Datatype |
|----------------|--------------------|
| integer | MPI_INTEGER |
| integer\*8 | MPI_INTEGER8 |
| real | MPI_REAL |
| double precision | MPI_DOUBLE_PRECISION |
| complex | MPI_COMPLEX |
| logical | MPI_LOGICAL |
| character | MPI_CHARACTER |
| none | MPI_BYTE |
| none | MPI_PACKED |
{{< /table >}}

Most MPI distributions support the following types. These are Fortran 77 style declarations; newer code should use `KIND`, but care must be taken that the number of bytes specified is correct.

{{< table >}}
| Fortran type | MPI_Datatype |
|----------------|--------------------|
| integer\*16 | MPI_INTEGER16 |
| real\*8 | MPI_REAL8 |
| real\*16 | MPI_REAL16 |
{{< /table >}}

#### Python

As we have mentioned, the basic MPI communication routines are in the Communicator class of the MPI subpackage of mpi4py. Each communication routine has two forms: a lower-case version and another where the first letter of the method is upper case. The lower-case version can be used to send or receive an object; mpi4py pickles it before communicating. The argument of these routines is the object to be sent; the received object is the return value of the function.

The upper-case version works _only_ with buffered objects, usually NumPy ndarrays. Communicating ndarrays is faster and is recommended when possible. However, _every_ buffer must be an ndarray in this case, so even scalars must be placed into a one-element array. The upper-case buffered functions are more similar to the corresponding C/C++ functions. For the buffered functions, it is very important that the types match, so use of `dtype` is recommended in declaring NumPy arrays.

The mpi4py package supports the C datatypes, in the format `MPI.Dtype` rather than `MPI_Dtype`, but they are seldom required as an argument to the MPI functions. It is strongly recommended that the type of each NumPy array be explicitly declared with the `dtype` option, to ensure that the types match in both send and receive buffers.

{{< code-download file="/courses/parallel-computing-introduction/codes/mpi4py_ex.py" lang="python" >}}

@@ -12,9 +12,7 @@ MPI stands for _M_essage _P_assing _I_nterface. It is a standard establis

## Programming Languages

MPI is written in C and ships with bindings for Fortran. Bindings have been written for many other languages, including Python and R. C\+\+ programmers should use the C functions.

Many of the examples in this lecture are C or C\+\+ code, with some Fortran and Python examples as well. All of the C functions work the same for Fortran, with a slightly different syntax. They are mostly the same for Python but the most widely used set of Python bindings, `mpi4py`, was modeled on the deprecated C\+\+ bindings, as they are more "Pythonic."
MPI is written in C and ships with bindings for Fortran. Bindings have been written for many other languages, including Python and R. C\+\+ programmers should use the C functions. All of the C functions work the same for Fortran, with a slightly different syntax. They are mostly the same for Python but the most widely used set of Python bindings, `mpi4py`, was modeled on the deprecated C\+\+ bindings, as they are more "Pythonic."

Guides to the most-commonly used MPI routines for the three languages this course supports can be downloaded.

@@ -24,29 +22,34 @@ Guides to the most-commonly used MPI routines for the three languages this cours

[Python](/courses/parallel-computing-introduction/MPI_Guide_mpi4py.pdf)

## Process Management
## Processes and Messages

To MPI, a _process_ is a copy of your program's executable. MPI programs are run under the control of an executor or _process manager_. The process manager creates the requested number of processes on a specified list of hosts, assigns an identifier to each process, and then starts them.

MPI programs are run under the control of an executor or _process manager_. The process manager starts the requested number of processes on a specified list of hosts, assigns an identifier to each process, then starts the processes. Each copy has its own global variables\, stack\, heap\, and program counter.
The most important point to understand about MPI is that each process runs _independently_ of all the others. Each process has its own global variables, stack, heap, and program counter. Any communications are through the MPI library. If one process is to carry out some instructions differently from the others, conditional statements must be inserted into the program to identify the process and isolate those instructions.
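
For example, a common pattern is to test the process identifier and let only one process execute a particular block. A minimal C sketch (the names are only illustrative):

```
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's identifier */
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs); /* total number of processes */

    if (rank == 0) {
        /* only the process with identifier 0 executes this branch */
        printf("Running with %d processes\n", nprocs);
    }
    printf("Hello from process %d\n", rank);

    MPI_Finalize();
    return 0;
}
```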

Usually when MPI is run the number of processes is determined and fixed for the lifetime of the program. The MPI3 standard can spawn new processes but in a resource managed environment such as a high-performance cluster, the total number must still be requested in advance.
Processes send _messages_ to and receive them from one another. A message is a stream of bytes containing the values of variables that one process needs to pass to or retrieve from another one.

Usually when MPI is run the number of processes is determined and fixed for the lifetime of the program. The MPI-3 standard can spawn new processes, but in a resource managed environment such as a high-performance cluster, the total number must still be requested in advance.

MPI distributions ship with a process manager called `mpiexec` or `mpirun`. In some environments, such as many using Slurm, we use the Slurm process manager `srun`.

When run outside of a resource-managed system, we must specify the number of processes through a command-line option. If more than one host is to be used, the name of a hostlist file must be provided, or only the local host will be utilized. The options may vary depending on the distribution of MPI but will be similar to that below:
```
mpiexec -np 16 -hosts compute1,compute2 ./myprog
```

When running with `srun` under Slurm, the executor does _not_ require the `-np` flag; it computes the number of processes from the resource request. It is also aware of the hosts assigned by the scheduler.
```
srun ./myprog
```
### Message Envelopes

Just as a letter needs an envelope with an unambiguous address, a message needs to be uniquely identified. The _message envelope_ provides that identification. It consists of several components.

A _communicator_ is an object that specifies a group of processes that will communicate with one another. The default communicator is
`MPI_COMM_WORLD`. It includes all processes. The programmer can create new communicators, usually of subsets of the processes, but this is beyond our scope at this point.

In MPI the process ID is called the **rank**. Rank is relative to the communicator, and is numbered from zero. Process 0 is often called the *root process*.
In MPI the process ID is called the **rank**. Rank is relative to the communicator, and is numbered from zero.

A message is uniquely identified by its
- Source rank
@@ -56,11 +59,4 @@ A message is uniquely identified by its

The "tag" can often be set to an arbitrary value such as zero. It is needed only in cases where there may be multiple messages from the same source to the same destination in a short time interval, or a more complete envelope is desired for some reason.
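
On the receiving side, the envelope can also be matched loosely and then examined. In the sketch below (a fragment, assuming a single integer is expected), the wildcards `MPI_ANY_SOURCE` and `MPI_ANY_TAG` accept any source and tag, and the actual values are returned in the `MPI_Status` object.

```
#include <mpi.h>
#include <stdio.h>

/* Fragment: receive one int from any sender with any tag, then inspect the envelope. */
void receive_any(void)
{
    int value;
    MPI_Status status;

    MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);

    printf("Received %d from rank %d with tag %d\n",
           value, status.MPI_SOURCE, status.MPI_TAG);
}
```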

### Message Buffers

MPI documentation refers to "send buffers" and "receive buffers." These refer to _variables_ in the program whose contents are to be sent or received. These variables must be set up by the programmer. The send and receive buffers cannot be the same unless the special "receive buffer" `MPI_IN_PLACE` is specified.

When a buffer is specified, the MPI library will look at the starting point in memory (the pointer to the variable). From other information in the command, it will compute the number of bytes to be sent or received. It will then set up a separate location in memory; this is the actual buffer. Often the buffer is not the same size as the original data since it is just used for streaming within the network. In any case, the application programmer need not be concerned about the details of the buffers and should just regard them as _variables_.

For the send buffer, MPI will copy the sequence of bytes into the buffer and send them over the appropriate network interface to the receiver. The receiver will acquire the stream of data into its receive buffer and copy them into the variable specified in the program.
