A simple API to launch Python functions to run on multiple ranked processes, mpify is designed to enable interactive multiprocessing experiments in Jupyter/IPython, such as distributed data parallel training over multiple GPUs.

philtrade/mpify

Introduction

mpify is a thin library for running functions on multiple processes, and for training models in PyTorch's Distributed Data Parallel mode from within a Jupyter notebook running on a multi-GPU host.

The main features are:

  • Parallel but blocking execution: the call returns after all processes finish.
  • Results from any or all of the workers can be gathered.
  • No persistent worker pool to maintain. Each subprocess is spawned, executes, and terminates.
  • Each spawned subprocess can create its own GPU context, which suits a multi-GPU host.
  • Functions and objects created in the Jupyter notebook can be passed to the worker subprocesses by name.
  • import statements can be run in each subprocess before function execution.
  • Also works outside Jupyter, in batch Python apps.
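
The spawn-execute-terminate pattern in the list above can be sketched with the standard library alone. This is a minimal illustration, not mpify's API: `run_ranked`, `_worker`, and `square_rank` are hypothetical names, and the sketch uses only `multiprocessing`. It launches one process per rank, blocks until all of them finish, and returns the results ordered by rank.

```python
import multiprocessing as mp


def _worker(rank, world_size, fn, args, queue):
    """Run fn in a child process and report (rank, ok, payload) to the parent."""
    try:
        queue.put((rank, True, fn(rank, world_size, *args)))
    except Exception as exc:  # report the failure instead of leaving the parent waiting
        queue.put((rank, False, repr(exc)))


def run_ranked(fn, world_size, *args, start_method="spawn"):
    """Hypothetical helper (not part of mpify): spawn, execute, gather, terminate.

    Blocks until every rank has finished and returns results ordered by rank.
    """
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    procs = [
        ctx.Process(target=_worker, args=(rank, world_size, fn, args, queue))
        for rank in range(world_size)
    ]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    reports = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    results = [None] * world_size
    for rank, ok, payload in reports:
        if not ok:
            raise RuntimeError(f"rank {rank} failed: {payload}")
        results[rank] = payload
    return results


def square_rank(rank, world_size, offset):
    """Toy workload: each rank computes a value from its own rank number."""
    return rank * rank + offset
```

Calling `run_ranked(square_rank, 3, 10)` from a script should be done under an `if __name__ == "__main__":` guard when the `"spawn"` start method is used, and `fn` must then be importable by the child processes. Functions defined interactively in a notebook cell generally do not meet that requirement with the plain standard library, which is one of the gaps a notebook-oriented launcher has to close.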

mpify is mainly for interactive multiprocessing where parameters and results can be passed between the Jupyter cell and worker subprocesses via simple function call semantics.

For asynchronous and/or remote workloads (on a cluster), dask, ray, or ipyparallel are better choices, as they can manage persistent process pools, job scheduling, fault tolerance, etc.

Examples

  • Porting a training loop from a fastai2 notebook to train in distributed data parallel mode:

[Figure: original notebook cell]

[Figure: mpify-ed notebook cell]

More notebook examples can be found in the [examples/](/examples) directory.

Installation

python3 -m pip install git+https://github.com/philtrade/mpify

Documentation

The complete API documentation.

References

mpify was conceived to coordinate Python multiprocessing, Jupyter, and multiple CUDA GPUs on a single host.
