Commit
Update on "Configure RNGs appropriately for Pipeline + SPMD"
DTensor has existing RNG management. It requires a shared seed on every rank in its 'world' (the SPMD world). It then manages per-rank offsets with its own RNG tracker, so that ranks draw the same or different random values depending on the device mesh and the type of sharding used by the current operation. (TODO: link to docs)

When DTensor is used together with pipeline parallelism, it is important to use a different seed for each separate SPMD world. E.g. if the user specifies seed 1234, then we can use 1234 as-is for all the ranks on PP=0, but we should use a different seed (e.g. 1234 + 1) for ranks on PP=1. This partitions the world into PP separate SPMD worlds and uses a unique seed for each SPMD world.

Control 'deterministic' mode separately from RNG seed

The use case for 'deterministic' mode may be more for debugging, while users may want to control the RNG seeds used for real runs.

[ghstack-poisoned]
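The per-stage seeding rule described above can be illustrated with a minimal sketch. The helper names (`pp_rank_of`, `seed_for_rank`) and the rank layout (pipeline stage as the outermost dimension, so consecutive global ranks share a stage) are assumptions made for illustration and are not taken from this commit; the actual change may derive the pipeline rank from a device mesh instead.

```python
def pp_rank_of(global_rank: int, spmd_degree: int) -> int:
    """Hypothetical helper: map a global rank to its pipeline-stage index.

    Assumes the pipeline dimension is outermost, so each block of
    `spmd_degree` consecutive global ranks forms one SPMD world.
    """
    return global_rank // spmd_degree


def seed_for_rank(base_seed: int, pp_rank: int) -> int:
    """Hypothetical helper: offset the user seed by the pipeline-stage index.

    Every rank inside one SPMD world gets the same seed, and each
    SPMD world gets a distinct seed.
    """
    return base_seed + pp_rank


# Example: 8 ranks, 2 pipeline stages, 4 ranks per SPMD world.
base_seed = 1234
spmd_degree = 4
seeds = [
    seed_for_rank(base_seed, pp_rank_of(rank, spmd_degree))
    for rank in range(8)
]
# In a real job, each rank would pass its own value to torch.manual_seed(...)
# before DTensor's RNG tracker takes over the per-rank offsets.
```

With these assumptions, ranks 0 to 3 (PP=0) share seed 1234 and ranks 4 to 7 (PP=1) share seed 1235, which matches the example in the description.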