Releases
v0.8.0
[0.8.0] - 2021-09-26
Bug Fixes
Ci
Only run publish once on git tag
Core
Fix compressed buffer that cannot be scattered to an odd number of ranks
Other
Fix ci pypi versioning
Remove __init__.py and python __version__, use cargo version
Move import bagua_install_library to install library function
Merge bagua_install_library and setup.py, remove nccl<=2.6 support
Fix alltoall_v parameter (#17)
Reduce and allgather python interface
Fix decompress incorrect pointer and typo in error msg
Fix Python GIL deadlock when getting data pointer
Fix benchmark script requirements
Fix alltoall_v parameter types (#27)
Always mark bagua padding tensor as ready
Make compress/decompress of BaguaTensor method string consistent (#33)
Fix scatter and reduce_scatter implementation (#40)
Subtract overflow error for decentralized op (#39)
Fix QADAM params (#17)
Fix assert precision (#18)
Replace mutex with atomic bool for async op and add Aluminum submodule update (#67)
Fix duplicated dependency downloading during installation (#77)
Fix async algorithm aborting and hanging (#78, #81)
Fix qadam algorithm call (#20)
Fix missing symbols in the zip library (#24)
Fix random autotune server hang (#206)
Bagua-net library path mismatch, make --enable_bagua_net argument style consistent with other args (#218)
Python
Fix random autotune-service hang
Handle conflicts caused by sklearn upgrade (#225)
Features
Ci
Only publish pypi for master commits
Other
Add async model average algorithm (#110)
Add cached dataset wrapper (#148)
Support sync batchnorm (#151)
Add --enable-bagua-net option in launcher (#183)
Add pytorch examples for MNIST, ImageNet, SQuAD training (#1)
Add requirements.txt, only download dataset on local rank 0 (#2)
Add python packaging related files
Add __version__ variable
Install nccl deps in bagua core and add generated __version__ variable
Add version.py placeholder to prevent file not found error
Initial support for python op (#2)
Add 5 min timeout for buckets' comm op (#5)
Replace NCCL with Aluminum (#7)
Add synthetic benchmark script (#5)
Add elastic training example (#7)
Support alltoall_v (vector alltoall) (#14)
Add reduce and allgather python interface
Support reduce and allgather op with Reduction op enum
Support creating BaguaTensor by passing torch tensor directly (#19)
Compatible mode for getting pytorch tensor info with Python interpreter
Better debug log including tensor info when executing ops
Add native low precision decentralized operator (#26)
Add (scatter, gather, scatter_reduce) and all inplace version communication primitives (#37)
Make full precision decentralized op stateless (#36)
Add communication_primitives example (#12)
Use nccl 2.10 avg op for all algorithms using averaging (#46, #45)
Add opentelemetry to report tensor ready order (#42)
Add deterministic flag (#15)
Add native async model average algorithm (#41)
Add examples for async model average algorithm (#14)
Support packet splitting and multi-stream parallel transmission (#5)
Support ncclnet v3 and remove the dependency on nccl in the installation environment (#17)
Add sync interval param to async examples (#19)
Support tokio backend (#21)
Support bagua-net (#89)