Releases: pytorch/opacus
Opacus v1.5.2
New features
- Add a `double_backward` function that simplifies the training loop (#661); see the sketch below
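For illustration, a hedged sketch of how `double_backward` might slot into a Fast Gradient Clipping / Ghost Clipping training loop (the feature introduced in v1.5 below). The import path, the `(module, optimizer, per_sample_loss)` call signature, and `grad_sample_mode="ghost"` are assumptions made for this sketch, not confirmed by these notes; the toy model and privacy parameters are placeholders.

```python
import torch
import torch.nn as nn
from opacus import PrivacyEngine
# Assumed import path for the new utility (#661):
from opacus.utils.fast_gradient_clipping_utils import double_backward

model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
data_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,))),
    batch_size=8,
)
criterion = nn.CrossEntropyLoss(reduction="none")  # per-sample losses

privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
    grad_sample_mode="ghost",  # assumption: selects Ghost Clipping
)

for x, y in data_loader:
    if len(x) == 0:  # Poisson sampling can occasionally yield empty batches
        continue
    optimizer.zero_grad()
    per_sample_loss = criterion(model(x), y)
    # Assumed signature: one call wraps the two backward passes that
    # Fast Gradient / Ghost Clipping otherwise require.
    double_backward(model, optimizer, per_sample_loss)
    optimizer.step()
```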
Bug fixes
- Fix issue with setting `param_group` for the `DPOptimizer` wrapper (issue 649) (#660)
- Fix issue with the DDP optimizer for Fast Gradient Clipping: the step function incorrectly called `original_optimizer.original_optimizer` (#662)
- Replace `opt_einsum.contract` with `torch.einsum` (#663)
Opacus v1.5.1
Bug fixes
- Make the import of `opt_einsum.contract` (linear.py) explicit (#658)
Opacus v1.5
New features
- Fast Gradient Clipping and Ghost Clipping (#656)
Bug fixes
- Fix gradient shape error for `DPMultiheadAttention` (issue 650) (#651)
- Pass kwargs from `make_private` to `_prepare_optimizer` (#648)
- Fix `BatchMemoryManager` length (#641)
- Fix GPU-CPU device mismatch error in util `filter_dilated_rows` (#633)
- Fix Opacus's runtime error with an empty batch (issue 612) (#631)
Opacus v1.4.1
Opacus v1.4.0
Highlight: Upgraded to PyTorch 1.13+ as required dependency
New features
Bug fixes
Opacus v1.3
New features
- Implement the `PRVAccountant` based on the paper Numerical Composition of Differential Privacy (#493)
- Support `nn.EmbeddingBag` (#519)
Bug fixes
- Fix benchmarks (#503, #507, #508)
- Align `make_private_with_epsilon` with `make_private` (#509, #526)
- Test fixes (#513, #515, #527, #533)
- Summed discriminator losses to perform one backprop step (#474)
- Fixed issue with missing argument in MNIST example (#520)
- Functorch gradients: investigation and fix (#510)
- Support empty batches (#530)
Opacus v1.2
We're glad to present Opacus v1.2, which contains some major updates to per sample gradient computation mechanisms and includes all the good stuff from the recent PyTorch releases.
Highlights
Functorch - per sample gradients for all
With the recent release of functorch, it's now easy to compute per sample gradients for any module, without the limitations we had to impose before.
Here's the new default behaviour:
- First, we check if the input module contains any layers known to be incompatible with DP-SGD (e.g. BatchNorm). Note that these restrictions are fundamental to how DP-SGD works and will always be relevant.
- Then, for each layer we select a method of computing per sample gradients. For performance reasons, we still use the old, manually written grad samplers for the layers we support and fall back to the generic functorch-based grad sampler for all other layers.
You can also force the functorch-based grad sampler for every layer by passing `grad_sample_mode="functorch"` to `PrivacyEngine.make_private()` or `force_functorch=True` to `GradSampleModule`'s constructor.
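A minimal, self-contained sketch of forcing the functorch-based sampler (the toy model, data, and privacy parameters are placeholders; the training loop itself is the usual Opacus loop):

```python
import torch
import torch.nn as nn
from opacus import PrivacyEngine

# Toy model and data, used only to make the sketch runnable.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
data_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,))),
    batch_size=8,
)
criterion = nn.CrossEntropyLoss()

privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
    grad_sample_mode="functorch",  # functorch-based grad sampler for every layer
)

# The training loop is unchanged: clipping and noise happen inside optimizer.step().
for x, y in data_loader:
    if len(x) == 0:  # Poisson sampling can occasionally yield empty batches
        continue
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```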
If you're already using functorch in your training pipeline, consider using `GradSampleModuleNoOp` (`grad_sample_mode="no_op"`). As the name suggests, it performs no action and expects the client to compute per sample gradients themselves. See our CIFAR-10 example for a code demonstration.
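For illustration, a sketch of the no-op contract, assuming the client fills the same per-parameter `grad_sample` attribute that Opacus's built-in grad samplers populate; the naive per-sample loop below is a placeholder for a real functorch pipeline, and the model, data, and privacy parameters are made up:

```python
import torch
import torch.nn as nn
from opacus import PrivacyEngine

model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
data_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,))),
    batch_size=8,
)
criterion = nn.CrossEntropyLoss()

privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
    grad_sample_mode="no_op",  # Opacus will not compute per sample gradients
)

params = [p for p in model.parameters() if p.requires_grad]
for x, y in data_loader:
    if len(x) == 0:  # Poisson sampling can occasionally yield empty batches
        continue
    optimizer.zero_grad()
    # Client-side per sample gradients (naive loop for clarity);
    # assumption: the optimizer reads them from p.grad_sample.
    per_param = [[] for _ in params]
    for xi, yi in zip(x, y):
        loss_i = criterion(model(xi.unsqueeze(0)), yi.unsqueeze(0))
        for store, g in zip(per_param, torch.autograd.grad(loss_i, params)):
            store.append(g)
    for p, store in zip(params, per_param):
        p.grad_sample = torch.stack(store, dim=0)
    optimizer.step()  # clips per sample gradients, adds noise, updates weights
```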
Note that this functionality is still in beta and we haven't fully explored its limitations. Please report any weird behaviour or inconsistencies you encounter to our GitHub issues; we greatly appreciate the feedback.
ExpandedWeights - yet another way to compute per sample gradients
One more exciting feature now available in core PyTorch is `ExpandedWeights`. It uses the same approach as Opacus's old, manually written vectorized per sample gradient computations, but achieves much better performance.
To activate `ExpandedWeights`, pass `grad_sample_mode="ew"` to `PrivacyEngine.make_private()` or use `GradSampleModuleExpandedWeights` directly.
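Similarly, a compact sketch of selecting ExpandedWeights (requires PyTorch 1.13+, see the table below); the model and privacy parameters are placeholders, and the training loop is the same as in the functorch sketch above:

```python
import torch
import torch.nn as nn
from opacus import PrivacyEngine

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
data_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,))),
    batch_size=8,
)

privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
    grad_sample_mode="ew",  # per sample gradients via core PyTorch ExpandedWeights
)
```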
Summary: 3 different ways to compute per sample gradients
With the recent updates, Opacus now supports 3 different ways to compute per sample gradients. Below is a quick comparison; for more details, refer to the grad sample README.md.
TL;DR: If you want a stable implementation, use `GradSampleModule` (`grad_sample_mode="hooks"`).
If you want to experiment with the new functionality, you have two options: try `GradSampleModuleExpandedWeights` (`grad_sample_mode="ew"`) for better performance, or `grad_sample_mode="functorch"` if your model is not supported by `GradSampleModule`.
Please switch back to `GradSampleModule` (`grad_sample_mode="hooks"`) if you encounter strange errors or unexpected behaviour.
We'd also appreciate it if you report these to us.
| | Hooks | Expanded Weights | Functorch |
|---|---|---|---|
| Required PyTorch version | 1.8+ | 1.13+ | 1.12 (to be updated) |
| Development status | Underlying mechanism deprecated | Beta | Beta |
| Runtime performance† | baseline | ✅ ~25% faster | 0-50% slower |
| Any DP-allowed†† layers | Not supported | Not supported | ✅ Supported |
| Most popular nn.* layers | ✅ Supported | ✅ Supported | ✅ Supported |
| torchscripted models | Not supported | ✅ Supported | Not supported |
| Client-provided grad sampler | ✅ Supported | Not supported | ✅ Not needed |
| `batch_first=False` | ✅ Supported | Not supported | ✅ Supported |
| Recurrent networks | ✅ Supported | Not supported | ✅ Supported |
| Padding `same` in Conv | ✅ Supported | Not supported | ✅ Supported |
† Note that performance differences are unstable and can vary a lot depending on the exact model and batch size. The numbers above are averaged over benchmarks with small models consisting of convolutional and linear layers. Also note that performance differences are only observed for GPU training; CPU performance seems to be almost identical for all approaches.

†† Layers that produce joint computations on batch samples (e.g. BatchNorm) are not allowed under any approach.
Other improvements
- Fix `utils.unfold2d` with non-symmetric pad/dilation/kernel_size/stride (#443)
- Add support for "same" and "valid" padding in the hooks-based grad sampler for convolution layers
- Improve model validation to support frozen layers and catch copied parameters (#489)
- Remove annoying logging from `set_to_none` (#471)
- Improved documentation (#480, #478, #482, #485, #486, #487, #488)
- Integration test improvements (#407, #479, #481, #473)
Opacus v1.1.3
Opacus v1.1.2
Opacus v1.1.1
Bug fixes
- Fix accountant when using number of steps instead of epochs
- Add params check when converting BatchNorm to GroupNorm (#390)
- Fix typo in GDP accountant mechanism name (#386)
- Fix linter errors (#392)
- Add friendly and detailed message for unsupported layers (#401)
- Run linter on nightly workflow (#399)
- Add warning for Gaussian DP accounting (#400)
- Clone replacement modules on the same device as original (#356)
- Implementing 3D dilation (#408)
- fix(batch_memory_manager): Ensures split_idxs use native python types (#410)