Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ChrisWaites/pyvacy
Differentially Private Optimization for PyTorch 👁🙅♀️
- Host: GitHub
- URL: https://github.com/ChrisWaites/pyvacy
- Owner: ChrisWaites
- License: apache-2.0
- Created: 2019-03-22T22:05:42.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-05-26T15:29:38.000Z (over 4 years ago)
- Last Synced: 2024-03-15T04:11:49.611Z (11 months ago)
- Topics: differential-privacy, pytorch
- Language: Python
- Homepage:
- Size: 1.38 MB
- Stars: 179
- Watchers: 6
- Forks: 29
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- AwesomeResponsibleAI - PyVacy: Privacy Algorithms for PyTorch
README
PyVacy: Privacy Algorithms for PyTorch
Basically [TensorFlow Privacy](https://github.com/tensorflow/privacy), but for PyTorch.
DP-SGD implementation modeled after techniques presented within [Deep Learning with Differential Privacy](https://arxiv.org/abs/1607.00133) and [A General Approach to Adding Differential Privacy to Iterative Training Procedures](https://arxiv.org/abs/1812.06210).
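In outline, DP-SGD clips each per-example gradient to a fixed L2 bound, sums the clipped gradients, and adds Gaussian noise calibrated to that bound before the update is applied. The following is a minimal illustrative sketch of that core step, not PyVacy's internal implementation; the function name and the single flat parameter tensor are simplifying assumptions.

```python
import torch

def clip_and_noise(per_example_grads, l2_norm_clip, noise_multiplier):
    """Hypothetical helper: the clip-sum-noise step at the heart of DP-SGD."""
    clipped = []
    for g in per_example_grads:
        # Scale down any gradient whose L2 norm exceeds the clipping bound.
        factor = max(1.0, (g.norm(2) / l2_norm_clip).item())
        clipped.append(g / factor)
    total = torch.stack(clipped).sum(dim=0)
    # The noise standard deviation scales with both the clip bound and the
    # noise multiplier, which ties the added noise to the privacy guarantee.
    noise = torch.normal(0.0, noise_multiplier * l2_norm_clip, size=total.shape)
    return (total + noise) / len(per_example_grads)

# Example: three per-example gradients for a single flat parameter tensor.
grads = [torch.randn(4) for _ in range(3)]
print(clip_and_noise(grads, l2_norm_clip=1.0, noise_multiplier=1.1))
```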
## Example Usage
```python
import torch.nn as nn
from torch.utils.data import TensorDataset

from pyvacy import optim, analysis, sampling

training_parameters = {
    'N': len(train_dataset),
    # An upper bound on the L2 norm of each gradient update.
    # A good rule of thumb is to use the median of the L2 norms observed
    # throughout a non-private training loop.
    'l2_norm_clip': 1.0,
    # A coefficient used to scale the standard deviation of the noise applied to gradients.
    'noise_multiplier': 1.1,
    # Each example is selected for a minibatch with probability minibatch_size / N.
    # Hence this value is the expected size of each minibatch, not its exact size.
    'minibatch_size': 128,
    # Each minibatch is partitioned into distinct groups of this size.
    # The smaller this value, the less noise needs to be added to achieve the
    # same privacy guarantee, and convergence is likely faster, although the
    # runtime increases.
    'microbatch_size': 1,
    # The usual privacy parameter for (ε,δ)-differential privacy.
    # A generic choice is 1/(N^1.1), but it is very application dependent.
    'delta': 1e-5,
    # The number of minibatches to process in the training loop.
    'iterations': 15000,
}

model = nn.Sequential(...)
optimizer = optim.DPSGD(params=model.parameters(), **training_parameters)
epsilon = analysis.epsilon(**training_parameters)
loss_function = ...

minibatch_loader, microbatch_loader = sampling.get_data_loaders(**training_parameters)

for X_minibatch, y_minibatch in minibatch_loader(train_dataset):
    optimizer.zero_grad()
    for X_microbatch, y_microbatch in microbatch_loader(TensorDataset(X_minibatch, y_minibatch)):
        optimizer.zero_microbatch_grad()
        loss = loss_function(model(X_microbatch), y_microbatch)
        loss.backward()
        optimizer.microbatch_step()
    optimizer.step()
```
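Note that `analysis.epsilon` is called before the loop: the accountant depends only on the sampling and noise parameters, not on the model or data, so the (ε,δ) guarantee for the whole run can be computed up front. A one-line illustration (the formatting is ours, not part of PyVacy):

```python
# Continuing the example above.
print(f"Training satisfies ({epsilon:.2f}, {training_parameters['delta']})-differential privacy")
```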
## Tutorials

### `mnist.py`

Implements a basic classifier for identifying which digit a given MNIST image corresponds to. The model achieves a test set classification accuracy of 96.7%. The architecture and the results achieved are inspired by the corresponding tutorial in [TensorFlow Privacy](https://github.com/tensorflow/privacy/tree/master/tutorials).
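The exact architecture lives in `mnist.py`; purely as an illustration of how such a model plugs into the API above, a small classifier might be wired to the DP optimizer as follows. The layer sizes here are assumptions, not the repository's actual configuration.

```python
import torch.nn as nn
from pyvacy import optim

# Hypothetical layer sizes, for illustration only; see mnist.py for the
# architecture the tutorial actually uses.
model = nn.Sequential(
    nn.Flatten(),        # 1x28x28 MNIST images -> 784-dim vectors
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),  # one logit per digit class
)

# Same construction pattern as in the usage example above.
optimizer = optim.DPSGD(params=model.parameters(), **training_parameters)
```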
## Disclaimer
Do NOT use the contents of this repository in applications that handle sensitive data. The author accepts no liability for privacy infringements; use this code solely at your own discretion.