https://github.com/opendp/smartnoise-sdk
Tools and service for differentially private processing of tabular and relational data
differential-privacy opendp privacy smartnoise
Last synced: 6 months ago
- Host: GitHub
- URL: https://github.com/opendp/smartnoise-sdk
- Owner: opendp
- License: mit
- Created: 2019-09-23T18:06:00.000Z (almost 7 years ago)
- Default Branch: main
- Last Pushed: 2025-08-12T20:14:48.000Z (11 months ago)
- Last Synced: 2025-12-19T01:31:10.616Z (6 months ago)
- Topics: differential-privacy, opendp, privacy, smartnoise
- Language: Python
- Homepage:
- Size: 93.3 MB
- Stars: 288
- Watchers: 23
- Forks: 76
- Open Issues: 32
Metadata Files:
- Readme: README.md
- Contributing: contributing.rst
- License: LICENSE
- Security: SECURITY.md
Awesome Lists containing this project
- awesome-ai-security - OpenDP SmartNoise
- Pop_OS-Guide - Smart Noise - State-of-the-art differential privacy (DP) techniques that inject noise into data to prevent disclosure of sensitive information and manage exposure risk. (Tools / Winetricks)
- awesome-synthetic-data - OpenDP SmartNoise
README
[License: MIT](https://opensource.org/licenses/MIT)
# SmartNoise SDK: Tools for Differential Privacy on Tabular Data
The SmartNoise SDK includes 2 packages:
* [smartnoise-sql](sql/): Run differentially private SQL queries
* [smartnoise-synth](synth/): Generate differentially private synthetic data
To get started, see the examples below. Click into each project for more detailed examples.
## SQL
### Install
```bash
pip install smartnoise-sql
```
### Query
```python
import snsql
from snsql import Privacy
import pandas as pd
csv_path = 'PUMS.csv'
meta_path = 'PUMS.yaml'
data = pd.read_csv(csv_path)
privacy = Privacy(epsilon=1.0, delta=0.01)
reader = snsql.from_connection(data, privacy=privacy, metadata=meta_path)
result = reader.execute('SELECT sex, AVG(age) AS age FROM PUMS.PUMS GROUP BY sex')
print(result)
```
`PUMS.csv` and `PUMS.yaml` can be found in the [datasets](datasets/) folder.
See the [SQL project](sql/README.md) for more detailed examples.
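To give a feel for what the `epsilon` budget in `Privacy(...)` controls, here is a minimal, standard-library-only sketch of the textbook Laplace mechanism applied to a bounded mean. It is an illustration under stated assumptions, not how `smartnoise-sql` is implemented: the helper names `laplace_noise` and `dp_mean`, the even split of `epsilon` between the sum and the count, and the sample ages are all hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw from Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_mean(values, lower, upper, epsilon, rng):
    """Noisy mean = noisy clamped sum / noisy count, each spending epsilon / 2."""
    # Clamp every value so that a single row has bounded influence on the sum.
    clamped = [min(max(v, lower), upper) for v in values]
    half = epsilon / 2.0
    # Adding or removing one row changes the count by at most 1.
    noisy_count = len(clamped) + laplace_noise(1.0 / half, rng)
    # It changes the clamped sum by at most max(|lower|, |upper|).
    sensitivity = max(abs(lower), abs(upper))
    noisy_sum = sum(clamped) + laplace_noise(sensitivity / half, rng)
    # Guard against a tiny or negative noisy count.
    return noisy_sum / max(noisy_count, 1.0)


# Illustration only: `random` is not a cryptographically secure noise source.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
ages = [23, 35, 41, 52, 67, 29, 38, 44, 58, 31] * 100  # hypothetical data
true_mean = sum(ages) / len(ages)
estimate = dp_mean(ages, lower=0, upper=100, epsilon=1.0, rng=rng)
print(f"true mean: {true_mean:.2f}, private estimate: {estimate:.2f}")
```

A smaller `epsilon` means larger noise scales and therefore stronger privacy at the cost of accuracy. The `lower` and `upper` bounds are what make the sensitivity finite.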
## Synthesizers
### Install
```bash
pip install smartnoise-synth
```
### MWEM
```python
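import pandas as pd
import numpy as np
import snsynth  # provides MWEMSynthesizer, used below
pums_csv_path = 'PUMS.csv'  # adjust to the location of PUMS.csv from datasets/
pums = pd.read_csv(pums_csv_path, index_col=None) # in datasets/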
pums = pums.drop(['income'], axis=1)
nf = pums.to_numpy().astype(int)
synth = snsynth.MWEMSynthesizer(epsilon=1.0, split_factor=nf.shape[1])
synth.fit(nf)
sample = synth.sample(10) # get 10 synthetic rows
print(sample)
```
### PATE-CTGAN
```python
import pandas as pd
import numpy as np
from snsynth.pytorch.nn import PATECTGAN
from snsynth.pytorch import PytorchDPSynthesizer
pums_csv_path = 'PUMS.csv'  # adjust to the location of PUMS.csv from datasets/
pums = pd.read_csv(pums_csv_path, index_col=None) # in datasets/
pums = pums.drop(['income'], axis=1)
synth = PytorchDPSynthesizer(1.0, PATECTGAN(regularization='dragan'), None)
synth.fit(pums, categorical_columns=pums.columns.values.tolist())
sample = synth.sample(10) # synthesize 10 rows
print(sample)
```
See the [Synthesizers project](synth/README.md) for more detailed examples.
## Communication
- You are encouraged to join us on [GitHub Discussions](https://github.com/opendp/opendp/discussions/categories/smartnoise).
- Please use [GitHub Issues](https://github.com/opendp/smartnoise-sdk/issues) for bug reports and feature requests.
- For other requests, including security issues, please contact us at [smartnoise@opendp.org](mailto:smartnoise@opendp.org).
## Releases and Contributing
Please let us know if you encounter a bug by [creating an issue](https://github.com/opendp/smartnoise-sdk/issues).
We appreciate all contributions. Please review the [contributors guide](contributing.rst). We welcome pull requests with bug fixes without prior discussion.
If you plan to contribute new features, utility functions, or extensions to this system, please first open an issue and discuss the feature with us.