https://github.com/jimouris/curl
Curl: Private LLMs through Wavelet-Encoded Look-Up Tables
https://github.com/jimouris/curl
approximation dwt machine-learning mpc ppml privacy-preserving-machine-learning secure-multiparty-computation wavelet-transform wavelets
Last synced: about 1 month ago
JSON representation
Curl: Private LLMs through Wavelet-Encoded Look-Up Tables
- Host: GitHub
- URL: https://github.com/jimouris/curl
- Owner: jimouris
- License: mit
- Created: 2024-04-12T18:18:56.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-07T17:31:10.000Z (about 1 month ago)
- Last Synced: 2025-04-07T18:32:08.204Z (about 1 month ago)
- Topics: approximation, dwt, machine-learning, mpc, ppml, privacy-preserving-machine-learning, secure-multiparty-computation, wavelet-transform, wavelets
- Language: Python
- Homepage: https://eprint.iacr.org/2024/1127
- Size: 29 MB
- Stars: 13
- Watchers: 1
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
README
[](https://github.com/jimouris/curl/blob/main/LICENSE)
[](https://github.com/jimouris/curl/blob/main/CONTRIBUTING.md)
--------------------------------------------------------------------------------Curl is a framework for Privacy Preserving Machine Learning (PPML) that builds on top of [CrypTen](https://github.com/facebookresearch/CrypTen) and [PyTorch](https://github.com/pytorch/pytorch).
CrypTen relies on expensive polynomial approximations for evaluating non linear functions such as logarithm, square root, etc.
In contrast, Curl uses lookup tables (LUTs) encoded with Discrete Wavelet Transforms (DWT) to approximate non-linearities that result in faster evaluation while achieving better approximations.This way, in Curl we are able to evaluate Large Language Models (LLMs) such as GPT-2, GPT Neo, BERT (tiny, base, large).
Curl's goal and model is similar to CrypTen:> Its goal is to make secure computing techniques accessible to Machine Learning practitioners.
> It currently implements [Secure Multiparty Computation](https://en.wikipedia.org/wiki/Secure_multi-party_computation)
> as its secure computing backend and offers three main benefits to ML researchers:
>
> 1. It is machine learning first. The framework presents the protocols via a `CrypTensor`
> object that looks and feels exactly like a PyTorch `Tensor`. This allows the user to use
> automatic differentiation and neural network modules akin to those in PyTorch.
>
> 2. CrypTen is library-based. It implements a tensor library just as PyTorch does.
> This makes it easier for practitioners to debug, experiment on, and explore ML models.
>
> 3. The framework is built with real-world challenges in mind. CrypTen does not scale back or
> oversimplify the implementation of the secure protocols.## How to cite this work
Curl will appear in the proceedings of the Conference on Applied Machine Learning in Information Security (CAMLIS) 2024.
The preprint can be accessed [here](https://eprint.iacr.org/2024/1127); you can cite this work as follows:
```bibtex
@InProceedings{CAMLIS:SMUJRSV24,
author = "Manuel B. Santos and
Dimitris Mouris and
Mehmet Ugurbil and
Stanislaw Jarecki and
José Reis and
Shubho Sengupta and
Miguel de Vega",
title = "{Curl: Private LLMs through Wavelet-Encoded Look-Up Tables}",
pages = {1--31},
booktitle = {Proceedings of the Conference on Applied Machine Learning in Information Security},
address = {Arlington, Virginia, USA},
month = {October 24--25,},
year = 2024,
}
```The original CrypTen paper can be accessed [here](https://arxiv.org/pdf/2109.00984.pdf) (documented [here](https://crypten.readthedocs.io/en/latest/)); you can cite this work as follows:
```bibtex
@InProceedings{crypten2020,
author={B. Knott and S. Venkataraman and A.Y. Hannun and S. Sengupta and M. Ibrahim and L.J.P. van der Maaten},
title={CrypTen: Secure Multi-Party Computation Meets Machine Learning},
booktitle={arXiv 2109.00984},
year={2021},
}
```## Installing CrypTen (Curl)
CrypTen currently runs on Linux and Mac with Python 3.7.
We also support computation on GPUs.
Windows **is not** supported.
To install Curl, follow the instructions in the [CONTRIBUTING.md](https://github.com/jimouris/curl/blob/main/CONTRIBUTING.md) file.## Examples
CrypTen has a series of tutorial built on Jupyter notebooks in the [tutorials directory](./tutorials/) as well as examples in the [examples directory](./examples/).
We extend these with our LLM applications in the [LLMs directory](./examples/llms/), which you can run as:
```shell
❯❯ python examples/llms/launcher.py --world_size 2 --tensor_size 1,10 --multiprocess --model GPT2
```To see the full list of arguments and LLMs available run the script with the `--help` flag:
```shell
❯❯ python examples/llms/launcher.py --help
```## Disclaimer
This is software for a research prototype and not production-ready code.
This repository builds upon [CrypTen](https://github.com/facebookresearch/CrypTen) and [PyTorch](https://github.com/pytorch/pytorch).