https://github.com/khoaguin/split-learning-1d-he
Privacy-preserving Training of a Split 1D CNN Neural Network on Homomorphic Encrypted ECG Data to Detect Heart Diseases
- Host: GitHub
- URL: https://github.com/khoaguin/split-learning-1d-he
- Owner: khoaguin
- Created: 2021-09-14T09:30:46.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2023-12-31T15:27:32.000Z (10 months ago)
- Last Synced: 2024-10-07T12:22:35.780Z (about 1 month ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 65.4 MB
- Stars: 4
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Split Learning HE
![](./images/U-shapedSL.png)
Source code for the paper [Love or Hate? Share or Split? Privacy-Preserving Training Using Split Learning and Homomorphic Encryption](https://arxiv.org/pdf/2309.10517.pdf), accepted at the 20th Annual International Conference on Privacy, Security & Trust (PST'23).

Split learning involves two parties, a client and a server, that collaboratively train a model. The client keeps the data on its side, trains its part of the model to produce the activation maps, then sends those activation maps to the server, which continues the training process. This way, the client never needs to send its raw data to the server. However, the activation maps can still leak information about the client's data. In this project, we train a split 1D CNN on homomorphically encrypted activation maps to close this privacy leak.
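To make the data flow concrete, here is a minimal plaintext sketch of one forward pass through a split model. It is not the repository's code: the function names, layer sizes, and weights are all made up for illustration, and it assumes the usual U-shaped arrangement in which the client also computes the final loss so that labels stay on its side. In the HE variant, the activation map would be encrypted before it leaves the client.

```python
import math

# Toy U-shaped split (illustrative only): the client owns the first layer
# and the loss, the server owns the middle linear layer. Weights are made up.

def client_forward(signal, kernel):
    """Client side: 1D convolution (no padding) followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
        for i in range(len(signal) - k + 1)
    ]

def server_forward(activation, weights, bias):
    """Server side: one linear layer applied to the received activation map."""
    return [
        sum(w * a for w, a in zip(row, activation)) + b
        for row, b in zip(weights, bias)
    ]

def client_loss(logits, label):
    """Client side again: softmax cross-entropy, computed next to the labels."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

signal = [0.0, 1.0, 0.5, -0.5, 1.5]          # stand-in for one ECG window
kernel = [1.0, -1.0]                          # client's private conv weights
activation = client_forward(signal, kernel)   # the only tensor sent to the server
weights = [[0.5, 0.0, -0.5, 1.0], [-1.0, 0.5, 0.0, 0.25]]
logits = server_forward(activation, weights, bias=[0.0, 0.1])
loss = client_loss(logits, label=0)           # the raw signal never left the client
```

The point of the sketch is that `activation` is the only value crossing the network, which is exactly the value that can leak information and that the project encrypts.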
### Requirements
Essentially, we only need these 2 main libraries:
* `torch==1.10.0+cu102`
* `tenseal==0.3.10`
More detailed requirements are in the file `requirements.txt`.
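A quick way to confirm that the two pinned libraries are installed at the expected versions is sketched below. The helper name `check_pins` is hypothetical, and the exact string comparison is an assumption that may be too strict for your setup (for example, a CPU-only build of PyTorch reports a different local version suffix).

```python
from importlib import metadata

# Pins copied from the list above; `check_pins` itself is a hypothetical helper.
PINS = {"torch": "1.10.0+cu102", "tenseal": "0.3.10"}

def check_pins(pins):
    """Return {package: (wanted, installed_or_None, matches)} for each pin."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        report[name] = (wanted, installed, installed == wanted)
    return report

if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {installed} [{status}]")
```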
### Repository Structure
* `data/`
  * `train_ecg.hdf5` - the processed training split from the [MIT-DB](https://physionet.org/content/mitdb/1.0.0/) dataset
  * `test_ecg.hdf5` - the processed testing split from the [MIT-DB](https://physionet.org/content/mitdb/1.0.0/) dataset
  * `ptbxl_processing.ipynb` - code needed to process the [PTB-XL](https://physionet.org/content/ptb-xl/1.0.1/) dataset. Running the code will output `train_ptbxl.hdf5` and `test_ptbxl.hdf5`
* `local_plaintext/`
  * `train.ipynb` - code to train the 1D CNN locally on the MIT-DB dataset
  * `visual_invertibility.ipynb` - code to demonstrate the privacy leakage of the activation maps produced by the convolutional layers on the MIT-DB dataset
  * `train_ptbxl.ipynb` - code to train the 1D CNN locally on the PTB-XL dataset
  * `visual_invertibility_ptbxl.ipynb` - similar to `visual_invertibility.ipynb` but for the PTB-XL dataset
* `local_plaintext_big/`
  * `train.ipynb` - code to train the 1D CNN model on the MIT-DB dataset but with bigger activation maps
  * `visual_invertibility.ipynb` - code to demonstrate the privacy leakage
* `u_shaped_split_he/`
  * `client.py` and `server.py` - code for the client side and server side to train the split learning protocol using homomorphically encrypted activation maps on the MIT-DB dataset
  * `client_ptbxl.py` and `server_ptbxl.py` - similarly, but for the PTB-XL dataset
* `u_shaped_split_he_big/`
  * `client.py` and `server.py` - code to train the split 1D CNN using HE with a bigger activation map size, only for the MIT-DB dataset
* `u_shaped_split_plaintext/`
  * `client.py` and `server.py` - code for the client side and server side to train the split learning protocol on plaintext activation maps for the MIT-DB dataset
  * `client_ptbxl.py` and `server_ptbxl.py` - similarly, but for the PTB-XL dataset
* `u_shaped_split_plaintext_big/`
  * `client.py` and `server.py` - code for the client and the server to train the split learning protocol on plaintext activation maps with a bigger size, only for the MIT-DB dataset

### Running the code
Make sure you have the data files needed in the `data/` directory (`train_ecg.hdf5` and `test_ecg.hdf5` for the MIT-DB dataset, and `train_ptbxl.hdf5` and `test_ptbxl.hdf5` for the PTB-XL dataset).
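A small pre-flight check for these files might look like the sketch below. The helper `missing_data_files` and the dataset keys `"mitdb"` and `"ptbxl"` are hypothetical names chosen for this example; only the file names come from the list above.

```python
from pathlib import Path

# File names as listed above; the dictionary keys are illustrative labels.
REQUIRED = {
    "mitdb": ["train_ecg.hdf5", "test_ecg.hdf5"],
    "ptbxl": ["train_ptbxl.hdf5", "test_ptbxl.hdf5"],
}

def missing_data_files(dataset, data_dir="data"):
    """Return the required files for `dataset` that are absent from `data_dir`."""
    root = Path(data_dir)
    return [name for name in REQUIRED[dataset] if not (root / name).is_file()]

if __name__ == "__main__":
    for dataset in REQUIRED:
        missing = missing_data_files(dataset)
        print(dataset, "ready" if not missing else f"missing: {missing}")
```

Run it from the repository root so that the default `data_dir="data"` resolves correctly, or pass the path explicitly.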
To run an experiment, `cd` into the corresponding directory and start the server script first, then the client script. For example, to run the U-shaped split learning using HE for the PTB-XL dataset, do the following:
```
cd u_shaped_split_he
python server_ptbxl.py
```
Then open a new terminal tab, `cd` into the same directory, and run:
```
python client_ptbxl.py
```
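If you would rather start both sides from a single script, the two steps above could be wrapped roughly as follows. This is only a sketch: `build_commands` and `run_pair` are hypothetical helpers, the fixed `startup_delay` is a guess at how long the server needs before it accepts connections, and terminating a still-running server after the client exits is an assumption about what you want. Note that the `*_big` directories only contain the MIT-DB scripts.

```python
import subprocess
import sys
import time

# Script names from the repository structure; dictionary keys are illustrative.
SCRIPTS = {
    "mitdb": ("server.py", "client.py"),
    "ptbxl": ("server_ptbxl.py", "client_ptbxl.py"),
}

def build_commands(dataset):
    """Return (server_cmd, client_cmd) as argument lists for subprocess."""
    server, client = SCRIPTS[dataset]
    return [sys.executable, server], [sys.executable, client]

def run_pair(directory, dataset, startup_delay=5.0):
    """Start the server, wait briefly, run the client, then clean up."""
    server_cmd, client_cmd = build_commands(dataset)
    server = subprocess.Popen(server_cmd, cwd=directory)  # server goes first
    try:
        time.sleep(startup_delay)  # assumed time for the server to start listening
        return subprocess.run(client_cmd, cwd=directory).returncode
    finally:
        if server.poll() is None:  # server still running after the client exited
            server.terminate()
        server.wait()

# Example (not executed here): run_pair("u_shaped_split_he", "ptbxl")
```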