{"id":18800572,"url":"https://github.com/xtra-computing/fedsim","last_synced_at":"2025-07-31T00:41:14.818Z","repository":{"id":86257179,"uuid":"433972726","full_name":"Xtra-Computing/FedSim","owner":"Xtra-Computing","description":"A coupled vertical federated learning framework that boosts the model performance with record similarities (NeurIPS 2022)","archived":false,"fork":false,"pushed_at":"2023-03-28T06:37:20.000Z","size":6658,"stargazers_count":26,"open_issues_count":0,"forks_count":5,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-13T17:49:58.513Z","etag":null,"topics":["federated-learning","pytorch","vertical-federated-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Xtra-Computing.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-01T20:20:21.000Z","updated_at":"2025-02-26T06:31:51.000Z","dependencies_parsed_at":null,"dependency_job_id":"7d4e04b6-5bfe-483e-b7c9-f2e404bbfee1","html_url":"https://github.com/Xtra-Computing/FedSim","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Xtra-Computing/FedSim","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xtra-Computing%2FFedSim","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xtra-Computing%2FFedSim/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xtra-Computing%2FFedSim/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/Xtra-Computing%2FFedSim/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Xtra-Computing","download_url":"https://codeload.github.com/Xtra-Computing/FedSim/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xtra-Computing%2FFedSim/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267967720,"owners_count":24173566,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-30T02:00:09.044Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["federated-learning","pytorch","vertical-federated-learning"],"created_at":"2024-11-07T22:19:08.503Z","updated_at":"2025-07-31T00:41:14.784Z","avatar_url":"https://github.com/Xtra-Computing.png","language":"Python","readme":"# FedSim \n[![GitHub license](https://img.shields.io/github/license/Xtra-Computing/FedSim)](https://github.com/Xtra-Computing/FedSim/edit/main/LICENSE)\n![PyTorch](https://img.shields.io/badge/torch-1.8.2-orange)\n\n\n\nFedSim is a **coupled vertical federated learning framework** that boosts the training with record similarities.\n\n\n## Requirements\n1. Install conda 4.14 following https://www.anaconda.com/products/distribution\n2. Clone this repo by\n```bash\ngit clone https://github.com/JerryLife/FedSim.git\n```\n3. 
Create environment (named `fedsim`) and install required basic modules.\n```bash\nconda env create -f environment.yml\nconda activate fedsim\n```\n4. Install `torch` and `torchvision` according to your CUDA version with `pip`. For RTX 3090, we installed `torch==1.8.2` and `torchvision==0.9.2` as below.\n```bash\npip3 install torch==1.8.2 torchvision==0.9.2 --extra-index-url https://download.pytorch.org/whl/lts/1.8/cu111\n``` \n5. Ensure all the required folders are created (which should exist upon git clone).\n```bash\nmkdir -p runs ckp log cache\n```\n## Datasets\nIn this repo, due to the size limit, we include two datasets `house` and `hdb` in the `data/` folder.\n```\ndata\n├── beijing \t\t\t\t(house)\n│   ├── airbnb_clean.csv\t(Secondary)\n│   └── house_clean.csv\t\t(Primary)\n└── hdb\t\t\t\t\t\t(hdb)\n    ├── hdb_clean.csv\t\t(Primary)\n    └── school_clean.csv\t(Secondary)\n```\n## Linkage and Training\nThe linkage and training of each dataset are combined in a single script.\n### FedSim without adding noise\nThe scripts without adding noise are located under `src/` in the format of `src/train_\u003cdataset\u003e_\u003calgorithm\u003e.py`. You can run each script by\n\n\n\u003e python src/train_\u003cdataset\u003e_\u003calgorithm\u003e.py [-g gpu_index] [-p perturbed_noise_on_similarity] [-k number_of_neighbors] [--mlp-merge] [-ds] [-dw]\n\n* `-g/--gpu`: GPU index to run this script. If the GPU of this index is not available, CPU will be used instead.\n* `-k/--top-k`: Number of neighbors to extract from possible matches, which should be less than the value of \"knn_k\". ($K$ in the paper)\n* `-p/--leak-p`: The probability of leakage of Bloom filters. 
($\\tau$ in the paper)\n* `--mlp-merge`: whether to replace the CNN merge model with an MLP merge model\n* `-ds/--disable-sort`: whether to disable the sort gate\n* `-dw/--disable-weight`: whether to disable the weight gate\n\nTaking the house dataset as an example:\n```bash\npython src/train_beijing_fedsim.py -g 1 -p 1e0 -k 5 -ds\n```\nruns FedSim on the house dataset with $\\tau=1$ (no added noise), $K=5$, merging with CNN, disabling the sort gate, and enabling the weight gate.\n\n### FedSim with noise added\nThe scripts with added noise are located in `src/priv_scripts` in the same format as the scripts without noise. The only differences are some hyperparameter settings. You can run these scripts with a similar command. For example,\n```bash\npython src/train_beijing_fedsim.py -g 1 -p 1e-2 -k 5 -ds\n```\nruns FedSim on the house dataset with noise satisfying $\\tau=0.01$ added, $K=5$, merging with CNN, disabling the sort gate, and enabling the weight gate.\n\n## Citation\n```bib\n@inproceedings{NEURIPS2022_84b74416,\n author = {Wu, Zhaomin and Li, Qinbin and He, Bingsheng},\n booktitle = {Advances in Neural Information Processing Systems},\n editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},\n pages = {21087--21100},\n publisher = {Curran Associates, Inc.},\n title = {A Coupled Design of Exploiting Record Similarity for Practical Vertical Federated Learning},\n url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/84b744165a0597360caad96b06e69313-Paper-Conference.pdf},\n volume = {35},\n year = {2022}\n}\n```\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxtra-computing%2Ffedsim","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxtra-computing%2Ffedsim","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxtra-computing%2Ffedsim/lists"}