{"id":13689820,"url":"https://github.com/SMILELab-FL/FedLab","last_synced_at":"2025-05-02T06:31:34.256Z","repository":{"id":38259052,"uuid":"346206756","full_name":"SMILELab-FL/FedLab","owner":"SMILELab-FL","description":"A flexible Federated Learning Framework based on PyTorch, simplifying your Federated Learning research.","archived":false,"fork":false,"pushed_at":"2024-08-12T06:11:08.000Z","size":89725,"stargazers_count":760,"open_issues_count":7,"forks_count":129,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-04-09T06:02:04.024Z","etag":null,"topics":["deep-learning","federated-learning","federated-learning-framework","fedlab","machine-learning","pytorch","pytorch-federated-learning"],"latest_commit_sha":null,"homepage":"https://fedlab.readthedocs.io","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SMILELab-FL.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-10T02:27:08.000Z","updated_at":"2025-04-09T04:14:19.000Z","dependencies_parsed_at":"2024-06-19T11:23:40.333Z","dependency_job_id":"360eac16-6852-4120-b0d5-6f55f5fe72f5","html_url":"https://github.com/SMILELab-FL/FedLab","commit_stats":{"total_commits":941,"total_committers":7,"mean_commits":"134.42857142857142","dds":"0.49202975557917106","last_synced_commit":"02f2749c704792d7a6f8ff9ae585604486cc04a0"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SMILELab-FL%2FFedLab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SMILELab-FL%2FFedLab/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SMILELab-FL%2FFedLab/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SMILELab-FL%2FFedLab/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SMILELab-FL","download_url":"https://codeload.github.com/SMILELab-FL/FedLab/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251998467,"owners_count":21677987,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","federated-learning","federated-learning-framework","fedlab","machine-learning","pytorch","pytorch-federated-learning"],"created_at":"2024-08-02T16:00:27.937Z","updated_at":"2025-05-02T06:31:29.366Z","avatar_url":"https://github.com/SMILELab-FL.png","language":"Jupyter Notebook","funding_links":[],"categories":["Open-source FL Framework","Libraries(Which support Asynchronous Federated Learning)","federated learning framework","Framework"],"sub_categories":["Vertical FL","2018","table","2022"],"readme":"\u003cp 
align=\"center\"\u003e\u003cimg src=\"./docs/imgs/FedLab-logo.svg?raw=True\" width=700\u003e\u003c/p\u003e\n\n# FedLab: A Flexible Federated Learning Framework\n\n[![GH Actions Tests](https://github.com/SMILELab-FL/FedLab/actions/workflows/CI.yml/badge.svg)](https://github.com/SMILELab-FL/FedLab/actions) [![Documentation Status](https://readthedocs.org/projects/fedlab/badge/?version=master)](https://fedlab.readthedocs.io/en/master/?badge=master) [![License](https://img.shields.io/github/license/SMILELab-FL/FedLab)](https://opensource.org/licenses/Apache-2.0) [![codecov](https://codecov.io/gh/SMILELab-FL/FedLab/branch/master/graph/badge.svg?token=4HHB5JCSC6)](https://codecov.io/gh/SMILELab-FL/FedLab) [![arXiv](https://img.shields.io/badge/arXiv-2107.11621-red.svg)](https://arxiv.org/abs/2107.11621) [![Pyversions](https://img.shields.io/pypi/pyversions/fedlab.svg?style=flat-square)](https://pypi.python.org/pypi/fedlab)\n\n\nFederated learning (FL), proposed by Google at the very beginning, is recently a burgeoning research area of machine learning, which aims to protect individual data privacy in the distributed machine learning processes, especially in ﬁnance, smart healthcare, and edge computing. Different from traditional data-centered distributed machine learning, participants in the FL setting utilize localized data to train local models, then leverages speciﬁc strategies with other participants to acquire the ﬁnal model collaboratively, avoiding direct data-sharing behavior.\n\nTo relieve the burden of researchers in implementing FL algorithms and emancipate FL scientists from the repetitive implementation of basic FL settings, we introduce a highly customizable framework __FedLab__ in this work. __FedLab__ provides the necessary modules for FL simulation, including ***communication***, ***compression***, ***model optimization***, ***data partition*** and other ***functional modules***. Users can build an FL simulation environment with custom modules like playing with LEGO bricks. For better understanding and easy usage, the FL baseline algorithms implemented via __FedLab__ are also presented.\n\n\n## Quick start\n\n### Install\n\n- Install the latest version from source code:\n```\n$ git clone git@github.com:SMILELab-FL/FedLab.git\n$ cd FedLab\n$ pip install -r requirements.txt\n```\n\n- Install the stable version (old version) via pip:\n```\n# assign the version fedlab==1.3.0\n$ pip install fedlab \n```\n\n### Learning materials\n\nWe provide tutorials in jupyter notebook format for FedLab beginners in FedLab\\tutorials. These tutorials include data partition, customized algorithms, and pipeline demos. For the FedLab or FL beginners, we recommend this [notebook](tutorials/pipeline_tutorial.ipynb). Furthermore, we provide reproductions of federated algorithms via FedLab, which are stored in fedlab.contirb.algorithm. 
## Architecture

The file architecture of FedLab, which may help users understand this repo:

```
├── fedlab
│   ├── contrib
│   ├── core
│   ├── models
│   └── utils
├── datasets
│   └── ...
├── examples
│   ├── asynchronous-cross-process-mnist
│   ├── cross-process-mnist
│   ├── hierarchical-hybrid-mnist
│   ├── network-connection-checker
│   ├── scale-mnist
│   └── standalone-mnist
└── tutorials
    ├── communication_tutorial.ipynb
    ├── customize_tutorial.ipynb
    ├── pipeline_tutorial.ipynb
    └── ...
```

## Baselines

We provide reproductions of baseline federated algorithms in this repo; a minimal sketch of the FedAvg server update follows the table.

| Method              | Type   | Paper                                                        | Publication  | Official code                                        |
| ------------------- | ------ | ------------------------------------------------------------ | ------------ | ---------------------------------------------------- |
| FedAvg              | Optim. | [Communication-Efficient Learning of Deep Networks from Decentralized Data](http://proceedings.mlr.press/v54/mcmahan17a/mcmahan17a.pdf) | AISTATS'2017 |                                                      |
| FedProx             | Optim. | [Federated Optimization in Heterogeneous Networks](https://arxiv.org/abs/1812.06127) | MLSys'2020   | [Code](https://github.com/litian96/FedProx)          |
| FedDyn              | Optim. | [Federated Learning Based on Dynamic Regularization](https://openreview.net/forum?id=B7v4QMR6Z9w) | ICLR'2021    | [Code](https://github.com/alpemreacar/FedDyn)        |
| q-FFL               | Optim. | [Fair Resource Allocation in Federated Learning](https://arxiv.org/abs/1905.10497) | ICLR'2020    | [Code](https://github.com/litian96/fair_flearn)      |
| FedNova             | Optim. | [Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization](https://proceedings.neurips.cc/paper/2020/hash/564127c03caab942e503ee6f810f54fd-Abstract.html) | NeurIPS'2020 | [Code](https://github.com/JYWa/FedNova)              |
| IFCA                | Optim. | [An Efficient Framework for Clustered Federated Learning](https://proceedings.neurips.cc/paper/2020/hash/e32cc80bf07915058ce90722ee17bb71-Abstract.html) | NeurIPS'2020 | [Code](https://github.com/jichan3751/ifca)           |
| Ditto               | Optim. | [Ditto: Fair and Robust Federated Learning Through Personalization](https://arxiv.org/abs/2012.04221) | ICML'2021    | [Code](https://github.com/litian96/ditto)            |
| SCAFFOLD            | Optim. | [SCAFFOLD: Stochastic Controlled Averaging for Federated Learning](https://arxiv.org/abs/1910.06378) | ICML'2020    |                                                      |
| Personalized-FedAvg | Optim. | [Improving Federated Learning Personalization via Model Agnostic Meta Learning](https://arxiv.org/pdf/1909.12488.pdf) | Pre-print    |                                                      |
| CFL                 | Optim. | [Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints](https://arxiv.org/abs/1910.01991) | IEEE TNNLS'2020 | [Code](https://github.com/felisat/clustered-federated-learning) |
| Power-of-choice     | Misc.  | [Client Selection in Federated Learning: Convergence Analysis and Power-of-Choice Selection Strategies](https://arxiv.org/abs/2010.01243) | AISTATS'2021 |                                                      |
| QSGD                | Com.   | [QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding](https://proceedings.neurips.cc/paper/2017/hash/6c340f25839e6acdc73414517203f5f0-Abstract.html) | NeurIPS'2017 |                                                      |
| NIID-Bench          | Data.  | [Federated Learning on Non-IID Data Silos: An Experimental Study](https://arxiv.org/abs/2102.02079) | ICDE'2022    | [Code](https://github.com/Xtra-Computing/NIID-Bench) |
| LEAF                | Data.  | [LEAF: A Benchmark for Federated Settings](http://arxiv.org/abs/1812.01097) | Pre-print    | [Code](https://github.com/TalwalkarLab/leaf/)        |
| ...                 |        |                                                              |              |                                                      |
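The FedAvg server update in the table above is a sample-size-weighted average of client models. A minimal, framework-free sketch in plain PyTorch (the helper `fedavg_aggregate` is our illustration, not FedLab's API):

```python
import torch

def fedavg_aggregate(client_states, client_weights):
    """FedAvg server step: weighted average of client state_dicts.

    client_states:  list of state_dicts returned by the sampled clients.
    client_weights: per-client weights, e.g. local sample counts.
    """
    total = float(sum(client_weights))
    return {
        key: sum(state[key].float() * (w / total)
                 for state, w in zip(client_states, client_weights))
        for key in client_states[0]
    }

# usage: global_model.load_state_dict(fedavg_aggregate(states, sample_counts))
```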
## Datasets & Data Partition

Real-world FL must handle a variety of data distribution scenarios, both IID and non-IID. Although datasets and partition schemes already exist for published benchmarks, it can still be messy and hard for researchers to partition datasets according to their specific research problems and to keep track of partition results during simulation. __FedLab__ provides [`fedlab.utils.dataset.partition.DataPartitioner`](https://fedlab.readthedocs.io/en/master/autoapi/fedlab/utils/dataset/partition/index.html#fedlab.utils.dataset.partition.DataPartitioner), which works with pre-partitioned datasets as well as your own data. A `DataPartitioner` stores the sample indices of each client under a given partition scheme. FedLab also provides extra datasets used in current FL research that are not yet available in PyTorch's `torchvision.datasets`.
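Typical usage looks like the sketch below (assuming the `CIFAR10Partitioner` interface from the FedLab documentation; argument names may vary across versions):

```python
import torchvision
from fedlab.utils.dataset.partition import CIFAR10Partitioner

trainset = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)

# Hetero Dirichlet partition (scheme 3 below) over 100 clients.
partitioner = CIFAR10Partitioner(
    trainset.targets,        # the partitioner only needs the labels
    num_clients=100,
    balance=None,            # None selects the hetero (unbalanced) variant
    partition="dirichlet",
    dir_alpha=0.3,
    seed=0,
)
client_indices = partitioner.client_dict  # {client_id: array of sample indices}
```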
### Data Partition

We provide multiple data partition schemes used in recent FL papers [[1]](#1)[[2]](#2)[[3]](#3). Below we visualize these partitions on several commonly used datasets.

#### 1. Balanced IID partition

Each client has the same number of samples and the same class distribution.

Given 100 clients and CIFAR10, the data samples assigned to the first 10 clients could be:

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/cifar10_balance_iid_100clients.png" height="200"></p>

#### 2. Unbalanced IID partition

Sample numbers per client are drawn from a log-normal distribution $\text{Log-N}(0, \sigma^2)$, while the class distribution stays the same across clients.

Given $\sigma=0.3$, 100 clients, and CIFAR10, the samples assigned to the first 10 clients are shown below on the left, and the distribution of sample numbers over clients on the right.

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/cifar10_unbalance_iid_unbalance_sgm_0.3_100clients.png" height="200">&nbsp;&nbsp;&nbsp;<img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/cifar10_unbalance_iid_unbalance_sgm_0.3_100clients_dist.png" height="200"></p>

#### 3. Hetero Dirichlet partition

Non-IID partition used in [[3]](#3) and [[6]](#6). Both the number of samples and the class proportions per client are unbalanced. Samples are partitioned over $J$ clients by sampling $p_k \sim \text{Dir}_J(\alpha)$ and allocating a $p_{k,j}$ proportion of the samples of class $k$ to client $j$. A NumPy sketch of this scheme follows the figures.

Given 100 clients, $\alpha=0.3$, and CIFAR10, the samples assigned to the first 10 clients are shown below on the left, and the distribution of sample numbers over clients on the right.

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/cifar10_hetero_dir_0.3_100clients.png" height="200">&nbsp;&nbsp;&nbsp;<img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/cifar10_hetero_dir_0.3_100clients_dist.png" height="200"></p>
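The hetero Dirichlet scheme fits in a few lines of NumPy. This is our illustrative sketch, not FedLab's implementation:

```python
import numpy as np

def hetero_dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices over clients with Dir(alpha) class proportions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for k in np.unique(labels):
        idx_k = rng.permutation(np.where(labels == k)[0])
        # p_k ~ Dir_J(alpha): proportion of class k given to each client
        p_k = rng.dirichlet(np.repeat(alpha, num_clients))
        # turn proportions into split points over the class-k indices
        splits = (np.cumsum(p_k)[:-1] * len(idx_k)).astype(int)
        for cid, part in enumerate(np.split(idx_k, splits)):
            client_indices[cid].extend(part.tolist())
    return {cid: np.array(idx) for cid, idx in enumerate(client_indices)}

# e.g. hetero_dirichlet_partition(trainset.targets, num_clients=100, alpha=0.3)
```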
#### 4. Shards partition

Non-IID partition based on shards, used in [[4]](#4): samples are sorted by label, divided into equal-size shards, and each client receives an equal number of randomly assigned shards (see the sketch after the figure).

Given `shard_number=200`, 100 clients, and CIFAR10, the data samples assigned to the first 10 clients could be:

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/cifar10_shards_200_100clients.png" height="200"></p>
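Again as an illustrative NumPy sketch (ours, not FedLab's implementation):

```python
import numpy as np

def shards_partition(labels, num_clients, num_shards, seed=0):
    """Sort-by-label shards partition in the style of FedAvg's non-IID setup."""
    rng = np.random.default_rng(seed)
    shards_per_client = num_shards // num_clients
    sorted_idx = np.argsort(np.asarray(labels))      # group samples by label
    shards = np.array_split(sorted_idx, num_shards)  # equal-size label shards
    order = rng.permutation(num_shards)              # shuffle shard assignment
    return {
        cid: np.concatenate(
            [shards[s] for s in
             order[cid * shards_per_client:(cid + 1) * shards_per_client]])
        for cid in range(num_clients)
    }

# e.g. shards_partition(trainset.targets, num_clients=100, num_shards=200)
```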
#### 5. Balanced Dirichlet partition

Non-IID partition used in [[5]](#5). Each client has the same number of samples, while the class distribution in each client follows a Dirichlet distribution $\text{Dir}(\alpha)$.

Given $\alpha=0.3$, 100 clients, and CIFAR10, the data samples assigned to the first 10 clients could be:

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/cifar10_balance_dir_alpha_0.3_100clients.png" height="200"></p>

#### 6. Unbalanced Dirichlet partition

Non-IID partition used in [[5]](#5). Sample numbers per client are drawn from a log-normal distribution $\text{Log-N}(0, \sigma^2)$, while the class distribution in each client follows a Dirichlet distribution $\text{Dir}(\alpha)$.

Given $\sigma=0.3$, $\alpha=0.3$, 100 clients, and CIFAR10, the samples assigned to the first 10 clients are shown below on the left, and the distribution of sample numbers over clients on the right.

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/cifar10_unbalance_dir_alpha_0.3_unbalance_sgm_0.3_100clients.png" height="200">&nbsp;&nbsp;&nbsp;<img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/cifar10_unbalance_dir_alpha_0.3_unbalance_sgm_0.3_100clients_dist.png" height="200"></p>

#### 7. Quantity-based Label Distribution Skew partition

Non-IID partition used in [[1]](#1). Each client holds samples from only a fixed number of classes.

Given 3 classes per client, 10 clients, and FashionMNIST, the data samples assigned to each client could be:

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/fmnist_noniid-label_3_clients_10.png" height="200"></p>

#### 8. Noise-based Feature Distribution Skew partition

Non-IID partition used in [[1]](#1). Each client's sample features are perturbed with a different level of Gaussian noise. A data example for 10 clients could be:

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/fmnist_feature_skew_vis.png" height="400"></p>

#### 9. FCUBE Synthetic partition

Non-IID partition used in [[1]](#1). A data example for 4 clients could be:

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/fcube_synthetic_part.png" height="600"></p>

### Datasets supported

| Data Type | Dataset | #Training Samples | #Test Samples | #Label Classes |
| --------- | ------- | ----------------- | ------------- | -------------- |
| Vision    | CIFAR10 | 50K | 10K | 10 |
| Vision    | CIFAR100 | 50K | 10K | 100 |
| Vision    | FashionMNIST | 60K | 10K | 10 |
| Vision    | MNIST | 60K | 10K | 10 |
| Vision    | SVHN | 73K | 26K | 10 |
| Vision    | CelebA | 200,288 (total) | - | 2 |
| Vision    | FEMNIST | 805,263 (total) | - | 62 |
| Text      | Shakespeare | 4,226,158 (total) | - | - |
| Text      | Sent140 | 1,600,498 (total) | - | 3 |
| Text      | Reddit | 56,587,343 (total) | - | - |
| Tabular   | [Adult](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#a9a) | 32,561 | 16,281 | 2 |
| Tabular   | [Covtype](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#covtype.binary) | 581,012 (total) | - | 2 |
| Tabular   | [RCV1 binary](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#rcv1.binary) | 20,242 | 677,399 | 2 |
| Synthetic | FCUBE | - | - | 2 |
| Synthetic | LEAF-Synthetic | - | - | - |

Sample counts marked "(total)" are train and test combined.

### Partition Visualization

To visualize the data distribution under a partition, we provide `fedlab.utils.dataset.functional.feddata_scatterplot()` for users' convenience.

Visualization code for a synthetic partition:

```python
import numpy as np
from matplotlib import pyplot as plt
from fedlab.utils.dataset.functional import feddata_scatterplot

sample_num = 15
class_num = 4
clients_num = 3
num_per_client = sample_num // clients_num
labels = np.random.randint(class_num, size=sample_num)  # 15 labels, each in [0, 4)
rand_per = np.random.permutation(sample_num)
# partition the synthetic data across 3 clients
data_indices = {0: rand_per[0:num_per_client],
                1: rand_per[num_per_client:num_per_client * 2],
                2: rand_per[num_per_client * 2:num_per_client * 3]}
title = 'Data Distribution over Clients for Each Class'
fig = feddata_scatterplot(labels.tolist(),
                          data_indices,
                          clients_num,
                          class_num,
                          figsize=(6, 4),
                          max_size=200,
                          title=title)
plt.show()
fig.savefig('imgs/feddata-scatterplot-vis.png')
```
<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/feddata-scatterplot-vis.png" height="300"></p>

Visualization result for CIFAR-10 Dirichlet non-IID with $\alpha=0.6$ on 5 clients:

<p align="center"><img src="./tutorials/Datasets-DataPartitioner-tutorials/imgs/train_vis-noniid-labeldir.png" height="300"></p>
## Performance & Insights

We report the performance of several reproduced federated learning algorithms to demonstrate the correctness of FedLab's simulation, and we describe several insights FedLab can offer for federated learning research. Without loss of generality, the experiments in this section are conducted on partitioned MNIST datasets; the conclusions and observations should remain valid on other datasets and scenarios.

### Federated Optimization on Non-IID Data

We choose $\alpha \in \{0.1, 0.3, 0.5, 0.7\}$ for label-Dirichlet-partitioned MNIST with 100 clients. We run 200 rounds of FedAvg with 5 local epochs of full-batch SGD, learning rate 0.1, and sample ratio 0.1 (10 clients per FL round). The test accuracy over communication rounds is shown below. The results illustrate the central challenge in federated learning: the more heterogeneous the client data (smaller $\alpha$), the slower and less stable the convergence.

<p align="center"><img src="./examples/imgs/non_iid_impacts_on_fedavg.jpg" height="300"></p>
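This experiment can be approximated with the standalone example from the Quick start. A hedged sketch (the script's flags may differ across versions; `--batch_size 600` emulates full-batch local SGD since each of the 100 clients holds roughly 600 MNIST samples):

```
$ cd ./examples/standalone/
$ python standalone.py --total_clients 100 --com_round 200 --sample_ratio 0.1 --batch_size 600 --epochs 5 --lr 0.1
```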
We use the same partitioned MNIST dataset as FedAvg [[4]](#4) to evaluate the correctness of FedLab. The table below gives the number of rounds FedAvg needs to reach 97% test accuracy on MNIST with the 2NN model and E=5, as reported in [[4]](#4) / as measured with FedLab:

| Sample ratio | IID, B=FULL | IID, B=10 | Non-IID, B=FULL | Non-IID, B=10 |
| ------------ | ----------- | --------- | --------------- | ------------- |
| 0.0 | 1455 / 1293 | 316 / 77 | 4278 / 1815 | 3275 / 1056 |
| 0.1 | 1474 / 1230 | 87 / 43  | 1796 / 2778 | 664 / 439   |
| 0.2 | 1658 / 1234 | 77 / 37  | 1528 / 2805 | 619 / 427   |
| 0.5 | -- / 1229   | 75 / 36  | -- / 3034   | 443 / 474   |
| 1.0 | -- / 1284   | 70 / 35  | -- / 3154   | 380 / 507   |

The results are obtained by running the [tutorial](https://github.com/SMILELab-FL/FedLab/blob/master/tutorials/readme_exp.ipynb) with random seed 0.

### Simulation Efficiency

Time cost of 100 rounds (50 clients sampled per round) under different acceleration settings. 1M-10P denotes a simulation on 1 machine with 4 GPUs and 10 processes; 2M-10P denotes 2 machines with 4 GPUs and 10 processes (5 processes per machine).

Hardware platform: Intel(R) Xeon(R) Gold 6240L CPU @ 2.60GHz + 4x Tesla V100.

| Standalone  | Cross-process 1M-10P | Cross-process 2M-10P |
| ----------- | -------------------- | -------------------- |
|  45.6 min   |     2.9 min          |      4.23 min        |

The results are obtained by running the [tutorial](https://github.com/SMILELab-FL/FedLab/blob/master/tutorials/readme_exp.ipynb) and a cross-process example; they illustrate FedLab's simulation efficiency under different simulation modes. The 2-machine cross-process setting can be slower than 1 machine here because of the communication bottleneck.

### Communication Efficiency

We provide several performance baselines for communication-efficient federated learning, including QSGD and top-k sparsification. In this experiment we choose $\alpha = 0.5$ for label-Dirichlet-partitioned MNIST with 100 clients and run 200 rounds of FedAvg with a sample ratio of 0.1 (10 clients per round), where each client performs 5 local epochs of full-batch SGD with learning rate 0.1. We report the top-1 test accuracy and the communication volume over training.

| Setting              | Baseline | QSGD-4bit | QSGD-8bit | QSGD-16bit | Top-5% | Top-10% | Top-20% |
| -------------------- | -------- | --------- | --------- | ---------- | ------ | ------- | ------- |
| Test Accuracy (%)    |  93.14   |  93.03    |  93.27    |  93.11     |  11.35 |  61.25  |  89.96  |
| Communication (MB)   |  302.45  |  45.59    |  85.06    |  160.67    |  0.94  |  1.89   |   3.79  |

The above results are obtained by running the [tutorial](https://github.com/SMILELab-FL/FedLab/blob/master/tutorials/communication_tutorial.ipynb).
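For intuition, QSGD stochastically rounds each coordinate onto a uniform grid with $2^b$ levels so that the quantized tensor stays unbiased. A simplified sketch (our illustration using max-norm scaling; the original QSGD and FedLab's compressor differ in details):

```python
import torch

def qsgd_quantize(tensor, n_bits=8):
    """Unbiased stochastic quantization onto 2**n_bits uniform levels."""
    s = 2 ** n_bits - 1               # number of quantization intervals
    norm = tensor.abs().max()         # scaling factor (simplified: max-norm)
    if norm == 0:
        return tensor.clone()
    level = tensor.abs() / norm * s   # real-valued level in [0, s]
    lower = level.floor()
    # round up with probability equal to the fractional part -> unbiased
    quantized = lower + torch.bernoulli(level - lower)
    return tensor.sign() * quantized / s * norm

g = torch.randn(1000)
print((g - qsgd_quantize(g, n_bits=4)).abs().max())  # error shrinks as bits grow
```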
## Citation

Please cite __FedLab__ in your publications if it helps your research:

```bibtex
@article{JMLR:v24:22-0440,
  author  = {Dun Zeng and Siqi Liang and Xiangjing Hu and Hui Wang and Zenglin Xu},
  title   = {FedLab: A Flexible Federated Learning Framework},
  journal = {Journal of Machine Learning Research},
  year    = {2023},
  volume  = {24},
  number  = {100},
  pages   = {1--7},
  url     = {http://jmlr.org/papers/v24/22-0440.html}
}
```
or
```bibtex
@article{zeng2021fedlab,
  title={FedLab: A flexible federated learning framework},
  author={Zeng, Dun and Liang, Siqi and Hu, Xiangjing and Wang, Hui and Xu, Zenglin},
  journal={arXiv preprint arXiv:2107.11621},
  year={2021}
}
```

## Contact

Project Investigator: [Prof. Zenglin Xu](https://scholar.google.com/citations?user=gF0H9nEAAAAJ&hl=en) (xuzenglin@hit.edu.cn).

For technical issues related to __FedLab__ development, please contact our development team through GitHub issues or email:

- [Dun Zeng](https://scholar.google.com/citations?user=CuNFd3EAAAAJ&hl=en): zengdun@foxmail.com
- [Siqi Liang](https://scholar.google.com/citations?user=LIjv5BsAAAAJ&hl=en): zszxlsq@gmail.com

## References

<a id="1">[1]</a> Li, Q., Diao, Y., Chen, Q., & He, B. (2022). Federated learning on non-IID data silos: An experimental study. In *2022 IEEE 38th International Conference on Data Engineering (ICDE)* (pp. 965-978). IEEE.

<a id="2">[2]</a> Caldas, S., Duddu, S. M. K., Wu, P., Li, T., Konečný, J., McMahan, H. B., ... & Talwalkar, A. (2018). LEAF: A benchmark for federated settings. *arXiv preprint arXiv:1812.01097*.

<a id="3">[3]</a> Yurochkin, M., Agarwal, M., Ghosh, S., Greenewald, K., Hoang, N., & Khazaeni, Y. (2019). Bayesian nonparametric federated learning of neural networks. In *International Conference on Machine Learning* (pp. 7252-7261). PMLR.

<a id="4">[4]</a> McMahan, B., Moore, E., Ramage, D., Hampson, S., & y Arcas, B. A. (2017). Communication-efficient learning of deep networks from decentralized data. In *Artificial Intelligence and Statistics* (pp. 1273-1282). PMLR.

<a id="5">[5]</a> Acar, D. A. E., Zhao, Y., Navarro, R. M., Mattina, M., Whatmough, P. N., & Saligrama, V. (2021). Federated learning based on dynamic regularization. *arXiv preprint arXiv:2111.04263*.

<a id="6">[6]</a> Wang, H., Yurochkin, M., Sun, Y., Papailiopoulos, D., & Khazaeni, Y. (2020). Federated learning with matched averaging. *arXiv preprint arXiv:2002.06440*.