{"id":13689858,"url":"https://github.com/PKU-Chengxu/FLASH","last_synced_at":"2025-05-02T06:31:33.641Z","repository":{"id":101385566,"uuid":"337952931","full_name":"PKU-Chengxu/FLASH","owner":"PKU-Chengxu","description":null,"archived":false,"fork":false,"pushed_at":"2022-06-02T03:10:05.000Z","size":5762,"stargazers_count":43,"open_issues_count":0,"forks_count":6,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-11-12T15:43:12.994Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PKU-Chengxu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-02-11T06:58:03.000Z","updated_at":"2024-10-20T14:38:23.000Z","dependencies_parsed_at":null,"dependency_job_id":"ae3adf69-3812-4e75-9272-dd5668ccc025","html_url":"https://github.com/PKU-Chengxu/FLASH","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Chengxu%2FFLASH","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Chengxu%2FFLASH/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Chengxu%2FFLASH/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Chengxu%2FFLASH/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PKU-Chengxu","download_url":"https://codeload.github.com/PKU-Chengxu/FLASH/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github"
,"repositories_count":251998467,"owners_count":21677987,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T16:00:28.605Z","updated_at":"2025-05-02T06:31:29.236Z","avatar_url":"https://github.com/PKU-Chengxu.png","language":"Python","funding_links":[],"categories":["Real-world device traces"],"sub_categories":["Energy-efficiency"],"readme":"# FLASH\n\n- An Open Source *Heterogeneity-Aware* Federated Learning Platform\n- This repository is based on a fork of [Leaf](https://leaf.cmu.edu/), a benchmark for federated settings.\n\nThis repository contains the code and experiments for the paper:\n\n\u003e  [WWW'21](https://www2021.thewebconf.org/)\n\u003e\n\u003e [Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data]()\n\n## What is FLASH?\n\nBriefly speaking, we develop FLASH to incorporate **heterogeneity** into the federated learning simulation process. We mainly follow Google's [FL protocol](https://arxiv.org/pdf/1902.01046.pdf) to implement FLASH, so compared to other platforms, we add many additional system configurations, e.g., a round deadline. For details on these configurations, see the [config file](#config).\n\n### Heterogeneity\n\n**Hardware Heterogeneity**: Each client is bundled with a device type. Each device type has its own training speed and network speed. We also support a self-defined device type (-1) whose parameters can be set manually for more complex simulations. 
\n\nThe source code for measuring the on-device training time is available in the [OnDeviceTraining](./OnDeviceTraining) directory.\n\n**State (Behavior) Heterogeneity**: The state and running environment of participating clients vary across devices and change over time. We follow [Google's FL system](https://arxiv.org/pdf/1902.01046.pdf), i.e., clients are available for training only when the device is idle, charging, and connected to WiFi. To simulate state heterogeneity, we provide a default state trace which can be accessed [here](./data/state_traces.json). This default trace is sampled from the large-scale real-world trace (the one we use in our paper) that involves up to 136k devices.\n\nNote: FLASH will run in a heterogeneity-unaware (ideal) mode if the trace file is not found or `hard_hete` and `behav_hete` are set to `False`.\n\n\n\n## How to run it\n\n### example\n\n```bash\n# 1. Clone and install requirements\ngit clone https://github.com/PKU-Chengxu/FLASH.git\ncd FLASH\npip3 install -r requirements.txt\n\n# 2. Change state traces (optional)\n# We provide a default state trace containing 1000 devices' data, located in the ./data/ dir. \n# If you want to use a self-collected trace, just modify the file path in [models/client.py](models/client.py), i.e. with open('/path/to/state_traces.json', 'r', encoding='utf-8') as f: \n\n# 3. Download a benchmark dataset; go to the directory of the respective dataset `data/$DATASET` for instructions on generating the benchmark dataset\n\n# 4. Run\ncd models/\npython3 main.py [--config yourconfig.cfg]\n# use the --config option to specify the config file; default.cfg will be used if not specified\n# the output log is CONFIG_FILENAME.log\n```\n\n\u003ch3 id=\"config\"\u003eConfig File\u003c/h3\u003e\nTo simplify the command line arguments, we move most of the parameters to a \u003cspan id=\"jump\"\u003econfig file\u003c/span\u003e. 
Below is a detailed example.\n\n```bash\n## whether to consider heterogeneity\nbehav_hete False # bool, whether to simulate state (behavior) heterogeneity\nhard_hete False # bool, whether to simulate hardware heterogeneity, which covers differences in on-device training time and network speed\n\n\n## no training mode to tune system configurations\nno_training False # bool, whether to run in no_training mode, skip training process if True\n\n\n## ML related configurations\ndataset femnist # dataset to use\nmodel cnn # file that defines the DNN model\nlearning_rate 0.01 # learning-rate of DNN\nbatch_size 10 # batch-size for training \n\n\n## system configurations, refer to https://arxiv.org/abs/1812.02903 for more details\nnum_rounds 500 # number of FL rounds to run\nclients_per_round 100 # expected clients in each round\nmin_selected 60 # min number of selected clients in each round, fail if not satisfied\nmax_sample 340 #  max number of samples to use in each selected client\neval_every 5 # evaluate every N rounds (here N = 5), -1 to disable evaluation\nnum_epochs 5 # number of training epochs (E) for each client in each round\nseed 0 # basic random seed\nround_ddl 270 0 # μ and σ for deadline, which follows a normal distribution\nupdate_frac 0.8  # min update fraction in each round, round fails when the fraction of clients that successfully upload their updates is less than \"update_frac\"\nmax_client_num -1 # max number of clients in the simulation process, -1 for infinite\n\n\n### ----- NOTE! below are advanced configurations. \n### ----- Strongly recommend: specify these configurations only after reading the source code. \n### ----- Configuration items of [aggregate_algorithm, fedprox*, structure_k, qffl*] are mutually-exclusive \n\n## basic algorithm\naggregate_algorithm SucFedAvg # choose in [SucFedAvg, FedAvg], please refer to models/server.py for more details. 
In the configuration file, SucFedAvg refers to the \"FedAvg\" algorithm described in https://arxiv.org/pdf/1602.05629.pdf\n\n## compression algorithm\n# compress_algo grad_drop # gradient compression algorithm, choose in [grad_drop, sign_sgd], not used if commented out\n# structure_k 100\n## the k for structured update, not used if commented out, please refer to the arxiv for more \n\n## advanced aggregation algorithms\n# fedprox True # whether to apply fedprox and params needed, please refer to the SysML'20 paper (https://arxiv.org/pdf/1812.06127.pdf) for more details\n# fedprox_mu 0.5\n# fedprox_active_frac 0.8\n\n# qffl True # whether to apply qffl (q-fedavg) and params needed, please refer to the ICLR'20 paper (https://arxiv.org/pdf/1905.10497.pdf) for more details\n# qffl_q 5\n```\n\n\n## Benchmark Datasets\n\n#### FEMNIST\n\n- **Overview:** Image Dataset\n- **Details:** 62 different classes (10 digits, 26 lowercase, 26 uppercase), images are 28 by 28 pixels (with option to make them all 128 by 128 pixels), 3500 users\n- **Task:** Image Classification\n\n\n\n#### CelebA\n\n- **Overview:** Image Dataset based on the [Large-scale CelebFaces Attributes Dataset](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html)\n- **Details:** 9343 users (we exclude celebrities with fewer than 5 images)\n- **Task:** Image Classification (Smiling vs. Not smiling)\n\n\n\n#### Reddit\n\n- **Overview:** We preprocess the Reddit data released by [pushshift.io](https://files.pushshift.io/reddit/) corresponding to December 2017.\n- **Details:** 1,660,820 users with a total of 56,587,343 comments. \n- **Task:** Next-word Prediction.\n\n\n\n## Results in the paper\n\nConfig files and results are in the `paper_experiments` folder. You can just modify `models/default.cfg` and then run `python main.py` to reproduce all the experiments in our paper. 
The experiments can be divided into the following categories:\n\n- Basic FL algorithm\n- Advanced FL algorithms\n- Breakdown of Heterogeneity\n- Device Failure\n- Participation Bias\n\nUpdate 06/02/2022:\n\nWe provide a shell script to reproduce all the paper experiments. The script is `\u003cFLASH\u003e/paper_experiments/run.sh`. You can run it as follows:\n\n```shell\ncd paper_experiments\nchmod +x run.sh\n./run.sh [CONFIG_FILE]\n```\n\nThis script uses `models/default.cfg` by default, and you can specify the config file to use. `CONFIG_FILE` is a path relative to the `paper_experiments` directory. All of the experiment config files we use are in the `paper_experiments` folder.\n\n## On-device Training\n\nThe code we used to measure the on-device training time is in the `OnDeviceTraining` folder. Please refer to the [doc](OnDeviceTraining/README.md) for more details.\n\n\n\n## Notes\n\n- Please consider citing our paper if you use the code or data in your research project.\n\n\n\u003e ```\n\u003e @inproceedings{yang2019characterizing,\n\u003e   title={Characterizing impacts of heterogeneity in federated learning upon large-scale smartphone data},\n\u003e   author={Yang, Chengxu and Wang, Qipeng and Xu, Mengwei and Chen, Zhenpeng and Bian, Kaigui and Liu, Yunxin and Liu, Xuanzhe},\n\u003e   booktitle={The World Wide Web Conference},\n\u003e   year={2021}\n\u003e }\n\u003e ```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPKU-Chengxu%2FFLASH","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FPKU-Chengxu%2FFLASH","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPKU-Chengxu%2FFLASH/lists"}