{"id":13611909,"url":"https://github.com/ParCIS/Chimera","last_synced_at":"2025-04-13T09:30:31.075Z","repository":{"id":118651500,"uuid":"372297124","full_name":"ParCIS/Chimera","owner":"ParCIS","description":"Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines. ","archived":false,"fork":false,"pushed_at":"2023-12-05T09:54:14.000Z","size":897,"stargazers_count":52,"open_issues_count":4,"forks_count":7,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-21T11:29:18.943Z","etag":null,"topics":["distributed-deep-learning","pipeline-parallelism","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ParCIS.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-05-30T19:18:24.000Z","updated_at":"2025-01-21T10:00:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"8d9f0904-ea14-40b8-a72e-ae34401c000c","html_url":"https://github.com/ParCIS/Chimera","commit_stats":null,"previous_names":["parcis/chimera"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParCIS%2FChimera","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParCIS%2FChimera/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParCIS%2FChimera/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParCIS%2FChimera/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ParCIS","download_url":"https://codeload
.github.com/ParCIS/Chimera/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248690704,"owners_count":21146191,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed-deep-learning","pipeline-parallelism","transformers"],"created_at":"2024-08-01T20:00:16.065Z","updated_at":"2025-04-13T09:30:31.068Z","avatar_url":"https://github.com/ParCIS.png","language":"Python","funding_links":[],"categories":["Data Parallelism + Pipeline Parallelism (or Inter-layer Model Parallelism):"],"sub_categories":[],"readme":"## Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines\nBidirectional pipeline parallelism **Chimera** is published in **SC'21, Best Paper Finalist**. See the [paper](https://dl.acm.org/doi/abs/10.1145/3458817.3476145) and the [video talk](https://dl.acm.org/doi/abs/10.1145/3458817.3476145#sec-supp) for more details.\n\n![Chimera](ChimeraThumbnail.png)\n\n### Data preparation\nhttps://github.com/microsoft/AzureML-BERT/blob/master/docs/dataprep.md\n\nPlease store the `wikipedia.segmented.nltk.txt` file under the `bert_data/` directory.\n\n### Installation\n```\npip install -r requirements.txt\n```\nFor training, we use `apex.optimizers.FusedLAMB` from [NVIDIA's Apex library](https://github.com/NVIDIA/apex). Please follow the [instructions](https://github.com/NVIDIA/apex#installation) for installing `apex`. \n\nFor profiling, we use [NVIDIA Nsight Systems](https://developer.nvidia.com/nsight-systems). 
Please make sure you can execute the `nsys` command.\n\nOur scripts are intended to run through the SLURM workload manager on a GPU cluster with 1 GPU per node.\n\n### Profiling **Chimera** with 8 stages for BERT-Large on 8 GPUs \n```\nsbatch scripts/prof_steps.sh\n```\n```\nsh scripts/plot_cuda_timeline.sh\n```\noutput: `bert_prof/bert-large_chimera_8stages_8gpus_microbs32_acc1.pdf`\n\n\n\n### Publication\n\nTo cite our work:\n```bibtex\n@inproceedings{li143,\n  author = {Li, Shigang and Hoefler, Torsten},\n  title = {Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines},\n  year = {2021},\n  isbn = {9781450384421},\n  publisher = {Association for Computing Machinery},\n  address = {New York, NY, USA},\n  url = {https://doi.org/10.1145/3458817.3476145},\n  doi = {10.1145/3458817.3476145},\n  booktitle = {Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis},\n  articleno = {27},\n  numpages = {14},\n  location = {St. Louis, Missouri},\n  series = {SC '21}\n}\n\n```\n\n### License\n\nSee [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FParCIS%2FChimera","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FParCIS%2FChimera","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FParCIS%2FChimera/lists"}