{"id":13644232,"url":"https://github.com/conceptofmind/lamda-rlhf-pytorch","last_synced_at":"2025-04-21T07:30:31.443Z","repository":{"id":37621471,"uuid":"505870568","full_name":"conceptofmind/LaMDA-rlhf-pytorch","owner":"conceptofmind","description":"Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.","archived":true,"fork":false,"pushed_at":"2024-02-24T16:22:47.000Z","size":148,"stargazers_count":472,"open_issues_count":7,"forks_count":75,"subscribers_count":22,"default_branch":"main","last_synced_at":"2025-04-14T04:08:26.436Z","etag":null,"topics":["artificial-intelligence","attention-mechanism","deep-learning","human-feedback","machine-learning","reinforcement-learning","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/conceptofmind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["lucidrains","conceptofmind"],"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null,"custom":null}},"created_at":"2022-06-21T14:08:46.000Z","updated_at":"2025-04-01T16:11:58.000Z","dependencies_parsed_at":"2024-02-24T17:30:24.860Z","dependency_job_id":"d4bfa75f-bf49-4c79-9e24-27073fa419ae","html_url":"https://github.com/conceptofmind/LaMDA-rlhf-pytorch","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/conceptofmind%2FLaMDA-rlhf-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/conceptofmind%2FLaMDA-rlhf-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/conceptofmind%2FLaMDA-rlhf-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/conceptofmind%2FLaMDA-rlhf-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/conceptofmind","download_url":"https://codeload.github.com/conceptofmind/LaMDA-rlhf-pytorch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250014534,"owners_count":21360969,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","attention-mechanism","deep-learning","human-feedback","machine-learning","reinforcement-learning","transformers"],"created_at":"2024-08-02T01:01:59.479Z","updated_at":"2025-04-21T07:30:31.096Z","avatar_url":"https://github.com/conceptofmind.png","language":"Python","funding_links":["https://github.com/sponsors/lucidrains","https://github.com/sponsors/conceptofmind"],"categories":["Reimplementations"],"sub_categories":[],"readme":"\u003cimg src=\"./lamda.png\" width=\"600px\"\u003e\u003c/img\u003e\n\n## LaMDA-pytorch\nOpen-source pre-training implementation of Google's [LaMDA research paper](https://arxiv.org/abs/2201.08239) in PyTorch. The totally not sentient AI. This repository will cover the 2B parameter implementation of the pre-training architecture as that is likely what most can afford to train. You can review Google's latest blog post from 2022 which details LaMDA [here](https://ai.googleblog.com/2022/01/lamda-towards-safe-grounded-and-high.html). You can also view their previous blog post from 2021 on the model [here](https://blog.google/technology/ai/lamda/).\n\n## Acknowledgement:\nI have been greatly inspired by the work of [Dr. Phil 'Lucid' Wang](https://github.com/lucidrains). Please check out his [open-source implementations](https://github.com/lucidrains) of multiple different transformer architectures and [support](https://github.com/sponsors/lucidrains) his work.\n\n## Developer Updates\nDeveloper updates can be found on: \n- https://twitter.com/EnricoShippole\n- https://www.linkedin.com/in/enrico-shippole-495521b8/\n\n## Basic Usage - Pre-training\n```python\nlamda_base = LaMDA(\n    num_tokens = 20000,\n    dim = 512,\n    dim_head = 64,\n    depth = 12,\n    heads = 8\n)\n\nlamda = AutoregressiveWrapper(lamda_base, max_seq_len = 512)\n\ntokens = torch.randint(0, 20000, (1, 512)) # mock token data\n\nlogits = lamda(tokens)\n\nprint(logits)\n```\n\n## Notes on training at scale:\n- [Pipeline parallelism should be used with ZeRO 1, not ZeRO 2.](https://github.com/microsoft/DeepSpeed/discussions/1911)\n\n## About LaMDA:\n- T5 Relative Positional Bias in Attention\n- Gated GELU Activation in the Feed forward layer\n- GPT-like Decoder Only architecture\n- Autoregressive with Top-k sampling\n- Sentencepiece Byte-pair encoded tokenizer\n\n## TODO:\n- [x] Finish building pre-training model architecture\n- [x] Add pre-training script\n- [x] Integrate [Huggingface datasets](https://huggingface.co/docs/datasets/index)\n- [x] Implement GPT-2 tokenizer\n- [ ] Add Sentencepiece tokenizer training script and integration\n- [ ] Add detailed documentation\n- [x] Add logging with [Weights And Biases](https://wandb.ai/site)\n- [x] Add scaling with ColossalAI.\n- [ ] Add finetuning script\n- [ ] Add pip installer with PyPI\n- [ ] Add inference only if someone wants to open-source LaMDA model weights\n\n## Author\n- Enrico Shippole\n\n## Citations\n```bibtex\n@article{DBLP:journals/corr/abs-2201-08239,\n  author    = {Romal Thoppilan and\n               Daniel De Freitas and\n               Jamie Hall and\n               Noam Shazeer and\n               Apoorv Kulshreshtha and\n               Heng{-}Tze Cheng and\n               Alicia Jin and\n               Taylor Bos and\n               Leslie Baker and\n               Yu Du and\n               YaGuang Li and\n               Hongrae Lee and\n               Huaixiu Steven Zheng and\n               Amin Ghafouri and\n               Marcelo Menegali and\n               Yanping Huang and\n               Maxim Krikun and\n               Dmitry Lepikhin and\n               James Qin and\n               Dehao Chen and\n               Yuanzhong Xu and\n               Zhifeng Chen and\n               Adam Roberts and\n               Maarten Bosma and\n               Yanqi Zhou and\n               Chung{-}Ching Chang and\n               Igor Krivokon and\n               Will Rusch and\n               Marc Pickett and\n               Kathleen S. Meier{-}Hellstern and\n               Meredith Ringel Morris and\n               Tulsee Doshi and\n               Renelito Delos Santos and\n               Toju Duke and\n               Johnny Soraker and\n               Ben Zevenbergen and\n               Vinodkumar Prabhakaran and\n               Mark Diaz and\n               Ben Hutchinson and\n               Kristen Olson and\n               Alejandra Molina and\n               Erin Hoffman{-}John and\n               Josh Lee and\n               Lora Aroyo and\n               Ravi Rajakumar and\n               Alena Butryna and\n               Matthew Lamm and\n               Viktoriya Kuzmina and\n               Joe Fenton and\n               Aaron Cohen and\n               Rachel Bernstein and\n               Ray Kurzweil and\n               Blaise Aguera{-}Arcas and\n               Claire Cui and\n               Marian Croak and\n               Ed H. Chi and\n               Quoc Le},\n  title     = {LaMDA: Language Models for Dialog Applications},\n  journal   = {CoRR},\n  volume    = {abs/2201.08239},\n  year      = {2022},\n  url       = {https://arxiv.org/abs/2201.08239},\n  eprinttype = {arXiv},\n  eprint    = {2201.08239},\n  timestamp = {Fri, 22 Apr 2022 16:06:31 +0200},\n  biburl    = {https://dblp.org/rec/journals/corr/abs-2201-08239.bib},\n  bibsource = {dblp computer science bibliography, https://dblp.org}\n}\n```\n```bibtex\n@misc{https://doi.org/10.48550/arxiv.1706.03762,\n  doi = {10.48550/ARXIV.1706.03762},\n  \n  url = {https://arxiv.org/abs/1706.03762},\n  \n  author = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia},\n  \n  keywords = {Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences, FOS: Computer and information sciences},\n  \n  title = {Attention Is All You Need},\n  \n  publisher = {arXiv},\n  \n  year = {2017},\n  \n  copyright = {arXiv.org perpetual, non-exclusive license}\n}\n```\n```bibtex\n@misc{https://doi.org/10.48550/arxiv.1910.10683,\n  doi = {10.48550/ARXIV.1910.10683},\n  \n  url = {https://arxiv.org/abs/1910.10683},\n  \n  author = {Raffel, Colin and Shazeer, Noam and Roberts, Adam and Lee, Katherine and Narang, Sharan and Matena, Michael and Zhou, Yanqi and Li, Wei and Liu, Peter J.},\n  \n  keywords = {Machine Learning (cs.LG), Computation and Language (cs.CL), Machine Learning (stat.ML), FOS: Computer and information sciences, FOS: Computer and information sciences},\n  \n  title = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer},\n  \n  publisher = {arXiv},\n  \n  year = {2019},\n  \n  copyright = {arXiv.org perpetual, non-exclusive license}\n}\n```\n```bibtex\n@misc{https://doi.org/10.48550/arxiv.2002.05202,\n  doi = {10.48550/ARXIV.2002.05202},\n  \n  url = {https://arxiv.org/abs/2002.05202},\n  \n  author = {Shazeer, Noam},\n  \n  keywords = {Machine Learning (cs.LG), Neural and Evolutionary Computing (cs.NE), Machine Learning (stat.ML), FOS: Computer and information sciences, FOS: Computer and information sciences},\n  \n  title = {GLU Variants Improve Transformer},\n  \n  publisher = {arXiv},\n  \n  year = {2020},\n  \n  copyright = {arXiv.org perpetual, non-exclusive license}\n}\n```\n```bibtex\n@article{DBLP:journals/corr/abs-2101-00027,\n  author    = {Leo Gao and\n               Stella Biderman and\n               Sid Black and\n               Laurence Golding and\n               Travis Hoppe and\n               Charles Foster and\n               Jason Phang and\n               Horace He and\n               Anish Thite and\n               Noa Nabeshima and\n               Shawn Presser and\n               Connor Leahy},\n  title     = {The Pile: An 800GB Dataset of Diverse Text for Language Modeling},\n  journal   = {CoRR},\n  volume    = {abs/2101.00027},\n  year      = {2021},\n  url       = {https://arxiv.org/abs/2101.00027},\n  eprinttype = {arXiv},\n  eprint    = {2101.00027},\n  timestamp = {Thu, 14 Oct 2021 09:16:12 +0200},\n  biburl    = {https://dblp.org/rec/journals/corr/abs-2101-00027.bib},\n  bibsource = {dblp computer science bibliography, https://dblp.org}\n}\n```\n```bibtex\n@article{DBLP:journals/corr/abs-1808-06226,\n  author    = {Taku Kudo and\n               John Richardson},\n  title     = {SentencePiece: {A} simple and language independent subword tokenizer\n               and detokenizer for Neural Text Processing},\n  journal   = {CoRR},\n  volume    = {abs/1808.06226},\n  year      = {2018},\n  url       = {http://arxiv.org/abs/1808.06226},\n  eprinttype = {arXiv},\n  eprint    = {1808.06226},\n  timestamp = {Sun, 02 Sep 2018 15:01:56 +0200},\n  biburl    = {https://dblp.org/rec/journals/corr/abs-1808-06226.bib},\n  bibsource = {dblp computer science bibliography, https://dblp.org}\n}\n```\n```bibtex\n@inproceedings{sennrich-etal-2016-neural,\n    title = \"Neural Machine Translation of Rare Words with Subword Units\",\n    author = \"Sennrich, Rico  and\n      Haddow, Barry  and\n      Birch, Alexandra\",\n    booktitle = \"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)\",\n    month = aug,\n    year = \"2016\",\n    address = \"Berlin, Germany\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/P16-1162\",\n    doi = \"10.18653/v1/P16-1162\",\n    pages = \"1715--1725\",\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fconceptofmind%2Flamda-rlhf-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fconceptofmind%2Flamda-rlhf-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fconceptofmind%2Flamda-rlhf-pytorch/lists"}