{"id":14964547,"url":"https://github.com/young-geng/easylm","last_synced_at":"2025-05-15T03:05:55.423Z","repository":{"id":77298556,"uuid":"569265598","full_name":"young-geng/EasyLM","owner":"young-geng","description":"Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.","archived":false,"fork":false,"pushed_at":"2024-08-13T05:55:05.000Z","size":387,"stargazers_count":2475,"open_issues_count":30,"forks_count":259,"subscribers_count":40,"default_branch":"main","last_synced_at":"2025-05-11T00:31:51.221Z","etag":null,"topics":["chatbot","deep-learning","flax","jax","language-model","large-language-models","llama","natural-language-processing","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/young-geng.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-22T12:55:20.000Z","updated_at":"2025-05-04T01:08:52.000Z","dependencies_parsed_at":"2024-01-13T19:52:25.805Z","dependency_job_id":"81eae2f4-921b-4c49-a838-e75b7671a071","html_url":"https://github.com/young-geng/EasyLM","commit_stats":{"total_commits":216,"total_committers":11,"mean_commits":"19.636363636363637","dds":0.125,"last_synced_commit":"fe5b2c354e25d697fce7cd225e23bbbe72570da3"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/young-geng%2FEasyLM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/young-geng%2FEasyLM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/young-geng%2FEasyLM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/young-geng%2FEasyLM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/young-geng","download_url":"https://codeload.github.com/young-geng/EasyLM/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254264765,"owners_count":22041793,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatbot","deep-learning","flax","jax","language-model","large-language-models","llama","natural-language-processing","transformer"],"created_at":"2024-09-24T13:33:21.376Z","updated_at":"2025-05-15T03:05:55.401Z","avatar_url":"https://github.com/young-geng.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# EasyLM\nLarge language models (LLMs) made easy. EasyLM is a one-stop solution for\npre-training, finetuning, evaluating and serving LLMs in JAX/Flax. EasyLM can\nscale up LLM training to hundreds of TPU/GPU accelerators by leveraging\nJAX's pjit functionality.\n\n\nBuilding on top of Hugging Face's [transformers](https://huggingface.co/docs/transformers/main/en/index)\nand [datasets](https://huggingface.co/docs/datasets/index), this repo provides\nan easy-to-use and easy-to-customize codebase for training large language models\nwithout the complexity of many other frameworks.\n\n\nEasyLM is built with JAX/Flax. 
By leveraging JAX's pjit utility, EasyLM is able\nto train large models that don't fit on a single accelerator by sharding\nthe model weights and training data across multiple accelerators. Currently,\nEasyLM supports multi-TPU/GPU training on a single host as well as multi-host\ntraining on Google Cloud TPU Pods.\n\nThe following models are currently supported:\n* [LLaMA](https://arxiv.org/abs/2302.13971)\n* [LLaMA 2](https://arxiv.org/abs/2307.09288)\n* [LLaMA 3](https://llama.meta.com/llama3/)\n\n## Discord Server\nWe are running an unofficial Discord community (unaffiliated with Google) for discussion related to training LLMs in JAX. [Follow this link to join the Discord server](https://discord.gg/Rf4drG3Bhp). We have dedicated channels for several JAX-based LLM frameworks, including EasyLM, [JAXSeq](https://github.com/Sea-Snell/JAXSeq), [Alpa](https://github.com/alpa-projects/alpa) and [Levanter](https://github.com/stanford-crfm/levanter).\n\n\n## Models Trained with EasyLM\n### OpenLLaMA\nOpenLLaMA is our permissively licensed reproduction of LLaMA, which can be used\nfor commercial purposes. Check out the [project main page here](https://github.com/openlm-research/open_llama).\nThe OpenLLaMA weights can serve as a drop-in replacement for the LLaMA weights in EasyLM.\nPlease refer to the [LLaMA documentation](docs/llama.md) for more details.\n\n\n### Koala\nKoala is our new chatbot fine-tuned on top of LLaMA. If you are interested in\nour Koala chatbot, you can check out the [blog post](https://bair.berkeley.edu/blog/2023/04/03/koala/)\nand [documentation for running it locally](docs/koala.md).\n\n\n## Installation\nThe installation method differs between GPU hosts and Cloud TPU hosts. 
The first\nstep is to clone the repository from GitHub.\n\n``` shell\ngit clone https://github.com/young-geng/EasyLM.git\ncd EasyLM\nexport PYTHONPATH=\"${PWD}:$PYTHONPATH\"\n```\n\n#### Installing on GPU Host\nThe GPU environment can be installed via [Anaconda](https://www.anaconda.com/products/distribution).\n\n``` shell\nconda env create -f scripts/gpu_environment.yml\nconda activate EasyLM\n```\n\n#### Installing on Cloud TPU Host\nThe TPU host VM comes with Python and pip pre-installed. Simply run the following\nscript to set up the TPU host.\n\n``` shell\n./scripts/tpu_vm_setup.sh\n```\n\n\n## [Documentation](docs/README.md)\nThe EasyLM documentation can be found in the [docs](docs/) directory.\n\n\n## Reference\nIf you find EasyLM useful in your research or applications, please cite it using the following BibTeX:\n```\n@software{geng2023easylm,\n  author = {Geng, Xinyang},\n  title = {EasyLM: A Simple And Scalable Training Framework for Large Language Models},\n  month = March,\n  year = 2023,\n  url = {https://github.com/young-geng/EasyLM}\n}\n```\n\n\n\n## Credits\n* The LLaMA implementation is from [JAX_llama](https://github.com/Sea-Snell/JAX_llama)\n* The JAX/Flax GPT-J and RoBERTa implementations are from [transformers](https://huggingface.co/docs/transformers/main/en/index)\n* Most of the JAX utilities are from [mlxu](https://github.com/young-geng/mlxu)\n* The codebase is heavily inspired by [JAXSeq](https://github.com/Sea-Snell/JAXSeq)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyoung-geng%2Feasylm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyoung-geng%2Feasylm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyoung-geng%2Feasylm/lists"}