{"id":13455976,"url":"https://github.com/ridgerchu/matmulfreellm","last_synced_at":"2025-05-14T02:05:28.739Z","repository":{"id":242738999,"uuid":"790957648","full_name":"ridgerchu/matmulfreellm","owner":"ridgerchu","description":"Implementation for MatMul-free LM.","archived":false,"fork":false,"pushed_at":"2024-11-05T23:38:38.000Z","size":1590,"stargazers_count":2996,"open_issues_count":24,"forks_count":186,"subscribers_count":46,"default_branch":"master","last_synced_at":"2025-05-07T01:09:11.049Z","etag":null,"topics":["large-language-model","linear-transformer","llm"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ridgerchu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-23T20:44:31.000Z","updated_at":"2025-05-04T15:30:24.000Z","dependencies_parsed_at":"2024-09-16T05:02:41.448Z","dependency_job_id":"0d890750-7069-4c8d-a848-267547c042e7","html_url":"https://github.com/ridgerchu/matmulfreellm","commit_stats":null,"previous_names":["ridgerchu/matmulfreellm"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ridgerchu%2Fmatmulfreellm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ridgerchu%2Fmatmulfreellm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ridgerchu%2Fmatmulfreellm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ridgerchu%2Fmatmulfreellm/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ridgerchu","download_url":"https://codeload.github.com/ridgerchu/matmulfreellm/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254052692,"owners_count":22006716,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["large-language-model","linear-transformer","llm"],"created_at":"2024-07-31T08:01:14.300Z","updated_at":"2025-05-14T02:05:23.726Z","avatar_url":"https://github.com/ridgerchu.png","language":"Python","funding_links":[],"categories":["llm","A01_文本生成_文本对话","Python"],"sub_categories":["大语言对话模型及数据"],"readme":"\u003cdiv align=center\u003e\n\u003cimg src=\"__assets__/logo.png\" width=\"200px\"\u003e\n\u003c/div\u003e\n\u003ch2 align=\"center\"\u003eMatMul-Free LM\u003c/h2\u003e\n\u003ch5 align=\"center\"\u003e If you like our project, please give us a star ⭐ on GitHub for the latest updates.  \u003c/h2\u003e\n\u003ch5 align=\"center\"\u003e This repo is adapted from \u003ca href=\"https://github.com/sustcsonglin/flash-linear-attention\"\u003eflash-linear-attention\u003c/a\u003e. 
\u003c/h2\u003e\n\n\u003ch5 align=\"center\"\u003e\n\n[![hf_model](https://img.shields.io/badge/🤗-Models-blue.svg)](https://huggingface.co/collections/ridger/matmulfree-lm-665f4d2b4e4648756e0dd13c) [![arXiv](https://img.shields.io/badge/Arxiv-2406.02528-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2406.02528) \n# Introduction\n\u003cdiv align=center\u003e\n\u003cimg src=\"__assets__/main.png\"\u003e\n\u003c/div\u003e\nMatMul-Free LM is a language model architecture that eliminates the need for Matrix Multiplication (MatMul) operations. This repository provides an implementation of MatMul-Free LM that is compatible with the 🤗 Transformers library.\n\n# Scaling Law\n\u003cdiv align=center\u003e\n\u003cimg src=\"__assets__/scaling_law.png\"\u003e\n\u003c/div\u003e\nWe evaluate how the scaling law fits to the 370M, 1.3B and 2.7B parameter models in both Transformer++ and our model. For a fair comparison, each operation is treated identically, though our model uses more efficient ternary weights in some layers. 
Interestingly, the scaling projection for our model descends more steeply than that of Transformer++, suggesting that our architecture uses additional compute more efficiently to improve performance.\n\n# Installation\n\nThe following requirements should be satisfied:\n- [PyTorch](https://pytorch.org/) \u003e= 2.0\n- [Triton](https://github.com/openai/triton) \u003e= 2.2\n- [einops](https://einops.rocks/)\n\n```sh\npip install -U git+https://github.com/ridgerchu/matmulfreellm\n```\n\n# Usage\n## Pre-trained Model Zoo\n| Model Size     | Layers | Hidden dimension  | Trained tokens |\n|:----------------|:------------:|:----------------:|:------------------:|\n| [370M](https://huggingface.co/ridger/MMfreeLM-370M)  | 24  | 1024 | 15B  |\n| [1.3B](https://huggingface.co/ridger/MMfreeLM-1.3B)  | 24 | 2048 | 100B  |\n| [2.7B](https://huggingface.co/ridger/MMfreeLM-2.7B)  | 32  | 2560 | 100B  |\n\n## Model\n\nWe provide implementations of models that are compatible with the 🤗 Transformers library. 
\nThe library is Hugging Face-compatible, so a model can be initialized from the default config in `mmfreelm` with `AutoModel`:\n\n\n```py\n\u003e\u003e\u003e from mmfreelm.models import HGRNBitConfig\n\u003e\u003e\u003e from transformers import AutoModel\n\u003e\u003e\u003e config = HGRNBitConfig()\n\u003e\u003e\u003e AutoModel.from_config(config)\nHGRNBitModel(\n  (embeddings): Embedding(32000, 2048)\n  (layers): ModuleList(\n    (0): HGRNBitBlock(\n      (attn_norm): RMSNorm(2048, eps=1e-06)\n      (attn): HGRNBitAttention(\n        (i_proj): FusedBitLinear(\n          in_features=2048, out_features=2048, bias=False\n          (norm): RMSNorm(2048, eps=1e-08)\n        )\n        (f_proj): FusedBitLinear(\n          in_features=2048, out_features=2048, bias=False\n          (norm): RMSNorm(2048, eps=1e-08)\n        )\n        (g_proj): FusedBitLinear(\n          in_features=2048, out_features=2048, bias=False\n          (norm): RMSNorm(2048, eps=1e-08)\n        )\n        (g_norm): FusedRMSNormSwishGate()\n        (o_proj): FusedBitLinear(\n          in_features=2048, out_features=2048, bias=False\n          (norm): RMSNorm(2048, eps=1e-08)\n        )\n      )\n      (mlp_norm): RMSNorm(2048, eps=1e-06)\n      (mlp): HGRNBitMLP(\n        (gate_proj): FusedBitLinear(\n          in_features=2048, out_features=11264, bias=False\n          (norm): RMSNorm(2048, eps=1e-08)\n        )\n        (down_proj): FusedBitLinear(\n          in_features=5632, out_features=2048, bias=False\n          (norm): RMSNorm(5632, eps=1e-08)\n        )\n        (act_fn): SiLU()\n      )\n    )\n    \n)\n\u003e\u003e\u003e \n\n```\n\n## Generation\n\nOnce a model has been pretrained, it can generate text through the 🤗 text generation APIs.\nA generation example is provided in `generate.py`:\n\n```py\nimport os\nos.environ[\"TOKENIZERS_PARALLELISM\"] = 
\"false\"\nimport mmfreelm\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n# Any released checkpoint from the model zoo works here, e.g. the 370M model\nname = 'ridger/MMfreeLM-370M'\ntokenizer = AutoTokenizer.from_pretrained(name)\nmodel = AutoModelForCausalLM.from_pretrained(name).cuda().half()\ninput_prompt = \"In a shocking finding, scientist discovered a herd of unicorns living in a remote, \"\ninput_ids = tokenizer(input_prompt, return_tensors=\"pt\").input_ids.cuda()\noutputs = model.generate(input_ids, max_length=32, do_sample=True, top_p=0.4, temperature=0.6)\nprint(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])\n```\n\n\n\n# Citation\nIf you use this repo in your work, please cite our preprint:\n```bib\n@article{zhu2024scalable,\ntitle={Scalable MatMul-free Language Modeling},\nauthor={Zhu, Rui-Jie and Zhang, Yu and Sifferman, Ethan and Sheaves, Tyler and Wang, Yiqiao and Richmond, Dustin and Zhou, Peng and Eshraghian, Jason K},\njournal={arXiv preprint arXiv:2406.02528},\nyear={2024}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fridgerchu%2Fmatmulfreellm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fridgerchu%2Fmatmulfreellm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fridgerchu%2Fmatmulfreellm/lists"}