{"id":48818231,"url":"https://github.com/qinggo/engram-peft","last_synced_at":"2026-04-14T13:02:10.443Z","repository":{"id":350984625,"uuid":"1208593986","full_name":"QingGo/engram-peft","owner":"QingGo","description":"🚀 Engram-PEFT: An unofficial implementation of DeepSeek Engram. Inject high-capacity conditional memory into LLMs via sparse retrieval PEFT without increasing inference FLOPs / DeepSeek Engram 架构的非官方实现。通过参数高效微调 (PEFT) 为大语言模型注入超大规模条件记忆，支持稀疏更新且不增加推理开销。","archived":false,"fork":false,"pushed_at":"2026-04-13T05:16:55.000Z","size":1158,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-04-13T05:22:27.593Z","etag":null,"topics":["deepseek","engram","fine-tuning","llm","peft","pytorch","sparse-retrieval"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/QingGo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-12T13:53:23.000Z","updated_at":"2026-04-13T05:16:24.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/QingGo/engram-peft","commit_stats":null,"previous_names":["qinggo/engram-peft"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/QingGo/engram-peft","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QingGo%2Fengram-peft","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/QingGo%2Fengram-peft/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QingGo%2Fengram-peft/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QingGo%2Fengram-peft/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/QingGo","download_url":"https://codeload.github.com/QingGo/engram-peft/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QingGo%2Fengram-peft/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31797376,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-14T11:13:53.975Z","status":"ssl_error","status_checked_at":"2026-04-14T11:13:53.299Z","response_time":153,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deepseek","engram","fine-tuning","llm","peft","pytorch","sparse-retrieval"],"created_at":"2026-04-14T13:02:06.492Z","updated_at":"2026-04-14T13:02:10.436Z","avatar_url":"https://github.com/QingGo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Engram-PEFT\n\n[English] | [中文](README_zh.md)\n\n\u003e [!IMPORTANT]\n\u003e This is an **unofficial implementation** of the DeepSeek Engram paper ([arXiv:2601.07372](https://arxiv.org/abs/2601.07372)). 
[DeepSeek-AI official demo is here](https://github.com/deepseek-ai/Engram).\n\n[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)\n[![Documentation](https://img.shields.io/badge/Docs-MkDocs-blue.svg)](https://qinggo.github.io/engram-peft/)\n\n**Engram-PEFT** is a high-performance, 100% paper-aligned implementation of the DeepSeek Engram architecture. It provides a Parameter-Efficient Fine-Tuning (PEFT) interface to inject conditional memory into any Transformer-based LLM.\n\nEngram decouples **static knowledge storage** from **dynamic reasoning** using a sparse retrieval mechanism, allowing models to scale their factual memory without increasing inference FLOPs or interfering with core logic.\n\n---\n\n## 🚀 Quick Start\n\n### Installation\n\n```bash\npip install engram-peft\n```\n\nTo run examples or contribute to development, install the project with development dependencies:\n\n```bash\n# Using uv (recommended)\nuv sync --all-groups\n\n# Using pip\npip install -e \".[dev]\"\n```\n\n### 5-Minute Example\n\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\nfrom engram_peft import EngramConfig, get_engram_model\n\n# 1. Load base model\nbase_model = AutoModelForCausalLM.from_pretrained(\"TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T\")\ntokenizer = AutoTokenizer.from_pretrained(\"TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T\")\n\n# 2. Inject Engram layers (aligned with arXiv:2601.07372)\nconfig = EngramConfig(target_layers=[2, 11, 20])\nmodel = get_engram_model(base_model, config, tokenizer)\n\n# 3. 
Quick check on trainable parameters\nmodel.print_trainable_parameters()\n# trainable params: 86,938,368 || all params: 1,186,986,752 || trainable%: 7.3243\n```\n\n---\n\n## 📊 Performance Comparison\n\n| Method | Params Added | Speed (s/step) | Training Loss | Eval Loss | Peak Memory (JSON) |\n| :--- | :--- | :--- | :--- | :--- | :--- |\n| **LoRA** (r=16) | ~2.25 M | **0.2738 s** | 1.231 | 0.9890 | 8.07 GB |\n| **Engram-PEFT** | **545.4 M** | 0.2961 s | 1.263 | 1.0165 | 9.38 GB |\n| **LoRA+Engram** | ~547.7 M | 0.3360 s | **1.214** | **0.9656** | 10.33 GB |\n\n\u003e [!TIP]\n\u003e **Performance Insight**: In our latest benchmark (Test 8 \u0026 9, TinyLlama-1.1B, 3000 steps), **LoRA+Engram** achieved the best convergence (lowest eval loss), outperforming standalone LoRA by ~2.3%. Engram-PEFT provides **240x more parameter capacity** (545M) for knowledge storage with minimal latency penalty. Use LoRA+Engram to leverage both structural adaptation and high-capacity sparse memory.\n\n### Loss Curve Comparison\n![Loss Curve Comparison](figures/loss_curve.png)\n\n*\\* Engram employs sparse lookup; only a tiny fraction of parameters (approx. 1%) are active and receive gradient updates per step. For a detailed breakdown of performance, computation, and memory, see our [Performance Analysis](docs/compare_engram_lora_analysis.md).*\n\n---\n\n## 🛠 Features\n\n- **100% Paper Alignment**: Implements Appendix A Table 5 parameters and the official DeepSeek gating/hashing logic.\n- **CPU Prefetching \u0026 Precomputation**: `EngramDataCollator` pre-calculates multi-head hash indices on the CPU. 
By using `num_workers \u003e 0`, these indices are prefetched in parallel with training, ensuring zero hashing overhead on the GPU.\n- **Tokenizer Compression**: Built-in NFKC and lowercase normalization for 23% vocabulary reduction.\n- **Cross-Model Weight Migration**: A unique feature (see `weight_transfer.py`) that allows migrating Engram weights between different models (e.g., Llama to Qwen) using character-level alignment on a corpus, effectively \"recycling\" learned knowledge.\n- **Non-Invasive**: Injects via forward hooks; no modification to your base model architecture required.\n- **PEFT-like API**: Familiar methods like `print_trainable_parameters()` and `save_pretrained()`.\n- **Combined Training (LoRA+Engram)**: Support for stacking adapters. Injects LoRA for structural fine-tuning and Engram for sparse knowledge retrieval in a single model.\n- **Named Adapters**: Industry-standard named adapter management (add/set/unload) for modular knowledge packs.\n- **Automated Training**: Native `EngramTrainer` with built-in sparse Adam support and automatic sync of optimizer hyperparameters.\n- **Flexible Layer Discovery**: Recursive logic to find transformer layers regardless of PEFT wrapper nesting.\n\n---\n\n## 📖 Documentation\n\nFor full details, see our documentation:\n- [Tutorials](docs/tutorial.md): Quickstart and domain knowledge injection.\n- [API Reference](docs/api.md): Detailed class and function documentation.\n- [Paper Alignment](docs/paper_alignment.md): How we match the DeepSeek research.\n\n---\n\n## 🎯 Citation\n\nIf you use this implementation in your research, please cite the original DeepSeek paper:\n\n```bibtex\n@article{deepseek2026engram,\n  title={Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models},\n  author={DeepSeek-AI},\n  journal={arXiv preprint arXiv:2601.07372},\n  year={2026}\n}\n```\n\n---\n\n## License\n\nApache License 2.0. 
See [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqinggo%2Fengram-peft","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqinggo%2Fengram-peft","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqinggo%2Fengram-peft/lists"}