https://github.com/Trae1ounG/Neural_Incompatibility
Official code for ACL'25 Main: "Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in Large Language Models"
acl2025 interpretable-machine-learning llm llm-reasoning open-source
- Host: GitHub
- URL: https://github.com/Trae1ounG/Neural_Incompatibility
- Owner: Trae1ounG
- Created: 2025-02-15T18:16:38.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-23T06:00:42.000Z (about 1 year ago)
- Last Synced: 2025-08-03T06:43:38.488Z (10 months ago)
- Topics: acl2025, interpretable-machine-learning, llm, llm-reasoning, open-source
- Language: Python
- Homepage:
- Size: 1.45 MB
- Stars: 6
- Watchers: 1
- Forks: 1
- Open Issues: 1
Awesome Lists containing this project
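- StarryDivineSky - Trae1ounG/Neural_Incompatibility - Scale Parametric Knowledge Transfer in Large Language Models": the official code implementation, focusing on the unbridgeable gap in cross-scale parametric knowledge transfer in large language models (LLMs). The study argues that when the parametric knowledge of very large models (such as GPT-3 or PaLM) is transferred to smaller models (such as LLaMA or BLOOM), a significant performance gap remains. This "neural incompatibility" is attributed to a mismatch in structured knowledge distribution caused by the difference in model scale, rather than to data or training optimization alone. Through systematic experiments, the project finds that even with the same training data and optimization strategy, a small model still struggles to reproduce the large model's reasoning ability after knowledge transfer, and that the gap widens as the difference in scale grows. The core approach is a quantitative analysis of the parametric knowledge transfer mechanism: it proposes a "Cross-Scale Parametric Incompatibility Metric" that compares parameter distribution differences, gradient flow characteristics, and knowledge density between models to expose structural obstacles in the transfer process. The code includes a complete experimental framework, supports evaluating knowledge transfer across model scales (for example, 100 million to 175 billion parameters), and provides visualization tools for analyzing parameter-level differences. The conclusions are presented as guidance for model distillation, knowledge transfer techniques, and LLM architecture design, stressing the fundamental tension between model scale and knowledge transfer efficiency and offering a theoretical basis for future research on cross-scale model collaboration. (A01_Text Generation_Text Dialogue / Large Language Dialogue Models and Data)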