# Open Foundation Models

This repository tracks the progress of open-source, open foundation models and is updated continuously. Contributions of **leads** and **materials** via **Issues** are welcome.

If you use this data, please credit the sources: the WeChat official account **走向未来** and the repository **https://github.com/wgwang/awesome-open-foundation-models**.

The Awesome family of LLM-related lists includes:
- LLM benchmark datasets: https://github.com/wgwang/awesome-LLM-benchmarks
- LLMs in China: https://github.com/wgwang/awesome-LLMs-In-China
- Open-source open foundation models: https://github.com/wgwang/awesome-open-foundation-models

Scan the QR code on WeChat to follow my official account, **走向未来**, which shares content on large models, AGI, knowledge graphs, deep learning, reinforcement learning, computer vision, natural language processing, and other AI topics.

![](imgs/走向未来.jpg)

**Please give this repository a Star; it only takes a second!**

## List of Open Foundation Models

- The two dominant open-source model families in China:
  - Qwen
  - DeepSeek
  - Both are based in Hangzhou; Hangzhou scores big!
- The two dominant open-source model families abroad:
  - LLaMA (USA)
  - Mistral (Europe)

|No.|Name|Parameters|Training Tokens|Release Date|Notes|
|:-|:-|:-|:-|:-|:-|
|1|[LLaMA-2](Open-LLMs/llama2.md)|7B,13B,34B,70B|2T|2023-07-18|Commercial use permitted|
|2|[Falcon](Open-LLMs/falcon.md)|7B,40B,180B|3.5T|2023-09-06|Dataset: [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb)|
|3|[baichuan-2](Open-LLMs/baichuan2.md)|7B,13B|2.6T|2023-09-06|Successor to [baichuan-1](Open-LLMs/baichuan.md)|
|4|[InternLM](Open-LLMs/internlm.md)|7B,20B|2.3T||Open; commercial use requires a license|
|5|[BLOOM](Open-LLMs/bloom.md)|3B,7.1B,176B|366B||[Details](https://mp.weixin.qq.com/s/ia-yrmXbnlooRA3K1hoTwQ)|
|6|GALACTICA|6.7B,30B,120B|106B||Open scientific text and data|
|7|[LLaMA-1](Open-LLMs/llama.md)|7B,13B,30B,65B|1.4T|2023-02-24|[Details](https://mp.weixin.qq.com/s/dKInMi6P80GXecUtR3WQsA)|
|8|MOSS-moon|16B|700B||6.67×10²² training FLOPs|
|9|[ChatGLM3](https://github.com/THUDM/ChatGLM3)|6B|1.4T|2023-10-25||
|10|StableLM|3B,7B|800B|||
|11|RedPajama-INCITE|3B,7B|1T|||
|12|GPT-NeoX|20B|472B||800GB [The Pile](https://arxiv.org/abs/2101.00027) dataset|
|13|OpenLLaMA|3B,7B,13B|1T|||
|14|MPT|7B,30B|1T|||
|15|Pythia|2.8B,6.9B,12B|300B|||
|16|XGen|7B|1.5T|||
|17|OPT|6.7B,13B,30B,66B,175B|180B|||
|18|[Qwen-1](Open-LLMs/qwen.md)|7B,14B,72B|2.4T,3.0T,3.0T|||
|19|XVERSE|13B,65B|1.4T,2.6T|||
|20|[Aquila2](https://github.com/FlagAI-Open/Aquila2)|7B,34B|2T|||
|21|Prithvi||||IBM + NASA; geospatial; 100M (images)|
|22|[Skywork](Open-LLMs/skywork.md)|13B|3.2T|2023-10-22|Kunlun Tech's Tiangong (昆仑万维·天工)|
|23|[Deepseek Coder](https://github.com/deepseek-ai/DeepSeek-Coder)|1.3B,6.7B,33B|2T||87% code, 13% Chinese and English text|
|24|Aquila|7B||2023-06-08|Wudao Aquila (悟道·天鹰)|
|25|Yi|6B,34B|3T|2023-11-04|01.AI (零一万物)|
|26|Mistral|7B|||Europe|
|27|Yuan-2|2B,51B,102B|288B|2023-12-21|Yuan (源)|
|28|Mistral MoE|46.7B||2023-12-11|8×7B MoE; 12.9B active / 46.7B total|
|29|BlueLM|7B|2.6T|2023-10-16|https://github.com/vivo-ai-lab/BlueLM|
|30|YAYI 2|30B|2.65T|2023-12-15|https://github.com/wenge-research/YAYI2|
|31|CodeShell|7B|0.5T|2023-09-22|https://github.com/WisdomShell/codeshell|
|32|[DeepSeek-V2](https://github.com/deepseek-ai/DeepSeek-V2)|236B|8.1T|2024-05-06|MoE; 21B active / 236B total; 160 routed experts plus 2 shared|
|33|[LLaMA-3](./Open-LLMs/llama3.md)|8B,70B|15T|2024-04-18||
|34|[QWen-1.5](https://github.com/QwenLM/Qwen1.5)|0.5B,1.8B,4B,7B,14B,32B,72B|2.4T,2.4T,2.4T,4T,4T,3T,3T|2024-02-06||
|35|[QWen-1.5-110B](https://github.com/QwenLM/Qwen1.5)|110B|2.2T|2024-04-24||
|36|[DBRX](https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm)|132B|12T|2024-03-27|MoE; 36B active / 132B total; 4 of 16 experts active|
|37|[Gemma](https://huggingface.co/blog/gemma)|2B,7B|6T|2024-02-20|For on-device use|
|38|[Yi-1.5](https://github.com/01-ai/Yi-1.5)|6B,9B,34B|3.5T|2024-05-13|01.AI; Yi with continued pre-training on an additional 500B tokens|
|39|Yi-VL|6B,34B||2024-01-18|CLIP ViT-H/14 + Yi-6B/34B-Chat; 100M image-text pairs|
|40|[Hunyuan-DiT](https://github.com/Tencent/HunyuanDiT)|1.5B||2024-05-13|Tencent Hunyuan text-to-image|
|41|Command-R+|104B|4T|2024-04-03|Focused on RAG|
|42|Mistral-8X22B|141B||2024-04-17|8×22B MoE; 39B active / 141B total|
|43|[Arctic](https://github.com/Snowflake-Labs/snowflake-arctic)|480B|3.5T|2024-04-22|Dense (10B) + MoE (128×3.66B); 17B active / 480B total; 3-stage training of 1T + 1.5T + 1T tokens|
|44|[Grok-1](https://github.com/xai-org/grok-1)|314B||2024-03-17|MoE, 8 experts|
|45|[InternLM2](https://github.com/InternLM/InternLM)|7B,20B|2.3T|2023-09-18||
|46|StableLM-2|12B|2T|2024-03-21||
|47|Nemotron-4|340B|9T|2024-06-14||
|48|[Qwen-2](./Open-LLMs/qwen2.md)|0.5B,1.5B,7B,57B,72B|7T|2024-05-23|The 57B model is an MoE with 8 of 64 experts active|
|49|Skywork-MoE|146B||2024-06-03|MoE of 16×13B experts; 22B active|
|50|Gemma-2|9B,27B|8T,13T|2024-06-27||
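
Two kinds of figures in the table repay a quick sanity check: the "active / total" parameter splits quoted for the MoE rows, and MOSS-moon's 6.67×10²² FLOPs, which is consistent with the common 6·N·D estimate of training compute (roughly 6 FLOPs per parameter per training token). The sketch below is a back-of-envelope check under those standard approximations, not official numbers; the shared-parameter figure for Mistral MoE is inferred here, not published above.

```python
# Back-of-envelope checks for figures in the table above.

def train_flops(n_params: float, n_tokens: float) -> float:
    """Standard 6*N*D estimate: ~6 FLOPs per parameter per training token."""
    return 6 * n_params * n_tokens

# Row 8, MOSS-moon: 16B parameters on 700B tokens.
print(f"MOSS-moon: {train_flops(16e9, 700e9):.3e} FLOPs")  # ~6.7e+22

def moe_active_params(shared: float, per_expert: float, k: int) -> float:
    """Rough MoE active-parameter count: parameters every token touches
    (attention, embeddings, shared layers) plus the k expert FFNs it is
    routed through."""
    return shared + k * per_expert

# Row 28, Mistral MoE (8 experts, 2 routed per token). Solving
#   total  = shared + 8 * per_expert = 46.7B
#   active = shared + 2 * per_expert = 12.9B
# gives per_expert ~ 5.63B and shared ~ 1.63B (inferred, not published above).
print(f"Mistral MoE active: {moe_active_params(1.63e9, 5.63e9, 2):.3g}")  # ~1.29e+10
```

The same arithmetic applies to the DBRX, Mistral-8X22B, Arctic, and Skywork-MoE rows.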

## Non-Foundation Models
- WizardLM, WizardMath, WizardCoder
- Alpaca
- Vicuna
- Guanaco
- [CodeLLaMA](Open-LLMs/codellama.md)
  - 7B, 13B, 34B; based on LLaMA-2, with continued pre-training and fine-tuning on roughly 650B additional code tokens (see the sketch after this list)
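
The CodeLLaMA recipe above, continued pre-training on extra domain tokens, is conceptually simple: resume causal-LM training of the base checkpoint on new data. Below is a minimal sketch using the Hugging Face `Trainer`; the `meta-llama/Llama-2-7b-hf` checkpoint (gated on the Hub) and the local `code_corpus.txt` file are illustrative placeholders, and real runs of this kind use large distributed setups.

```python
# Minimal sketch of continued pre-training, the recipe behind derivatives
# like CodeLLaMA: keep training a base checkpoint on new domain data.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_id = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint (gated)
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship no pad token
model = AutoModelForCausalLM.from_pretrained(base_id)

# Any raw code/text corpus works; this file name is a placeholder.
raw = load_dataset("text", data_files={"train": "code_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

train = raw.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ckpt", per_device_train_batch_size=1,
                           gradient_accumulation_steps=16, learning_rate=3e-5,
                           num_train_epochs=1, bf16=True),
    train_dataset=train,
    data_collator=collator,
)
trainer.train()
```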

## Model Architectures

- [GPTQ](https://github.com/IST-DASLab/gptq)
- [LLaMA](https://github.com/facebookresearch/llama)
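
Most checkpoints listed above publish weights on the Hugging Face Hub and load through the `transformers` library. The following is a minimal, illustrative sketch; the model id `Qwen/Qwen1.5-0.5B` is just one small checkpoint from the table, and `device_map="auto"` additionally requires the `accelerate` package.

```python
# Minimal sketch: loading and sampling from one of the open checkpoints above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-0.5B"  # illustrative choice; any causal-LM id works

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halves memory; use float32 on CPU-only setups
    device_map="auto",          # requires `accelerate`; places weights automatically
)

inputs = tokenizer("Open foundation models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```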

# WeChat Official Account: 走向未来

Scan the QR code to follow the WeChat official account **走向未来**, which focuses on AGI, large models, knowledge graphs, deep learning, reinforcement learning, and related technologies, system architectures, application scenarios, and case studies.

![](imgs/the-land-of-future.jpeg)

# The Everest Book (珠峰书)

The Everest Book, *Knowledge Graph: Cognitive Intelligence Theory and Practice* (《知识图谱:认知智能理论与实战》), comprehensively covers knowledge graph construction, storage, and application technologies, along with Transformers and graph neural networks. You are welcome to buy a copy. Specifically, it includes:
- Full coverage of knowledge graph schema design, knowledge extraction, graph databases, knowledge computing, knowledge reasoning, knowledge-based question answering, knowledge-based recommendation, and more
- A detailed treatment of the Transformer model and how to implement it, the foundational technology behind large models
- The first book published in China to discuss vector databases
- A brief introduction to multimodal knowledge fusion; using "the moon" as an example, the book argues that pictures of the moon, the word for the moon in different languages, and the moon's pronunciation should all be fused into a single knowledge node, which is exactly what multimodal large models do
- Reflections on intelligence drawn from neuroscience, brain science, and philosophy

![](imgs/kgbook.jpg)