Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-mlvid
moment localization in videos
https://github.com/jiny419/awesome-mlvid
- Attention Is All You Need
- Improving Language Understanding by Generative Pre-Training
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Language Models are Unsupervised Multitask Learners
- Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
- Scaling Laws for Neural Language Models
- Language Models are Few-Shot Learners
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
- Evaluating Large Language Models Trained on Code
- Awesome-LLM-hallucination - LLM hallucination paper list.
- Open LLM Leaderboard - aims to track, rank and evaluate LLMs and chatbots as they are released.
- Chatbot Arena Leaderboard - a benchmark platform for large language models (LLMs) that features anonymous, randomized battles in a crowdsourced manner.
- AlpacaEval Leaderboard - An Automatic Evaluator for Instruction-following Language Models
- Open Ko-LLM Leaderboard - objectively evaluates the performance of Korean large language models (LLMs).
- Yet Another LLM Leaderboard - Leaderboard made with LLM AutoEval using Nous benchmark suite.
- OpenCompass 2.0 LLM Leaderboard - OpenCompass is an LLM evaluation platform, supporting a wide range of models (InternLM2, GPT-4, Llama 2, Qwen, GLM, Claude, etc.) over 100+ datasets.
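Chatbot Arena derives its rankings from crowdsourced pairwise battles using Elo-style ratings. A minimal sketch of one rating update, assuming the standard logistic expected score and a hypothetical K-factor of 32 (the real leaderboard uses more elaborate statistical modeling):

```python
def elo_update(r_a, r_b, score_a, k=32):
    """One pairwise battle: score_a is 1 (A wins), 0 (B wins), or 0.5 (tie).

    Returns the updated ratings for A and B; the winner gains exactly
    what the loser gives up.
    """
    expect_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))  # logistic expected score
    delta = k * (score_a - expect_a)
    return r_a + delta, r_b - delta

# Two equally rated models; A wins, so A gains k/2 = 16 points.
a, b = elo_update(1000.0, 1000.0, 1.0)  # -> (1016.0, 984.0)
```

A tie against a lower-rated opponent still costs the favorite points, since its expected score exceeds 0.5.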
- Jurassic-1 - api | [Paper](https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf)
- OPT - api | [Paper](https://arxiv.org/pdf/2205.01068.pdf) | [OPT-175B License Agreement](https://github.com/facebookresearch/metaseq/blob/edefd4a00c24197486a3989abe28ca4eb3881e59/projects/OPT/MODEL_LICENSE.md)
- BLOOM - api | [Paper](https://arxiv.org/pdf/2211.05100.pdf) | [BigScience RAIL License v1.0](https://huggingface.co/spaces/bigscience/license)
- GPT-3 - api | [Paper](https://arxiv.org/pdf/2005.14165.pdf)
- GLM-130B - ckpt | [Paper](https://arxiv.org/pdf/2210.02414.pdf) | [The GLM-130B License](https://github.com/THUDM/GLM-130B/blob/799837802264eb9577eb9ae12cd4bad0f355d7d6/MODEL_LICENSE)
- CPM - api | [Paper](https://arxiv.org/pdf/2012.00413.pdf)
- InstructGPT - api | [Paper](https://arxiv.org/pdf/2203.02155.pdf)
- Alpaca - demo | [GitHub](https://github.com/tatsu-lab/stanford_alpaca) | [CC BY-NC 4.0](https://github.com/tatsu-lab/stanford_alpaca/blob/main/WEIGHT_DIFF_LICENSE)
- ChatGPT - demo | 2022-11 | [Blog](https://openai.com/blog/chatgpt/)
- Claude - demo | [Blog](https://www.anthropic.com/index/introducing-claude)
- Gemma - Gemma is built for responsible AI development from the same research and technology used to create Gemini models.
- Mistral - Mistral-7B-v0.1 is a small yet powerful model adaptable to many use cases, including code, with an 8k sequence length. Apache 2.0 license.
- Mixtral 8x7B - a high-quality sparse mixture of experts model (SMoE) with open weights.
- DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
- Megatron-DeepSpeed - DeepSpeed version of NVIDIA's Megatron-LM that adds additional support for several features such as MoE model training, Curriculum Learning, 3D Parallelism, and others.
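Mixtral-style sparse mixture-of-experts (SMoE) layers route each token to a small subset of expert networks and mix their outputs. A minimal sketch of top-2 routing in plain NumPy, where `router_w` and `experts` are hypothetical stand-ins for learned parameters and expert networks:

```python
import numpy as np

def top2_moe(x, router_w, experts, k=2):
    """Route token embedding x (shape (d,)) to the top-k experts.

    router_w: (n_experts, d) router weight matrix
    experts:  list of callables, each mapping (d,) -> (d,)
    """
    logits = router_w @ x
    top = np.argsort(logits)[-k:]             # indices of the k highest-scoring experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                              # softmax over the selected experts only
    # Weighted mix of the selected experts' outputs; the rest are never evaluated.
    return sum(wi * experts[i](x) for wi, i in zip(w, top))
```

Because only k of n experts run per token, compute scales with k while parameter count scales with n, which is the appeal of SMoE models like Mixtral 8x7B.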
- Maarten Grootendorst
- Jack Cook
- UWaterloo
- DeepLearning.AI
- Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs - comes with a [GitHub repository](https://github.com/benman1/generative_ai_with_langchain) that showcases much of the functionality.
- Build a Large Language Model (From Scratch) - A guide to building your own working LLM.
- A Stage Review of Instruction Tuning [06-29] [Yao Fu]
- Large Language Models: A New Moore's Law [10-26] [Huggingface]
- Arize-Phoenix - Open-source tool for ML observability that runs in your notebook environment. Monitor and fine-tune LLM, CV, and tabular models.
- Emergent Mind - The latest AI news, curated & explained by GPT-4.