Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-mlvid
moment localization in videos
https://github.com/jiny419/awesome-mlvid
- Attention Is All You Need
- Improving Language Understanding by Generative Pre-Training
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Language Models are Unsupervised Multitask Learners
- Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
- Scaling Laws for Neural Language Models
- Language Models are Few-Shot Learners
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
- Evaluating Large Language Models Trained on Code
- Awesome-LLM-hallucination - LLM hallucination paper list.
- Open LLM Leaderboard - aims to track, rank and evaluate LLMs and chatbots as they are released.
- Chatbot Arena Leaderboard - a benchmark platform for large language models (LLMs) that features anonymous, randomized battles in a crowdsourced manner.
- AlpacaEval Leaderboard - An Automatic Evaluator for Instruction-following Language Models
- Open Ko-LLM Leaderboard - objectively evaluates the performance of Korean large language models (LLMs).
- Yet Another LLM Leaderboard - Leaderboard made with LLM AutoEval using Nous benchmark suite.
- OpenCompass 2.0 LLM Leaderboard - OpenCompass is an LLM evaluation platform, supporting a wide range of models (InternLM2, GPT-4, Llama 2, Qwen, GLM, Claude, etc.) over 100+ datasets.
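Chatbot Arena derives its rankings from crowdsourced pairwise battles using Elo-style ratings. A minimal sketch of one rating update, assuming the standard logistic expected score and a hypothetical K-factor of 32 (the real leaderboard uses more elaborate statistical modeling):

```python
def elo_update(r_a, r_b, score_a, k=32):
    """One pairwise battle: score_a is 1 (A wins), 0 (B wins), or 0.5 (tie).

    Returns the updated ratings for A and B; the winner gains exactly
    what the loser gives up.
    """
    expect_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))  # logistic expected score
    delta = k * (score_a - expect_a)
    return r_a + delta, r_b - delta

# Two equally rated models; A wins, so A gains k/2 = 16 points.
a, b = elo_update(1000.0, 1000.0, 1.0)  # -> (1016.0, 984.0)
```

A tie against a lower-rated opponent still costs the favorite points, since its expected score exceeds 0.5.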
- Jurassic-1 - api | [Paper](https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf)
- OPT - api | [Paper](https://arxiv.org/pdf/2205.01068.pdf) | [OPT-175B License Agreement](https://github.com/facebookresearch/metaseq/blob/edefd4a00c24197486a3989abe28ca4eb3881e59/projects/OPT/MODEL_LICENSE.md)
- BLOOM - api | [Paper](https://arxiv.org/pdf/2211.05100.pdf) | [BigScience RAIL License v1.0](https://huggingface.co/spaces/bigscience/license)
- GPT-3 - api | [Paper](https://arxiv.org/pdf/2005.14165.pdf)
- GLM-130B - ckpt | [Paper](https://arxiv.org/pdf/2210.02414.pdf) | [The GLM-130B License](https://github.com/THUDM/GLM-130B/blob/799837802264eb9577eb9ae12cd4bad0f355d7d6/MODEL_LICENSE)
- CPM - api | [Paper](https://arxiv.org/pdf/2012.00413.pdf)
- InstructGPT - api | [Paper](https://arxiv.org/pdf/2203.02155.pdf)
- Alpaca - demo | [GitHub](https://github.com/tatsu-lab/stanford_alpaca) | [CC BY-NC 4.0](https://github.com/tatsu-lab/stanford_alpaca/blob/main/WEIGHT_DIFF_LICENSE)
- ChatGPT - demo | 2022-11 | [Blog](https://openai.com/blog/chatgpt/)
- Claude - demo | [Blog](https://www.anthropic.com/index/introducing-claude)
- Gemma - Gemma is built for responsible AI development from the same research and technology used to create Gemini models.
- Mistral - Mistral-7B-v0.1 is a small yet powerful model adaptable to many use cases, including code, with an 8k sequence length. Apache 2.0 license.
- Mixtral 8x7B - a high-quality sparse mixture of experts model (SMoE) with open weights.
- DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
- Megatron-DeepSpeed - DeepSpeed version of NVIDIA's Megatron-LM that adds additional support for several features such as MoE model training, Curriculum Learning, 3D Parallelism, and others.
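Mixtral-style sparse mixture-of-experts (SMoE) layers route each token to a small subset of expert networks and mix their outputs. A minimal sketch of top-2 routing in plain NumPy, where `router_w` and `experts` are hypothetical stand-ins for learned parameters and expert networks:

```python
import numpy as np

def top2_moe(x, router_w, experts, k=2):
    """Route token embedding x (shape (d,)) to the top-k experts.

    router_w: (n_experts, d) router weight matrix
    experts:  list of callables, each mapping (d,) -> (d,)
    """
    logits = router_w @ x
    top = np.argsort(logits)[-k:]             # indices of the k highest-scoring experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                              # softmax over the selected experts only
    # Weighted mix of the selected experts' outputs; the rest are never evaluated.
    return sum(wi * experts[i](x) for wi, i in zip(w, top))
```

Because only k of n experts run per token, compute scales with k while parameter count scales with n, which is the appeal of SMoE models like Mixtral 8x7B.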
- Maarten Grootendorst
- Jack Cook
- UWaterloo
- DeepLearning.AI
- Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs - comes with a [GitHub repository](https://github.com/benman1/generative_ai_with_langchain) that showcases much of the functionality.
- Build a Large Language Model (From Scratch) - A guide to building your own working LLM.
- A Stage Review of Instruction Tuning [06-29] [Yao Fu]
- Large Language Models: A New Moore's Law [10-26] [Huggingface]
- Arize-Phoenix - Open-source tool for ML observability that runs in your notebook environment. Monitor and fine-tune LLM, CV, and tabular models.
- Emergent Mind - The latest AI news, curated & explained by GPT-4.