Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

awesome-mlvid

moment localization in videos
https://github.com/jiny419/awesome-mlvid

  • Attention Is All You Need
  • Improving Language Understanding by Generative Pre-Training
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  • Language Models are Unsupervised Multitask Learners
  • Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
  • Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  • ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
  • Scaling Laws for Neural Language Models
  • Language Models are Few-Shot Learners
  • Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
  • Evaluating Large Language Models Trained on Code
  • Awesome-LLM-hallucination - LLM hallucination paper list.
  • Open LLM Leaderboard - aims to track, rank and evaluate LLMs and chatbots as they are released.
  • Chatbot Arena Leaderboard - a benchmark platform for large language models (LLMs) that features anonymous, randomized battles in a crowdsourced manner.
  • AlpacaEval Leaderboard - an automatic evaluator for instruction-following language models.
  • Open Ko-LLM Leaderboard - objectively evaluates the performance of Korean Large Language Models (LLMs).
  • Yet Another LLM Leaderboard - Leaderboard made with LLM AutoEval using Nous benchmark suite.
  • OpenCompass 2.0 LLM Leaderboard - OpenCompass is an LLM evaluation platform supporting a wide range of models (InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
| Model | Access | Release | Paper/Blog | License |
|---|---|---|---|---|
| GPT-3 | api | 2020-05 | [Paper](https://arxiv.org/pdf/2005.14165.pdf) | - |
| CPM | api | 2020-12 | [Paper](https://arxiv.org/pdf/2012.00413.pdf) | - |
| Jurassic-1 | api | 2021-08 | [Paper](https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf) | - |
| InstructGPT | api | 2022-03 | [Paper](https://arxiv.org/pdf/2203.02155.pdf) | - |
| OPT-175B | api | 2022-05 | [Paper](https://arxiv.org/pdf/2205.01068.pdf) | [OPT-175B License Agreement](https://github.com/facebookresearch/metaseq/blob/edefd4a00c24197486a3989abe28ca4eb3881e59/projects/OPT/MODEL_LICENSE.md) |
| GLM-130B | ckpt | 2022-10 | [Paper](https://arxiv.org/pdf/2210.02414.pdf) | [The GLM-130B License](https://github.com/THUDM/GLM-130B/blob/799837802264eb9577eb9ae12cd4bad0f355d7d6/MODEL_LICENSE) |
| BLOOM | api | 2022-11 | [Paper](https://arxiv.org/pdf/2211.05100.pdf) | [BigScience RAIL License v1.0](https://huggingface.co/spaces/bigscience/license) |
| ChatGPT | demo | 2022-11 | [Blog](https://openai.com/blog/chatgpt/) | - |
| Alpaca | demo | 2023-03 | [GitHub](https://github.com/tatsu-lab/stanford_alpaca) | [CC BY NC 4.0](https://github.com/tatsu-lab/stanford_alpaca/blob/main/WEIGHT_DIFF_LICENSE) |
| Claude | demo | 2023-03 | [Blog](https://www.anthropic.com/index/introducing-claude) | - |
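
The api rows above are served through hosted endpoints rather than released weights. As a minimal sketch of that access pattern, here is a chat completion call via the OpenAI Python SDK; the model name and prompt are placeholders, not endpoints for the specific models in the table:

```python
# Sketch: querying an api-access model through the OpenAI Python SDK (v1+).
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder; use any model your key can access
    messages=[{"role": "user", "content": "Summarize the ZeRO paper in one sentence."}],
    max_tokens=60,
)
print(response.choices[0].message.content)
```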
  • Gemma - a family of lightweight open models built for responsible AI development, from the same research and technology used to create the Gemini models.
  • Mistral - Mistral-7B-v0.1 is a small yet powerful model adaptable to many use cases, including code, with an 8k sequence length; Apache 2.0 license (see the loading sketch after this list).
  • Mixtral 8x7B - a high-quality sparse mixture of experts model (SMoE) with open weights.
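
A minimal sketch of running one of these open-weight models locally with Hugging Face transformers; the checkpoint ID mistralai/Mistral-7B-v0.1 is the published name, while the dtype, device placement, and prompt are illustrative choices:

```python
# Sketch: text generation with an open-weight model via Hugging Face transformers.
# Assumes `pip install transformers accelerate torch` and enough GPU memory;
# swap the model ID for any supported checkpoint listed above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halve memory vs. float32
    device_map="auto",          # requires accelerate; spreads layers across devices
)

inputs = tokenizer("The key idea behind mixture-of-experts is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```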
  • DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective (a minimal usage sketch follows this list).
  • Megatron-DeepSpeed - DeepSpeed version of NVIDIA's Megatron-LM that adds additional support for several features such as MoE model training, Curriculum Learning, 3D Parallelism, and others.
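
A rough sketch of how DeepSpeed's ZeRO sharding (see the ZeRO paper above) attaches to a training loop; the toy model and config values are placeholders, and the script is assumed to be launched with the deepspeed CLI:

```python
# Sketch of a DeepSpeed training step with ZeRO stage 2 optimizer sharding.
# Assumes `pip install deepspeed torch`; launch with `deepspeed train.py`.
import torch
import deepspeed

model = torch.nn.Sequential(  # toy stand-in for a real transformer
    torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024)
)

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # shard optimizer state and gradients
}

# deepspeed.initialize wraps the model with data parallelism, fp16, and ZeRO.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(8, 1024, device=engine.device, dtype=torch.half)
loss = engine(x).float().pow(2).mean()  # dummy loss for illustration
engine.backward(loss)  # DeepSpeed handles loss scaling and gradient sharding
engine.step()          # optimizer step + gradient zeroing
```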
  • Maarten Grootendorst
  • Jack Cook
  • UWaterloo
  • DeepLearning.AI
  • Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs - it comes with a [GitHub repository](https://github.com/benman1/generative_ai_with_langchain) that showcases a lot of the functionality
  • Build a Large Language Model (From Scratch) - A guide to building your own working LLM.
  • A Stage Review of Instruction Tuning - [2023-06-29] [Yao Fu]
  • Large Language Models: A New Moore's Law - [2021-10-26] [Hugging Face]
  • Arize-Phoenix - open-source tool for ML observability that runs in your notebook environment; monitor and fine-tune LLM, CV, and tabular models (a quick-start sketch follows this list).
  • Emergent Mind - The latest AI news, curated & explained by GPT-4.
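
As a hedged quick-start for the Arize-Phoenix entry above, launching the local observability UI from a notebook; assumes the arize-phoenix package is installed:

```python
# Sketch: start the Phoenix UI locally to inspect LLM traces and datasets.
# Assumes `pip install arize-phoenix`.
import phoenix as px

# launch_app starts a local Phoenix server and returns a session handle.
session = px.launch_app()
print(f"Phoenix UI available at: {session.url}")
```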