awesome-llama-resources
A curated collection of Llama 2 resources
https://github.com/MIBlue119/awesome-llama-resources
Models
Demo
Porting
- Karpathy's Llama2.c
- web-llm - Bringing large language models and chat to web browsers, accelerated with WebGPU
- pyllama
- Hugging Face released Swift Transformers, a [swift chat app](https://github.com/huggingface/swift-chat) and [exporters](https://github.com/huggingface/exporters) for exporting models to Core ML, to help run LLMs on Apple devices
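Ports like llama2.c implement the whole inference loop, including the sampling step. As an illustrative sketch (not code from any repository above; the deterministic `rand` argument stands in for a random draw), temperature scaling plus nucleus (top-p) sampling looks like:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p):
    """Keep the smallest prefix of tokens (by descending probability) whose
    cumulative mass reaches p, then renormalize. Returns (index, prob) pairs."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for idx, pr in ranked:
        kept.append((idx, pr))
        cum += pr
        if cum >= p:
            break
    total = sum(pr for _, pr in kept)
    return [(idx, pr / total) for idx, pr in kept]

def sample_next_token(logits, temperature=1.0, top_p=0.9, rand=0.5):
    """Temperature <= 0 means greedy decoding; otherwise `rand` in [0, 1)
    (normally drawn from an RNG) selects from the truncated distribution."""
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax([x / temperature for x in logits])
    kept = top_p_filter(probs, top_p)
    cum = 0.0
    for idx, pr in kept:
        cum += pr
        if rand < cum:
            return idx
    return kept[-1][0]
```

Lower temperatures sharpen the distribution toward the argmax; `top_p=1.0` disables truncation entirely.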
Tutorial
Models for specific usage / Fine-tuned models
- Chinese-Llama-2-7b
- Chinese-LLaMA-Alpaca
- ToolLLaMA
- Llama2-Code-Interpreter
- Llama2-Medical-Chatbot
- Finetune LLaMA 7B with Traditional Chinese instruction datasets
- Taiwan-LLaMa
- Finetuning LLaMA + Text-to-SQL - Fine-tune LLaMA 2 7B on a Text-to-SQL dataset
- Hugging Face trending models for Llama 2
- Finetuned on code with QLoRA
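Most of the fine-tunes above use LoRA/QLoRA adapters rather than full fine-tuning. A back-of-the-envelope sketch of why (function names are illustrative; 4096 is Llama 2 7B's hidden size):

```python
def lora_trainable_params(d_in, d_out, rank):
    """LoRA replaces a full d_out x d_in weight update with two low-rank
    factors B (d_out x r) and A (r x d_in): trainable = r * (d_in + d_out)."""
    return rank * (d_in + d_out)

def full_trainable_params(d_in, d_out):
    """Full fine-tuning updates every entry of the weight matrix."""
    return d_in * d_out

# One 4096x4096 attention projection in Llama 2 7B:
full = full_trainable_params(4096, 4096)      # 16,777,216 params
lora = lora_trainable_params(4096, 4096, 8)   # 65,536 params at rank 8
ratio = full / lora                           # 256x fewer trainable params
```

This is why rank-8 adapters (plus 4-bit base weights, for QLoRA) fit consumer GPUs.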
Multimodal LLM
- LLaSM: Large Language and Speech Model
- LLaVA - Large Language-and-Vision Assistant
- Chinese-LLaVA
Toolkits
- TogetherAI
- LLaMA2-Accessory - An open-source toolkit for LLM development
- LLaMA-Adapter - Fine-tuning LLaMA to follow instructions within 1 hour and 1.2M parameters
- text-generation-webui - A Gradio web UI for large language models like LLaMA, llama.cpp, GPT-J, OPT, and GALACTICA
- text-generation-inference
- FlexFlow Serve: Low-Latency, High-Performance LLM Serving - An open-source compiler and distributed system for low-latency, high-performance LLM serving
Optimization (Latency/Size)
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
- AutoGPTQ - An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm
- Optimizing LLM latency
- Series Quantized LLama2 Model from The Bloke with GPTQ/GGML
- TheBloke/llama-2-7B-Guanaco-QLoRA-GPTQ
- OpenAssistant-Llama2-13B-Orca-8K-3319-GPTQ
- Together AI's Medusa to accelerate decoding
- NVIDIA TensorRT-LLM Supercharges Large Language Model Inference on NVIDIA H100 GPUs - TensorRT-LLM is an open-source library that accelerates and optimizes inference performance of the latest LLMs on NVIDIA Tensor Core GPUs
- 2023-11-30: The PyTorch team on accelerating LLM inference with native PyTorch tools
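For context on the quantization entries: GPTQ improves on plain round-to-nearest (RTN) quantization by compensating rounding error with second-order information. A minimal sketch of the RTN baseline it improves on (illustrative only, not the GPTQ algorithm itself):

```python
def quantize_rtn(weights, bits=4):
    """Symmetric round-to-nearest quantization: map floats to signed
    integers in [-(2**(bits-1)-1), 2**(bits-1)-1] with one scale per row."""
    qmax = 2 ** (bits - 1) - 1                  # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [v * scale for v in q]

w = [0.21, -0.07, 0.35, -0.28, 0.01]
q, scale = quantize_rtn(w, bits=4)
w_hat = dequantize(q, scale)
# Per-weight reconstruction error is bounded by scale / 2.
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
```

GPTQ (and the GGML/GPTQ checkpoints from TheBloke above) keeps roughly this storage format but chooses the integer codes column by column to minimize layer output error rather than per-weight error.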
Optimization(Reasoning)
Other Resources
- LLaMA-Efficient-Tuning - An easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA) (LLaMA-2, BLOOM, Falcon, Baichuan)
- awesome-llm-and-aigc
- LLaMA 2 - Every Resource you need
- Finetune Falcon-7B on Your GPU with TRL and QLoRA - Fine-tune Falcon-7B on your consumer GPU
- A Definitive Guide to QLoRA: Fine-tuning Falcon-7b with PEFT
- Amazon sagemaker generativeai: Fine-tune Falcon-40B with QLoRA
- Llama with FlashAttention2 - >40.3GiB
- Anti-hype LLM reading list
- Llama 2 の情報まとめ - A roundup of Llama 2 information (in Japanese)
Some theory
- LLMSurvey
- Stanford CS324 - Large Language Models
- Why we should train smaller LLMs on more tokens
- Open challenges in LLM research
- Challenges and Applications of Large Language Models
- Why You (Probably) Don't Need to Fine-tune an LLM - Use few-shot prompting / Retrieval-Augmented Generation (RAG) instead
- CS221: Artificial Intelligence: Principles and Techniques
Finetune Method/ Scripts
- Finetune with PEFT
- Finetune together.ai 32k context window model
- Llama-2-7B-32K-Instruct — and fine-tuning for Llama-2 models with Together API
- Finetune with QLoRA on a 13B model
- HuggingFace SFT training script
- PyTorch Lightning's script to fine-tune Llama 2 on a custom dataset
- Instruction-tune Llama 2
- Finetune LLaMA2 7-70B on Amazon SageMaker
- Finetune LLaMA 2 with QLoRA on Colab
- Fine-tune Llama 2 with DPO by huggingface
- Fine-tune Llama 2 for specific usage like SQL generation / functional representation ([Ray fine-tuning template with DeepSpeed](https://github.com/ray-project/ray/tree/master/doc/source/templates/04_finetuning_llms_with_deepspeed))
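The DPO recipe above minimizes a simple preference loss. A per-pair sketch of that objective (following the DPO paper's formulation; this is illustrative, not the Hugging Face TRL implementation):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (policy margin - ref margin)).
    Each logp is the summed token log-likelihood of a full response under
    the policy model or the frozen reference model."""
    policy_margin = logp_chosen - logp_rejected
    ref_margin = ref_logp_chosen - ref_logp_rejected
    logits = beta * (policy_margin - ref_margin)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))  # -log(sigmoid(logits))
```

The loss drops below log 2 exactly when the policy prefers the chosen response over the rejected one by more than the reference model does, so no separate reward model is needed.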
Prompt
Use
- Run Llama 2 on your own Mac using LLM and Homebrew
- Deploy Llama 2 7B/13B/70B models on AWS SageMaker with [text-generation-inference](https://github.com/huggingface/text-generation-inference), Hugging Face's Rust, Python and gRPC server for text generation, used in production at Hugging Face to power Hugging Chat, the Inference API and Inference Endpoints
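When picking an instance for 7B/13B/70B deployments, a first sanity check is the raw weight footprint. A rough sizing helper (a rule-of-thumb sketch; the names are mine, and it ignores KV cache and activation memory):

```python
def weight_memory_gib(n_params_billion, bits_per_param):
    """Approximate GPU memory for model weights alone (excludes KV cache,
    activations, and framework overhead): params * bits / 8 bytes, in GiB."""
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / (1024 ** 3)

# Rough weight footprints for the Llama 2 family:
footprints = {s: (round(weight_memory_gib(s, 16), 1),  # fp16
                  round(weight_memory_gib(s, 4), 1))   # 4-bit quantized
              for s in (7, 13, 70)}
# e.g. 7B fp16 ~= 13.0 GiB; 70B fp16 ~= 130.4 GiB (multi-GPU territory)
```

This is why 70B serving typically means multiple GPUs or aggressive quantization, while 7B at 4 bits fits comfortably on a single consumer card.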
Move on to production
- Patterns for Building LLM-based Systems & Products
- Finetuning an LLM: RLHF and alternatives
- GitHub: A developer's guide to prompt engineering and LLMs
- The Rise and Potential of Large Language Model Based Agents: A Survey - LLM-Agent-Paper-List
Evaluation
Calculation
Some basics