awesome-llama-resources
A collection of resources about Llama 2.
https://github.com/MIBlue119/awesome-llama-resources
Models
Demo
Porting
- Karpathy's Llama2.c
- web-llm - Bringing large language models and chat to web browsers
- pyllama
- HuggingFace releases Swift Transformers to help run LLMs on Apple devices - [swift-transformers](https://github.com/huggingface/swift-transformers), a [swift chat app](https://github.com/huggingface/swift-chat), and [exporters](https://github.com/huggingface/exporters) for exporting models to Core ML.
Tutorial
Models for Specific Usage / Finetuned Models
- Chinese-Llama-2-7b
- Chinese-LLaMA-Alpaca
- ToolLLaMA
- Llama2-Code-Interpreter
- Llama2-Medical-Chatbot
- Finetune LLaMA 7B with Traditional Chinese instruction datasets
- Taiwan-LLaMa
- Finetuning LLaMa + Text-to-SQL - Fine-tune LLaMa 2 7B on a Text-to-SQL dataset
- Hugging Face trending models related to Llama 2
- Finetuned on code with qLoRA
Multimodal LLM
- LLaSM: Large Language and Speech Model
- LLaVA - Large Language and Vision Assistant
- Chinese-LLaVA
Toolkits
- TogetherAI
- LLaMA2-Accessory - An open-source toolkit for LLM development
- LLaMA-Adapter - Fine-tuning LLaMA to follow instructions within 1 hour and 1.2M parameters
- text-generation-webui - A Gradio web UI for large language models such as LLaMA, llama.cpp, GPT-J, OPT, and GALACTICA
- text-generation-inference
- FlexFlow Serve: Low-Latency, High-Performance LLM Serving - An open-source compiler and distributed system for low-latency, high-performance LLM serving.
Optimization (Latency/Size)
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
- AutoGPTQ - An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
- Optimizing LLM latency
- A series of quantized Llama 2 models from TheBloke with GPTQ/GGML
- TheBloke/llama-2-7B-Guanaco-QLoRA-GPTQ
- OpenAssistant-Llama2-13B-Orca-8K-3319-GPTQ
- Together AI's Medusa to accelerate decoding
- NVIDIA TensorRT-LLM Supercharges Large Language Model Inference on NVIDIA H100 GPUs - TensorRT-LLM is an open-source library that accelerates and optimizes inference performance of the latest LLMs on NVIDIA Tensor Core GPUs.
- 2023-11-30: The PyTorch team accelerates LLM inference with native PyTorch tools
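The quantization entries above (GPTQ, AutoGPTQ, TheBloke's GPTQ/GGML models) all build on the same core idea: store weights in low-bit integers plus a per-group scale. The sketch below shows the round-to-nearest (RTN) baseline that GPTQ improves on with second-order error compensation; it is a hypothetical illustration, not the actual GPTQ/AutoGPTQ implementation.

```python
# Minimal round-to-nearest 4-bit weight quantization sketch (illustrative only).
import numpy as np

def quantize_rtn_int4(w: np.ndarray, group_size: int = 64):
    """Quantize weights to unsigned 4-bit ints with one scale/zero per group."""
    w = w.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0           # 4 bits -> 16 levels (0..15)
    scale = np.where(scale == 0, 1.0, scale) # guard constant groups
    q = np.clip(np.round((w - w_min) / scale), 0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize(q, scale, zero):
    return q.astype(np.float32) * scale + zero

rng = np.random.default_rng(0)
w = rng.normal(size=(4096,)).astype(np.float32)
q, scale, zero = quantize_rtn_int4(w)
w_hat = dequantize(q, scale, zero).reshape(-1)
err = np.abs(w - w_hat).max()   # bounded by scale/2 per group
print(f"max abs error: {err:.4f}")
```

The per-group scale is why 4-bit formats quote a "group size" (often 128 in GPTQ configs): smaller groups track outliers better at the cost of more stored scales.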
Optimization (Reasoning)
Other Resources
- LLaMA-efficient-tuning - An easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA) (LLaMA-2, BLOOM, Falcon, Baichuan)
- awesome-llm and aigc
- LLaMA 2 - Every Resource You Need
- Finetune Falcon-7B on Your GPU with TRL and QLoRA - A blog about tuning Falcon-7B on your consumer GPU
- A Definitive Guide to QLoRA: Fine-tuning Falcon-7b with PEFT
- Amazon SageMaker generative AI: Fine-tune Falcon-40B with QLoRA
- Llama with FlashAttention2 - >40.3GiB
- Anti-hype LLM reading list
- Llama 2 の情報まとめ (a Japanese summary of Llama 2 information)
Some theory
- LLMSurvey
- Stanford CS324 - Large Language Models
- Why we should train smaller LLMs on more tokens
- Open challenges in LLM research
- Challenges and Applications of Large Language Models
- Why You (Probably) Don't Need to Fine-tune an LLM - Use few-shot prompting / Retrieval-Augmented Generation (RAG) instead
- CS221: Artificial Intelligence: Principles and Techniques
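The "you probably don't need to fine-tune" argument above rests on Retrieval-Augmented Generation: fetch relevant context at query time and prepend it to the prompt instead of baking knowledge into the weights. A toy sketch of the pattern, using bag-of-words cosine similarity in place of a real embedding model (all documents and the query are made up for illustration):

```python
# Toy RAG pipeline: retrieve the best-matching document, build a prompt.
# Illustrative only -- production systems use dense embeddings and a vector store.
from collections import Counter
import math

docs = [
    "Llama 2 was released by Meta in July 2023.",
    "GPTQ quantizes model weights to 4 bits after training.",
    "QLoRA fine-tunes models with low-rank adapters over 4-bit weights.",
]

def vectorize(text):
    # Bag-of-words term counts as a stand-in for an embedding
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    q = vectorize(query)
    return sorted(docs, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]

query = "When was Llama 2 released?"
context = retrieve(query)[0]
prompt = f"Context: {context}\nQuestion: {query}\nAnswer:"
print(prompt)
```

The final `prompt` is what gets sent to the (unmodified) LLM, which is why RAG needs no training run at all.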
Finetune Methods / Scripts
- Finetune with PEFT
- Finetune Together AI's 32k context window model
- Llama-2-7B-32K-Instruct — and fine-tuning for Llama-2 models with Together API
- Finetune a 13B model with QLoRA
- HuggingFace SFT training script
- PyTorch Lightning's script to finetune Llama 2 on a custom dataset
- Instruction-tune Llama 2
- Finetune LLaMA2 7-70B on Amazon SageMaker
- Finetune Llama 2 with QLoRA on Colab
- Fine-tune Llama 2 with DPO, by Hugging Face
- Fine-tune Llama 2 for specific usage like SQL generation / functional representation (https://github.com/ray-project/ray/tree/master/doc/source/templates/04_finetuning_llms_with_deepspeed)
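Most of the scripts above rely on PEFT/LoRA: the pretrained weight matrix is frozen and only a low-rank update is trained. The sketch below shows the bare arithmetic of that idea in NumPy; the dimensions, rank, and `alpha` are made-up illustrative values, and this is not the `huggingface/peft` implementation.

```python
# Minimal LoRA sketch: y = Wx + (alpha/r) * B(Ax), with W frozen.
import numpy as np

d_out, d_in, r = 512, 512, 8           # r is the LoRA rank (hypothetical values)
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection, small init
B = np.zeros((d_out, r))               # trainable up-projection, zero init
alpha = 16.0                           # scaling factor

def forward(x):
    # Base path plus scaled low-rank adapter path
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# B starts at zero, so the adapted model is exactly the base model at init
assert np.allclose(forward(x), W @ x)

full = W.size
lora = A.size + B.size
print(f"trainable params: {lora} vs {full} ({100 * lora / full:.1f}%)")
```

Zero-initializing `B` is the standard trick that makes training start from the unmodified base model; the tiny trainable fraction is what lets QLoRA fit fine-tuning of large models on a single consumer GPU.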
Prompt
Use
- Run Llama 2 on your own Mac using LLM and Homebrew
- Deploy Llama 2 7B/13B/70B models on AWS SageMaker with [text-generation-inference](https://github.com/huggingface/text-generation-inference). Hugging Face's text generation inference is a Rust, Python and gRPC server for text generation inference, used in production at Hugging Face to power Hugging Chat, the Inference API and Inference Endpoints.
Move on to production
- Patterns for Building LLM-based Systems & Products
- Finetuning an LLM: RLHF and alternatives
- GitHub: A developer's guide to prompt engineering and LLMs
- The Rise and Potential of Large Language Model Based Agents: A Survey (with the accompanying LLM-Agent-Paper-List)
Evaluation
Calculation
Some basics