# awesome-7b
Tracking the best open-source LLMs that can *actually* run on consumer-level hardware.
# Best Foundational Models
## Tiny Models (recommended for mobile)
(model sizes <=6B)

| Name | Size | Context Length | Weights | Ollama |
|---|---|---|---|---|
| Gemma 2 | 2B | 8k | [🤗 HF](https://huggingface.co/google/gemma-2-2b-it) | [Model](https://ollama.com/library/gemma2:2b) |
| Phi-3.5 Mini | 3.8B | 128k | [🤗 HF](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) | [Model](https://ollama.com/library/phi3.5) |

## Small Models (recommended for desktop)
(model sizes 7B-10B)

| Name | Size | Context Length | Weights | Ollama |
|---|---|---|---|---|
| Mistral v0.3 | 7.3B | 32k | [🤗 HF](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) | [Model](https://ollama.com/library/mistral:7b) |
| Llama 3.1 | 8B | 128k | [🤗 HF](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) | [Model](https://ollama.com/library/llama3.1:8b) |
| Gemma 2 | 9B | 8k | [🤗 HF](https://huggingface.co/google/gemma-2-9b-it) | [Model](https://ollama.com/library/gemma2:9b) |

## Medium Models
(model sizes 11B-16B)

| Name | Size | Context Length | Weights | Ollama |
|---|---|---|---|---|
| Mistral NeMo | 12B | 128k | [🤗 HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) | [Model](https://ollama.com/library/mistral-nemo) |

## Large Models
(model sizes 17B-33B)

| Name | Size | Context Length | Weights | Ollama |
|---|---|---|---|---|
| Gemma 2 | 27B | 8k | [🤗 HF](https://huggingface.co/google/gemma-2-27b-it) | [Model](https://ollama.com/library/gemma2:27b) |

## Huge Models (not for GPU-poor)
(model sizes >33B)
| Name | Size | Context Length | Weights | Ollama |
|---|---|---|---|---|
| Llama 3.1 | 70B | 128k | [🤗 HF](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B) | [Model](https://ollama.com/library/llama3.1:70b) |
# Best Finetunes (accepting PRs)
| Name | Size | Context Length | Weights | Ollama |
|---|---|---|---|---|

## FAQ
### What makes a model "good"?
> I pretty much only trust two LLM evals right now: Chatbot Arena and r/LocalLlama comments section
>
> — Andrej Karpathy (@karpathy), December 20, 2023

In short: [r/LocalLLaMA](https://www.reddit.com/r/LocalLLaMA) discussions and [Chatbot Arena](https://chat.lmsys.org/).
Don't trust benchmarks. Download the model and decide for yourself. Models are free so there's no downside lol.
### What's Ollama and why should I use it?
[Ollama](https://ollama.com/) just makes using LLMs that much more accessible for beginners. It's built on top of [llama.cpp](https://github.com/ggerganov/llama.cpp) and is [FOSS](https://github.com/ollama/ollama).
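For instance, once Ollama is installed and its local server is running, any model from the tables above is a couple of lines away. A minimal sketch using the official [`ollama` Python package](https://github.com/ollama/ollama-python) (a separate `pip install ollama`; the model tag is one from the tables above):

```python
import ollama  # official Python client for a locally running Ollama server

# Download a small model from the list above (cached after the first pull).
ollama.pull("llama3.1:8b")

# Send it a chat message and print the reply.
response = ollama.chat(
    model="llama3.1:8b",
    messages=[{"role": "user", "content": "Explain quantization in one sentence."}],
)
print(response["message"]["content"])
```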
### Which model should I use?

This one depends on *you*. How powerful is your hardware? How complicated is your use case? A **small** model is a good starting point. Then, if you need more speed, look at smaller models; if you need more power, turn to larger ones.
### What does "size" mean?

Size is the model's parameter count. It determines how capable the model is, how large the download will be, and how much memory you will need. According to [Ollama](https://github.com/ollama/ollama#:~:text=You%20should%20have%20at%20least%208%20GB%20of%20RAM%20available%20to%20run%20the%207B%20models%2C%2016%20GB%20to%20run%20the%2013B%20models%2C%20and%2032%20GB%20to%20run%20the%2033B%20models.): you should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
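As back-of-envelope arithmetic (a rule of thumb, not from Ollama's docs): at a 4-bit quant, weights cost roughly half a byte per parameter, and the runtime needs extra headroom on top of that, which is why ~8 GB is the floor for 7B models:

```python
# Rough weight-file size for a 7B model at a 4-bit quant (~0.5 bytes/param).
# Actual RAM use is higher: KV cache, activations, and runtime overhead.
params = 7e9
bytes_per_param = 0.5  # ~4 bits per weight
print(f"~{params * bytes_per_param / 2**30:.1f} GiB of weights")  # ~3.3 GiB
```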
### What does "context length" mean?

It's how long your prompts to the LLM can be, measured in tokens. Higher is better (but long contexts may be slower and lower quality).
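Ollama exposes this as the `num_ctx` option, settable per request; a small sketch with the same Python client (raising it only helps up to the model's own limit, and costs more RAM):

```python
import ollama

# Ask for a larger context window than Ollama's default via model options.
response = ollama.chat(
    model="llama3.1:8b",
    messages=[{"role": "user", "content": "Summarize this long document: ..."}],
    options={"num_ctx": 32768},  # context length in tokens; more tokens = more RAM
)
print(response["message"]["content"])
```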
### Which quant should I use?

If you don't know what `quant` means, stick to the defaults. If you do, use the largest your computer can handle. For most people, that's `q5_K_M`. Ollama's default is `q4_0`.
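On Ollama, quants are just model tags; a sketch pulling a specific quant with the same Python client (the exact tag is assumed from the Ollama library listing and varies per model):

```python
import ollama

# Tags select the quantization: this grabs a q5_K_M build instead of the
# default q4_0. (Tag name assumed; check the model's page on ollama.com.)
tag = "llama3.1:8b-instruct-q5_K_M"
ollama.pull(tag)
response = ollama.chat(
    model=tag,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response["message"]["content"])
```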