https://github.com/kyegomez/exa
Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and minimal learning curve.
- Host: GitHub
- URL: https://github.com/kyegomez/exa
- Owner: kyegomez
- License: MIT
- Created: 2023-09-05T13:16:25.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-11-11T09:03:27.000Z (7 months ago)
- Last Synced: 2025-05-11T01:01:51.603Z (21 days ago)
- Topics: inference-engine, llama2, llama2-7b, llamacpp, llamas, llm-inference, llms, opensource
- Language: Python
- Homepage: https://exa.apac.ai
- Size: 2.44 MB
- Stars: 26
- Watchers: 2
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
[Join the Discord community](https://discord.gg/qUtxnK2NMf)
# Exa
Boost your LLM performance by up to 300% on everyday GPU hardware, as validated by experienced developers, in just 5 minutes of setup and with no additional hardware costs.

---
## Principles
- Radical Simplicity (use super-powerful LLMs in as few lines of code as possible)
- Ultra-Optimized Performance (high-performance code that extracts all the power from these LLMs)
- Fluidity & Shapelessness (plug in, play, and re-architect as you please)

---
## 📦 Install 📦
```bash
$ pip3 install exxa
```
---

## Usage
## 🎉 Features 🎉
- **World-Class Quantization**: Get the most out of your models with top-tier performance and preserved accuracy! 🏋️‍♂️
- **Automated PEFT**: Simplify your workflow! Let our toolkit handle the optimizations. 🛠️
- **LoRA Configuration**: Dive into the potential of flexible LoRA configurations, a game-changer for performance! 🌌
- **Seamless Integration**: Designed to work seamlessly with popular models like LLaMA, Falcon, and more! 🤖
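The quantization feature above rests on a simple idea: map floating-point weights onto a small integer grid via a scale and zero-point, then dequantize with only a bounded rounding error. The toy sketch below illustrates 8-bit affine quantization in plain Python; it is an illustration of the technique, not Exa's actual implementation, and the function names are ours.

```python
# Toy 8-bit affine quantization (illustrative, not Exa's implementation).

def quantize(values, num_bits=8):
    """Map floats onto the integer grid [0, 2**num_bits - 1]."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax or 1.0          # fall back to 1.0 if all values equal
    zero_point = round(-lo / scale)          # integer that represents 0.0
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.2, -0.3, 0.0, 0.7, 2.5]
q, scale, zp = quantize(weights)
restored = dequantize(q, scale, zp)
# Per-element error is bounded by half the quantization step (scale / 2).
max_err = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
```

Real quantizers work per-tensor or per-channel over large weight matrices, but the scale/zero-point arithmetic is the same.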
---
## 💌 Feedback & Contributions 💌
We're excited about the journey ahead and would love to have you with us! For feedback, suggestions, or contributions, feel free to open an issue or a pull request. Let's shape the future of fine-tuning together! 🌱
[Check out our project board for our current backlog and features we're implementing](https://github.com/users/kyegomez/projects/8/views/2)
# License
MIT

# Todo
- Set up utils logger classes for metric logging, with useful metadata such as tokens inferred per second, latency, and memory consumption
- Add CUDA C++ extensions for radically optimized classes enabling high-performance quantization and inference on the edge
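The metrics-logger todo above can be sketched with the standard library alone: wrap an inference call, time it with `time.perf_counter`, and capture peak memory with `tracemalloc`. The class and method names below are hypothetical, not an existing Exa API.

```python
import time
import tracemalloc

class InferenceMetricsLogger:
    """Minimal sketch of the logger described in the todo: records
    latency, tokens per second, and peak memory for each call.
    Names are illustrative, not part of Exa today."""

    def __init__(self):
        self.records = []

    def measure(self, generate, prompt):
        """Run `generate(prompt)` (which should return a list of tokens)
        and append a metrics record for the call."""
        tracemalloc.start()
        start = time.perf_counter()
        tokens = generate(prompt)
        latency = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        self.records.append({
            "latency_s": latency,
            "tokens_per_s": len(tokens) / latency if latency else float("inf"),
            "peak_mem_bytes": peak,
        })
        return tokens

# Usage with a stand-in generator (a real one would call the model):
logger = InferenceMetricsLogger()
out = logger.measure(lambda p: p.split(), "hello world from exa")
```

A production version would also log GPU memory (e.g. via the CUDA runtime), but the wrapping pattern stays the same.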