Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/edgenai/edgen
⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
https://github.com/edgenai/edgen
chatgpt edge genai llm localai ml openai rust tauri vertex-ai watson
Last synced: about 1 month ago
JSON representation
⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
- Host: GitHub
- URL: https://github.com/edgenai/edgen
- Owner: edgenai
- License: apache-2.0
- Created: 2024-01-30T17:40:42.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-04-19T19:50:06.000Z (7 months ago)
- Last Synced: 2024-04-23T00:13:24.938Z (7 months ago)
- Topics: chatgpt, edge, genai, llm, localai, ml, openai, rust, tauri, vertex-ai, watson
- Language: Rust
- Homepage: https://docs.edgen.co/
- Size: 2.22 MB
- Stars: 273
- Watchers: 6
- Forks: 13
- Open Issues: 25
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-rust-list - edgenai/edgen - to-text (whisper) and many others. [docs.edgen.co/](https://docs.edgen.co/) (Machine Learning)
- awesome-rust-list - edgenai/edgen - to-text (whisper) and many others. [docs.edgen.co/](https://docs.edgen.co/) (Machine Learning)
README
A Local GenAI API Server: A drop-in replacement for OpenAI's API for Local GenAI
|
Documentation |
Blog |
Discord |
Roadmap |- [x] **OpenAI Compliant API**: ⚡Edgen implements an [OpenAI compatible API](https://docs.edgen.co/api-reference), making it a drop-in replacement.
- [x] **Multi-Endpoint Support**: ⚡Edgen exposes multiple AI endpoints such as chat completions (LLMs) and speech-to-text (Whisper) for audio transcriptions.
- [x] **Model Agnostic**: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and [many others](https://docs.edgen.co/documentation/models).
- [x] **Optimized Inference**: You don't need to take a PhD in AI optimization. ⚡Edgen abstracts the complexity of optimizing inference for different hardware, platforms and models.
- [x] **Modular**: ⚡Edgen is **model** and **runtime** agnostic. New models can be added easily and ⚡Edgen can select the best runtime for the user's hardware: you don't need to keep up about the latest models and ML runtimes - **⚡Edgen will do that for you**.
- [x] **Model Caching**: ⚡Edgen caches foundational models locally, so 1 model can power hundreds of different apps - users don't need to download the same model multiple times.
- [x] **Native**: ⚡Edgen is built in 🦀Rust and is natively compiled to all popular platforms: **Windows, MacOS and Linux**. No docker required.
- [ ] **Graphical Interface**: A graphical user interface to help users efficiently manage their models, endpoints and permissions.⚡Edgen lets you use GenAI in your app, completely **locally** on your user's devices, for **free** and with **data-privacy**. It's a drop-in replacement for OpenAI (it uses the a compatible API), supports various functions like text generation, speech-to-text and works on Windows, Linux, and MacOS.
### Features
- [x] Session Caching: ⚡Edgen maintains top performance with big contexts (big chat histories), by caching sessions. Sessions are auto-detected in function of the chat history.
- [x] [GPU support](https://github.com/edgenai/edgen#gpu-support): CUDA, Vulkan. Metal### Endpoints
- [x] \[Chat\] [Completions](https://docs.edgen.co/api-reference/chat)
- [x] \[Audio\] [Transcriptions](https://docs.edgen.co/api-reference/audio)
- [x] \[Embeddings\] [Embeddings](https://platform.openai.com/docs/api-reference/embeddings)
- [ ] \[Image\] Generation
- [ ] \[Chat\] Multimodal chat completions
- [ ] \[Audio\] Speech### Supported Models
Check in the [documentation](https://docs.edgen.co/documentation/models)
### Supported platforms
- [x] Windows
- [x] Linux
- [x] MacOS## 🔥 Hot Topics
## Why local GenAI?
- **Data Private**: On-device inference means **users' data** never leave their devices.
- **Scalable**: More and more users? No need to increment cloud computing infrastructure. Just let your users use their own hardware.
- **Reliable**: No internet, no downtime, no rate limits, no API keys.
- **Free**: It runs locally on hardware the user already owns.
## Quickstart
1. [Download](https://edgen.co/download) and start ⚡Edgen
2. Chat with ⚡[EdgenChat](https://chat.edgen.co)Ready to start your own GenAI application? [Checkout our guides](https://docs.edgen.co/guides)!
⚡Edgen usage:
```
Usage: edgen [] []Toplevel CLI commands and options. Subcommands are optional. If no command is provided "serve" will be invoked with default options.
Options:
--help display usage informationCommands:
serve Starts the edgen server. This is the default command when no
command is provided.
config Configuration-related subcommands.
version Prints the edgen version to stdout.
oasgen Generates the Edgen OpenAPI specification.
````edgen serve` usage:
```
Usage: edgen serve [-b ] [-g]Starts the edgen server. This is the default command when no command is provided.
Options:
-b, --uri if present, one or more URIs/hosts to bind the server to.
`unix://` (on Linux), `http://`, and `ws://` are supported.
For use in scripts, it is recommended to explicitly add this
option to make your scripts future-proof.
-g, --nogui if present, edgen will not start the GUI; the default
behavior is to start the GUI.
--help display usage information
```## GPU Support
⚡Edgen also supports compilation and execution on a GPU, when building from source, through Vulkan, CUDA and Metal.
The following cargo features enable the GPU:- `llama_vulkan` - execute LLM models using Vulkan. Requires a Vulkan SDK to be installed.
- `llama_cuda` - execute LLM models using CUDA. Requires a CUDA Toolkit to be installed.
- `llama_metal` - execute LLM models using Metal.
- `whisper_cuda` - execute Whisper models using CUDA. Requires a CUDA Toolkit to be installed.Note that, at the moment, `llama_vulkan`, `llama_cuda` and `llama_metal` cannot be enabled at the same time.
Example usage (building from source, [you need to first install the prerequisites](https://docs.edgen.co/documentation/getting-started)):
```
cargo run --features llama_vulkan --release -- serve
```## Architecture Overview
⚡Edgen architecture overview
## Contribute
If you don't know where to start, check [Edgen's roadmap](https://github.com/orgs/edgenai/projects/1/views/1)!
Before you start working on something, see if there's an existing issue/pull-request. Pop into Discord to check with the team or see if someone's already tackling it.## Communication Channels
- [Edgen Discord server](https://discord.gg/QUXbwqdMRs): Real time discussions with the ⚡Edgen team and other users.
- [GitHub issues](https://github.com/edgenai/edgen/issues): Feature requests, bugs.
- [GitHub discussions](https://github.com/edgenai/edgen/discussions/): Q&A.
- [Blog](https://blog.edgen.co): Big announcements.## Special Thanks
- [`llama.cpp`](https://github.com/ggerganov/llama.cpp/tree/master),
[`whisper.cpp`](https://github.com/ggerganov/whisper.cpp), and [`ggml`](https://github.com/ggerganov/ggml) for being
an excellent getting-on point for this space.