https://github.com/edgenai/edgen

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
https://github.com/edgenai/edgen

chatgpt edge genai llm localai ml openai rust tauri vertex-ai watson

Last synced: 7 months ago
JSON representation

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.

Host: GitHub
URL: https://github.com/edgenai/edgen
Owner: edgenai
License: apache-2.0
Created: 2024-01-30T17:40:42.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-05-23T14:21:38.000Z (over 1 year ago)
Last Synced: 2025-03-29T17:08:47.349Z (8 months ago)
Topics: chatgpt, edge, genai, llm, localai, ml, openai, rust, tauri, vertex-ai, watson
Language: Rust
Homepage: https://docs.edgen.co/
Size: 2.37 MB
Stars: 357
Watchers: 7
Forks: 16
Open Issues: 25
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-rust-list - edgenai/edgen - to-text (whisper) and many others. [docs.edgen.co/](https://docs.edgen.co/) (Machine Learning)

README

GitHub

A Local GenAI API Server: A drop-in replacement for OpenAI's API for Local GenAI

EdgenChat, a local chat app powered by ⚡Edgen

- [x] **OpenAI Compliant API**: ⚡Edgen implements an [OpenAI compatible API](https://docs.edgen.co/api-reference), making it a drop-in replacement.
- [x] **Multi-Endpoint Support**: ⚡Edgen exposes multiple AI endpoints such as chat completions (LLMs) and speech-to-text (Whisper) for audio transcriptions.
- [x] **Model Agnostic**: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and [many others](https://docs.edgen.co/documentation/models).
- [x] **Optimized Inference**: You don't need to take a PhD in AI optimization. ⚡Edgen abstracts the complexity of optimizing inference for different hardware, platforms and models.
- [x] **Modular**: ⚡Edgen is **model** and **runtime** agnostic. New models can be added easily and ⚡Edgen can select the best runtime for the user's hardware: you don't need to keep up about the latest models and ML runtimes - **⚡Edgen will do that for you**.
- [x] **Model Caching**: ⚡Edgen caches foundational models locally, so 1 model can power hundreds of different apps - users don't need to download the same model multiple times.
- [x] **Native**: ⚡Edgen is built in 🦀Rust and is natively compiled to all popular platforms: **Windows, MacOS and Linux**. No docker required.
- [ ] **Graphical Interface**: A graphical user interface to help users efficiently manage their models, endpoints and permissions.

⚡Edgen lets you use GenAI in your app, completely **locally** on your user's devices, for **free** and with **data-privacy**. It's a drop-in replacement for OpenAI (it uses the a compatible API), supports various functions like text generation, speech-to-text and works on Windows, Linux, and MacOS.

### Features

- [x] Session Caching: ⚡Edgen maintains top performance with big contexts (big chat histories), by caching sessions. Sessions are auto-detected in function of the chat history.
- [x] [GPU support](https://github.com/edgenai/edgen#gpu-support): CUDA, Vulkan. Metal

### Endpoints

- [x] \[Chat\] [Completions](https://docs.edgen.co/api-reference/chat)
- [x] \[Audio\] [Transcriptions](https://docs.edgen.co/api-reference/audio)
- [x] \[Embeddings\] [Embeddings](https://platform.openai.com/docs/api-reference/embeddings)
- [ ] \[Image\] Generation
- [ ] \[Chat\] Multimodal chat completions
- [ ] \[Audio\] Speech

### Supported Models

Check in the [documentation](https://docs.edgen.co/documentation/models)

### Supported platforms

- [x] Windows
- [x] Linux
- [x] MacOS

## 🔥 Hot Topics

## Why local GenAI?

- **Data Private**: On-device inference means **users' data** never leave their devices.

- **Scalable**: More and more users? No need to increment cloud computing infrastructure. Just let your users use their own hardware.

- **Reliable**: No internet, no downtime, no rate limits, no API keys.

- **Free**: It runs locally on hardware the user already owns.

## Quickstart

1. [Download](https://edgen.co/download) and start ⚡Edgen
2. Chat with ⚡[EdgenChat](https://chat.edgen.co)

Ready to start your own GenAI application? [Checkout our guides](https://docs.edgen.co/guides)!

⚡Edgen usage:

```
Usage: edgen [] []

Toplevel CLI commands and options. Subcommands are optional. If no command is provided "serve" will be invoked with default options.

Options:
--help display usage information

Commands:
serve Starts the edgen server. This is the default command when no
command is provided.
config Configuration-related subcommands.
version Prints the edgen version to stdout.
oasgen Generates the Edgen OpenAPI specification.
```

`edgen serve` usage:

```
Usage: edgen serve [-b ] [-g]

Starts the edgen server. This is the default command when no command is provided.

Options:
-b, --uri if present, one or more URIs/hosts to bind the server to.
`unix://` (on Linux), `http://`, and `ws://` are supported.
For use in scripts, it is recommended to explicitly add this
option to make your scripts future-proof.
-g, --nogui if present, edgen will not start the GUI; the default
behavior is to start the GUI.
--help display usage information
```

## GPU Support

⚡Edgen also supports compilation and execution on a GPU, when building from source, through Vulkan, CUDA and Metal.
The following cargo features enable the GPU:

- `llama_vulkan` - execute LLM models using Vulkan. Requires a Vulkan SDK to be installed.
- `llama_cuda` - execute LLM models using CUDA. Requires a CUDA Toolkit to be installed.
- `llama_metal` - execute LLM models using Metal.
- `whisper_cuda` - execute Whisper models using CUDA. Requires a CUDA Toolkit to be installed.

Note that, at the moment, `llama_vulkan`, `llama_cuda` and `llama_metal` cannot be enabled at the same time.

Example usage (building from source, [you need to first install the prerequisites](https://docs.edgen.co/documentation/getting-started)):

```
cargo run --features llama_vulkan --release -- serve
```

## Architecture Overview

⚡Edgen architecture overview

## Contribute

If you don't know where to start, check [Edgen's roadmap](https://github.com/orgs/edgenai/projects/1/views/1)!
Before you start working on something, see if there's an existing issue/pull-request. Pop into Discord to check with the team or see if someone's already tackling it.

## Communication Channels

- [Edgen Discord server](https://discord.gg/QUXbwqdMRs): Real time discussions with the ⚡Edgen team and other users.
- [GitHub issues](https://github.com/edgenai/edgen/issues): Feature requests, bugs.
- [GitHub discussions](https://github.com/edgenai/edgen/discussions/): Q&A.
- [Blog](https://blog.edgen.co): Big announcements.

## Special Thanks

- [`llama.cpp`](https://github.com/ggerganov/llama.cpp/tree/master),
[`whisper.cpp`](https://github.com/ggerganov/whisper.cpp), and [`ggml`](https://github.com/ggerganov/ggml) for being
an excellent getting-on point for this space.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/edgenai/edgen

Awesome Lists containing this project

README

A Local GenAI API Server: A drop-in replacement for OpenAI's API for Local GenAI