https://github.com/xbrasil/myaiplayground

An open-source local application for chatting with Gemma 4 AI models running on your machine in a modern web UI.
Host: GitHub
URL: https://github.com/xbrasil/myaiplayground
Owner: xBrasil
License: apache-2.0
Created: 2026-04-08T20:56:06.000Z (3 months ago)
Default Branch: master
Last Pushed: 2026-06-13T16:47:52.000Z (16 days ago)
Last Synced: 2026-06-13T17:06:11.236Z (16 days ago)
Topics: ai, fastapi, gemma, gemma4, llama-cpp, llamacpp, llm, local-ai, multimodal, multimodal-ai, open-source, privacy, python, react, typescript, whisper
Language: Python
Homepage:
Size: 924 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README

          


  



My AI Playground




  A desktop-local application for chatting with Gemma 4 models running entirely on your machine.


  Modern web interface, multimodal input (text, images, audio, files), and history saved only locally.





![Stack](https://img.shields.io/badge/React_19-282c34?logo=react) ![Stack](https://img.shields.io/badge/FastAPI-009688?logo=fastapi&logoColor=white) ![Stack](https://img.shields.io/badge/llama.cpp-GGUF-blue) ![Stack](https://img.shields.io/badge/SQLite-003B57?logo=sqlite&logoColor=white)



---

## Features

| Category | Description |
| --- | --- |
| **Multimodal chat** | Text, images, audio and files in a single conversation; multiple files can be sent at once; manually uploaded audio files are added as context attachments, with automatic audio-track extraction from containers such as MP4; images in PNG, JPEG, WebP, GIF, SVG, HEIC/HEIF, AVIF, BMP, ICO and TIFF |
| **Gemma 4 models** | E2B, E4B, 12B Unified and 26B-A4B via GGUF; switch models at any time from the interface |
| **Text files** | 60+ code and data extensions (`.py`, `.ts`, `.json`, `.csv`, `.xml`, `.yaml`, `.sql`, `.rs`, `.go`…) read as text |
| **Documents** | PDF, Word (`.docx`), Excel (`.xlsx`) and PowerPoint (`.pptx`), with automatic text extraction |
| **Web search** | DuckDuckGo search and page reading; the model cites sources with numbered references `[1]`, `[2]`… |
| **Local file access** | The model can list and read files from folders the user has allowed (read-only) |
| **Local image vision** | On vision-capable models (all available Gemma 4 models), the model can see and describe images from allowed folders |
| **Tool calling** | The model can call tools (web, filesystem, vision) automatically; calls are saved in the history and displayed in an auditable form |
| **Custom instructions** | User-customizable system prompt in Settings, applied to all conversations |
| **Local inference** | llama.cpp server with GPU acceleration (NVIDIA CUDA, AMD HIP/ROCm, Apple Metal) or CPU, flash attention, per-model context (128K–256K tokens) |
| **Streaming** | Responses displayed token by token in real time |
| **Auto-continuation** | Long responses continue automatically when the token limit is reached (up to 5 rounds) |
| **Voice recording** | Record, pause, resume and stop before sending; audio converted to 16 kHz WAV; countdown timer |
| **Transcription** | Whisper (faster-whisper) converts audio to text locally when the active model has no native audio or the engine does not support that input |
| **Read-aloud** | Text-to-Speech via the Web Speech API, with a preference for Microsoft voices |
| **Rich Markdown** | GFM rendering, syntax-highlighted code blocks, KaTeX math |
| **Message editing** | Edit sent messages and regenerate responses |
| **Message search** | Search conversations by title or content, with automatic highlighting of matched terms |
| **Location** | Optional geolocation sharing for more contextual answers (off by default) |
| **Risk evaluation** | Custom instructions are evaluated automatically by the LLM; an alert is shown only when the risk is significant |
| **i18n** | Portuguese (BR), English (US), Spanish and French; automatically detects the browser language |
| **Dark theme** | Minimalist, responsive UI with a dark-mode design |
| **Sliding window** | Automatic context management: truncation of long content, eviction of old messages and retry on overflow |
| **Full privacy** | Conversations and files stay in `data/user/` on your disk. Nothing is sent to the cloud. |
| **Tray icon** | Instead of a console window, the application runs as a system tray icon with a menu to open, restart, view logs and quit |
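The auto-continuation row above can be pictured as a bounded loop. The following is a minimal sketch under stated assumptions, not the project's implementation: `generate` is a hypothetical callable standing in for a request to the inference server, and the `"length"` finish reason is an assumed convention meaning "stopped at the token limit". The sketch also reads "up to 5 rounds" as at most five calls in total.

```python
from typing import Callable, Tuple

MAX_ROUNDS = 5  # the README states "up to 5 rounds"


def generate_with_continuation(
    generate: Callable[[str], Tuple[str, str]],
    prompt: str,
    max_rounds: int = MAX_ROUNDS,
) -> str:
    """Call `generate` repeatedly while it stops because of the token limit.

    `generate` is hypothetical: it receives the text so far and returns
    (new_text, finish_reason). A finish_reason of "length" means the token
    limit was hit; any other value means the model finished on its own.
    """
    answer = ""
    for _ in range(max_rounds):
        chunk, finish_reason = generate(prompt + answer)
        answer += chunk
        if finish_reason != "length":
            break  # the model finished; no further round is needed
    return answer
```

Capping the loop keeps a model that never emits a natural stop from generating indefinitely.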

---

## Interface

Main interface with the dark theme: a conversation with the Gemma 4 E4B model

Sending an image with visual analysis by the model

Real-time response: tokens appear as they are generated

Model selector with descriptions of capabilities and limitations

Settings panel: language, voice, custom instructions, web access, local files

Web search with numbered source citations



---

## System Requirements

My AI Playground runs AI models locally on your hardware. Requirements vary with the model you choose.

### Minimum (Gemma 4 E2B model, 2B parameters)

| Component      | Requirement                                                 |
| -------------- | ----------------------------------------------------------- |
| **OS**         | Windows 10/11 (64-bit)                                      |
| **RAM**        | 8 GB                                                        |
| **VRAM (GPU)** | 4 GB (NVIDIA CUDA, AMD HIP/ROCm or Apple Metal) or CPU mode |
| **Disk**       | ~10 GB for the model + 2 GB for dependencies                |
| **CPU**        | Any x86-64 with AVX2 support (or Apple Silicon arm64)       |

### Intermediate (default Gemma 4 E4B model, 4B parameters)

| Component      | Requirement                                     |
| -------------- | ----------------------------------------------- |
| **RAM**        | 16 GB                                           |
| **VRAM (GPU)** | 7 GB (NVIDIA CUDA, AMD HIP/ROCm or Apple Metal) |
| **Disk**       | ~10 GB for the model + 2 GB for dependencies    |

### Recommended (Gemma 4 12B Unified model, 12B parameters)

| Component      | Requirement                                      |
| -------------- | ------------------------------------------------ |
| **RAM**        | 24 GB                                            |
| **VRAM (GPU)** | 12 GB (NVIDIA CUDA, AMD HIP/ROCm or Apple Metal) |
| **Disk**       | ~10 GB for the model + 2 GB for dependencies     |

### For the largest model (Gemma 4 26B-A4B, 26B parameters, MoE)

| Component      | Requirement                                       |
| -------------- | ------------------------------------------------- |
| **RAM**        | 32 GB                                             |
| **VRAM (GPU)** | 16 GB+ (NVIDIA CUDA, AMD HIP/ROCm or Apple Metal) |
| **Disk**       | ~26 GB for the model + 2 GB for dependencies      |

> **Note:** without enough VRAM, llama.cpp falls back to system RAM (CPU or partial-offload mode), which makes inference significantly slower. If you get **Out of Memory (OOM)** errors, try a smaller model or reduce `N_CTX` in the `data/system/.env` file.
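Reducing `N_CTX` means editing one line of `data/system/.env`. The helper below is a small sketch of doing that programmatically. It assumes the file uses plain `KEY=VALUE` lines, which this README does not spell out, and the function name `set_env_value` is hypothetical.

```python
def set_env_value(env_text: str, key: str, value: str) -> str:
    """Return `env_text` with `key` set to `value`, appending the key if absent.

    Assumes plain KEY=VALUE lines; commented-out lines are left untouched.
    """
    lines = env_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

To apply it, read `data/system/.env`, pass its text through `set_env_value(text, "N_CTX", "8192")`, and write the result back. The value `8192` is only illustrative; pick whatever fits your memory budget.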

---

## Quick start (Windows)

### Option A: graphical installer

An `.exe` installer for Windows (built with [Inno Setup](https://jrsoftware.org/isinfo.php)) is available on the [releases page](https://github.com/xBrasil/myAIplayground/releases). The setup wizard copies the files, creates Start Menu and Desktop shortcuts, and optionally runs the dependency setup at the end.

### Option B: via scripts

#### Prerequisites

- **Windows 10/11** (64-bit)

- **Python 3.11+**

- **Node.js 20+**

- **GPU** with up-to-date drivers (NVIDIA CUDA, AMD HIP on Windows or ROCm on Linux, or Apple Silicon Metal) is recommended; the app also runs without a GPU in CPU mode

#### Installation

```powershell
install.cmd
```

The installer:

- Detects and installs Python and Node.js automatically via `winget` (prompting for UAC elevation if needed)

- Creates the `.venv` virtual environment and installs the backend dependencies

- Installs the frontend npm dependencies

- Downloads the latest `llama-server` binary (CUDA, HIP, ROCm, Metal or CPU, according to your GPU)

- Creates `data/system/.env` from `backend/.env.example`

#### Running

```powershell
tray.cmd
```

The launcher:

- Starts the backend (FastAPI on port 8000) and the frontend (Vite on port 5173) in the background

- Shows a system tray icon with a menu: **Open in Browser**, **View Logs**, **Restart**, **Quit**

- Waits for the services to be ready and opens the interface in the browser automatically

- Saves logs to `data/system/logs/backend.log` and `data/system/logs/frontend.log`
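The "wait for the services to be ready" step in the list above can be approximated by polling the two ports. This is a generic sketch, not the launcher's actual code; the function name `wait_for_port` and the use of `127.0.0.1` as the host are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A launcher built this way would call `wait_for_port("127.0.0.1", 8000)` and `wait_for_port("127.0.0.1", 5173)` and open the browser only if both return `True`. A listening port does not guarantee the application has finished starting, so a real readiness check might also query an HTTP endpoint.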

> **Diagnostic mode:** to see the full log in the console (useful for debugging), use `run.cmd` instead of `tray.cmd`.

> **Tip:** the first use of each model triggers a download of its GGUF file from Hugging Face. Models are cached in `data/system/model-cache/`.

---

## Quick start (Linux / macOS)

### Prerequisites

- **Python 3.11+** with `venv` (`python3-venv` on Ubuntu/Debian)

- **Node.js 20+**

- **GPU** with up-to-date drivers (NVIDIA CUDA, AMD ROCm or Apple Silicon Metal) is recommended; the app also runs without a GPU in CPU mode

- `curl`, `unzip` and `tar` installed

### Installation

```bash
chmod +x install.sh
./install.sh
```

The installer performs the same steps as the Windows version: it creates `.venv`, installs dependencies, downloads `llama-server` and prepares `.env`.

### Running

```bash
./tray.sh
```

This starts the services in the background and shows a system tray icon with a control menu. The browser opens automatically once the services are ready.

> **Diagnostic mode:** to see the log in the terminal, use `./run.sh` instead of `./tray.sh`. Press `Ctrl+C` to stop.

> **Note:** the script automatically detects macOS (arm64/x64) and Linux to download the correct llama-server binary.
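The platform detection mentioned in the note above is done by the shell script; the Python sketch below only illustrates the idea. The labels it returns (such as `macos-arm64`) are hypothetical and are not llama.cpp release asset names.

```python
import platform
from typing import Optional

# Hypothetical label tables; real release asset names may differ.
SYSTEMS = {"darwin": "macos", "linux": "linux", "windows": "windows"}
ARCHES = {"arm64": "arm64", "aarch64": "arm64", "x86_64": "x64", "amd64": "x64"}


def detect_target(system: Optional[str] = None, machine: Optional[str] = None) -> str:
    """Map an OS name and CPU architecture to a short label like "linux-x64".

    When no arguments are given, the values come from the running interpreter.
    """
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    if system not in SYSTEMS or machine not in ARCHES:
        raise ValueError(f"unsupported platform: {system}/{machine}")
    return f"{SYSTEMS[system]}-{ARCHES[machine]}"
```

Passing `system` and `machine` explicitly makes the mapping easy to test without running on each platform.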

---

## Available models

| Model                   | GGUF file                           | Quantization | Context | Typical use                        |
| ----------------------- | ----------------------------------- | ------------ | ------- | ---------------------------------- |
| **Gemma 4 E2B**         | `gemma-4-E2B-it-Q8_0.gguf`          | Q8_0         | 128K    | Fast, ideal for testing            |
| **Gemma 4 E4B**         | `gemma-4-E4B-it-Q4_K_M.gguf`        | Q4_K_M       | 128K    | Balance between quality and speed  |
| **Gemma 4 12B Unified** | `gemma-4-12b-it-qat-q4_0.gguf`      | QAT Q4_0     | 256K    | High capacity, native audio        |
| **Gemma 4 26B-A4B**     | `gemma-4-26B-A4B-it-UD-IQ4_XS.gguf` | IQ4_XS       | 256K    | Highest quality, needs more VRAM   |

The E2B model is selected by default when no GPU is available, and E4B when one is. E2B, E4B and 12B Unified process audio natively; for the 12B GGUF, llama.cpp enables media input through `mmproj-gemma-4-12b-it-qat-q4_0.gguf`. The 26B-A4B has no native audio and falls back to Whisper transcription. All models run through llama.cpp via GGUF, with no PyTorch at runtime.
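The selection and audio rules in the paragraph above reduce to two small decisions. The sketch below encodes them as stated in this README; the function names and the short model labels are hypothetical and do not come from the project's code.

```python
# Models that this README says process audio natively.
NATIVE_AUDIO_MODELS = {"E2B", "E4B", "12B Unified"}


def default_model(has_gpu: bool) -> str:
    """E4B is the default when a GPU is available, E2B otherwise."""
    return "E4B" if has_gpu else "E2B"


def audio_route(model: str, engine_supports_audio: bool = True) -> str:
    """Return "native" if the model and engine can take audio, else "whisper".

    The Whisper fallback covers both cases named in the README: a model
    without native audio, or an engine that does not support audio input.
    """
    if model in NATIVE_AUDIO_MODELS and engine_supports_audio:
        return "native"
    return "whisper"
```

For example, `audio_route("26B-A4B")` returns `"whisper"`, matching the fallback described for the largest model.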

> **Note:** the `mmproj` for 12B Unified uses the `gemma4uv` projector, which is supported by `llama-server` b9616 or newer. Run `install.cmd`/`install.sh` again if the app was installed with an earlier build.

---

## Technical stack

### Frontend

- **React 19** + **TypeScript** + **Vite**

- `react-markdown` + `remark-gfm` + `remark-math` + `rehype-katex`

- Web Speech API (TTS)

- MediaRecorder API (audio recording)

### Backend

- **FastAPI** + **Uvicorn**

- **SQLAlchemy** (SQLite)

- **faster-whisper** (audio transcription)

- **huggingface_hub** (model downloads)

- **httpx** (communication with llama-server)

- **duckduckgo-search** (web search via DuckDuckGo)

- **beautifulsoup4** (content extraction from web pages)

- **PyMuPDF** / **python-docx** / **openpyxl** / **python-pptx** (text extraction from documents)

- **pillow-heif** (HEIC and AVIF support in Pillow)

- **svglib** + **reportlab** (SVG rendering for visual analysis)

- **pystray** (system tray icon)

### Inference

- **llama.cpp server** (precompiled binary: NVIDIA CUDA, AMD HIP/ROCm, Apple Metal or CPU)

- Automatic GPU detection (NVIDIA, AMD, Apple Silicon) with fallback to CPU

- Managed automatically by the backend: download, startup and fallback

---

## Project structure

```
myAIplayground/
├── frontend/          # React + Vite (web interface)
│   └── src/
│       ├── components/   # Sidebar, ChatLayout, Composer, MessageList...
│       ├── lib/          # API client, preferences, speech, i18n
│       └── locales/      # pt-BR.json, en-US.json, es-ES.json, fr-FR.json
├── backend/           # FastAPI (API + services)
│   └── app/
│       ├── api/routes/   # chat, conversations, health, legal, models, settings
│       ├── core/         # config (pydantic-settings)
│       └── services/     # chat, model, storage, input_adapter, document, web, filesystem, whisper
├── data/              # Local data (ignored by git)
│   ├── user/             # User data (preserved on uninstall)
│   │   ├── app.db           # SQLite with conversations and messages
│   │   └── uploads/         # Files sent in conversations
│   └── system/           # System data (removed on uninstall)
│       ├── model-cache/     # GGUF and mmproj files downloaded from HF
│       ├── llama-server/    # llama-server binary
│       └── logs/            # Application logs
├── docs/              # Additional documentation
├── scripts/           # Utility scripts (install, run, tray, release, i18n, test)
├── install.cmd        # Automated installation (Windows)
├── tray.cmd           # Launcher with tray icon (Windows)
├── run.cmd            # Diagnostic-mode launcher (Windows)
├── install.sh         # Automated installation (Linux / macOS)
├── tray.sh            # Launcher with tray icon (Linux / macOS)
├── run.sh             # Diagnostic-mode launcher (Linux / macOS)
└── README.md
```

---

## Privacy

- Conversations are saved only in `data/user/app.db` (local).

- Uploaded files stay in `data/user/uploads/` (local).

- App settings (language, voice, custom instructions, allowed folders, etc.) are stored in `data/user/settings.json` (local).

- The initial model download comes from Hugging Face. After that, everything runs offline.

- **Web search**: when enabled in Settings, the model can search DuckDuckGo and access web pages. These requests leave your machine. Disable it in Settings for fully offline mode.

- **Local file access**: when enabled in Settings, the model can read files **only** from folders you have explicitly allowed. Access is read-only and protected against directory traversal.

- Text-to-Speech uses the browser's `speechSynthesis` API. Whether it runs locally or online depends on the selected voice and the system configuration.
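The directory-traversal protection mentioned in the list above is commonly implemented by resolving the requested path and checking that it still falls inside an allowed folder. The sketch below shows that general technique; it is an illustration, not the project's filesystem service, and `is_allowed` is a hypothetical name.

```python
from pathlib import Path
from typing import Iterable


def is_allowed(requested: str, allowed_dirs: Iterable[str]) -> bool:
    """Return True only if `requested` resolves to a path inside an allowed folder.

    `Path.resolve()` collapses ".." segments and follows symlinks, so a path
    such as "allowed/../secret.txt" is compared by its real location.
    """
    target = Path(requested).resolve()
    for folder in allowed_dirs:
        root = Path(folder).resolve()
        if target == root or root in target.parents:
            return True
    return False
```

Comparing resolved paths by ancestry, rather than by string prefix, avoids accepting a sibling folder such as `docs-other` when only `docs` is allowed.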

---

## AI models (Gemma 4)

> **Notice:** This project is **not affiliated with, sponsored by, or endorsed by Google or Alphabet Inc.** "Gemma 4" is a trademark of Google. The Gemma 4 models are used under the licensing terms provided by Google.

The AI models used by this application (the **Google Gemma 4** family) are **not distributed** with this repository. They are downloaded directly from [Hugging Face](https://huggingface.co/) at the user's request and are subject to Google's [Gemma 4 Terms of Use - Apache 2.0](https://ai.google.dev/gemma/apache_2).

By downloading and using these models, you agree to comply with Google's terms, which include restrictions on generating harmful, illegal or misleading content.

---

## License

This project is licensed under the [Apache License 2.0](LICENSE).

```
Copyright 2026 Rodolfo Motta Saraiva
```

Created by [Rodolfo Motta Saraiva](https://rmsaraiva.com/) as a personal open-source project.

### Third-party components

| Component | License |
| --- | --- |
| [llama.cpp](https://github.com/ggml-org/llama.cpp) | MIT |
| [FastAPI](https://github.com/tiangolo/fastapi) | MIT |
| [Uvicorn](https://github.com/encode/uvicorn) | BSD-3-Clause |
| [SQLAlchemy](https://github.com/sqlalchemy/sqlalchemy) | MIT |
| [Pydantic](https://github.com/pydantic/pydantic) | MIT |
| [React](https://github.com/facebook/react) | MIT |
| [Vite](https://github.com/vitejs/vite) | MIT |
| [react-markdown](https://github.com/remarkjs/react-markdown) | MIT |
| [KaTeX](https://github.com/KaTeX/KaTeX) (via rehype-katex) | MIT |
| [Hugging Face Hub](https://github.com/huggingface/huggingface_hub) | Apache 2.0 |
| [faster-whisper](https://github.com/SYSTRAN/faster-whisper) | MIT |
| [Pillow](https://github.com/python-pillow/Pillow) | HPND |
| [pillow-heif](https://github.com/bigcat88/pillow_heif) | BSD-3-Clause |
| [PyMuPDF](https://github.com/pymupdf/PyMuPDF) | AGPL-3.0 |
| [python-docx](https://github.com/python-openxml/python-docx) | MIT |
| [openpyxl](https://foss.heptapod.net/openpyxl/openpyxl) | MIT |
| [python-pptx](https://github.com/scanny/python-pptx) | MIT |
| [duckduckgo-search](https://github.com/deedy5/duckduckgo_search) | MIT |
| [beautifulsoup4](https://www.crummy.com/software/BeautifulSoup/) | MIT |
| [svglib](https://github.com/deeplook/svglib) | LGPL-3.0 |
| [reportlab](https://www.reportlab.com/dev/opensource/) | BSD-3-Clause |
| [pystray](https://github.com/moses-palmer/pystray) | LGPL-3.0 |
| [Inno Setup](https://jrsoftware.org/isinfo.php) (Windows installer) | [Inno Setup License](https://jrsoftware.org/files/is/license.txt) |
| [Google Gemma 4](https://ai.google.dev/gemma/docs/core/model_card_4) (AI models, not distributed) | [Gemma 4 Terms of Use - Apache 2.0](https://ai.google.dev/gemma/apache_2) |

---

## English Summary

**My AI Playground** is an open-source, desktop-local application for chatting with [Google Gemma 4](https://ai.google.dev/gemma/docs/core/model_card_4) AI models running entirely on your machine. It features a modern web UI, multimodal input (text, images, audio, files), and conversation history stored only locally.

> **Disclaimer:** This project is **not affiliated with, sponsored by, or endorsed by Google or Alphabet Inc.** "Gemma 4" is a trademark of Google. The Gemma 4 models are used under the licensing terms provided by Google ([Gemma 4 Terms of Use - Apache 2.0](https://ai.google.dev/gemma/apache_2)).

### Features

- **Multimodal chat** — text, images, audio and files in a single conversation; multiple attachments per message. Uploaded audio files are treated as conversation attachments, with automatic audio-track extraction from containers such as MP4. Image formats: PNG, JPEG, WebP, GIF, SVG, HEIC/HEIF, AVIF, BMP, ICO, TIFF.

- **Gemma models** — Gemma 4 E2B, E4B, 12B Unified and 26B-A4B via GGUF; switch models at any time from the UI.

- **Text files** — 60+ code and data extensions (`.py`, `.ts`, `.json`, `.csv`, `.xml`, `.yaml`, `.sql`, `.rs`, `.go`…) read as plain text.

- **Documents** — PDF, Word (`.docx`), Excel (`.xlsx`) and PowerPoint (`.pptx`) with automatic text extraction.

- **Web search** — DuckDuckGo search and page reading; the model cites sources with numbered references (`[1]`, `[2]`…).

- **Local filesystem access** — the model can list and read files from folders you explicitly allow (read-only, directory-traversal protected).

- **Local image vision** — on vision-capable models (all available Gemma 4 models), the model can see and describe images from allowed folders.

- **Tool calling** — the model calls tools (web, filesystem, vision) automatically; calls are persisted and rendered auditably in the history.

- **Custom instructions** — user-defined system prompt applied to all conversations.

- **Local inference** — llama.cpp server with CUDA, flash attention, per-model context (128K–256K tokens).

- **Streaming** — responses rendered token by token in real time.

- **Auto-continuation** — long responses continue automatically when the token limit is reached (up to 5 rounds).

- **Voice recording** — record, pause, resume and stop before sending; audio converted to 16 kHz WAV with a countdown timer.

- **Transcription** — Whisper (faster-whisper) converts audio to text locally when the active model or engine cannot process native audio.

- **Text-to-Speech** — Web Speech API with preference for Microsoft voices.

- **Rich Markdown** — GFM rendering, syntax-highlighted code blocks, KaTeX math.

- **Message editing** — edit sent messages and regenerate responses.

- **Message search** — search conversations by title or content with term highlighting.

- **Location** — optional geolocation sharing for more context-aware answers (off by default).

- **Risk evaluation** — custom instructions are evaluated automatically by the LLM; alerts shown only when risk is significant.

- **i18n** — Portuguese (BR), English (US), Spanish and French; auto-detects browser language.

- **Dark theme** — minimalist responsive UI.

- **Sliding window** — automatic context management: long-content truncation, old-message eviction and retry on overflow.
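The old-message eviction part of the sliding window can be sketched as keeping the newest messages that fit a token budget. This is an illustrative sketch only: the 4-characters-per-token estimate and the function names are assumptions, and the project's own context management (truncation and retry on overflow) is not shown.

```python
from typing import Dict, List


def estimate_tokens(text: str) -> int:
    """Very rough estimate (about 4 characters per token); an assumption."""
    return max(1, len(text) // 4)


def fit_to_context(messages: List[Dict[str, str]], max_tokens: int) -> List[Dict[str, str]]:
    """Keep a leading system message, then the newest messages that fit the budget."""
    system = messages[:1] if messages and messages[0]["role"] == "system" else []
    rest = messages[len(system):]
    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)
    kept: List[Dict[str, str]] = []
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(rest):
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + kept[::-1]
```

Stopping at the first message that does not fit keeps the retained history contiguous, so the model never sees a conversation with a gap in the middle.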

### System Requirements

| Model                     | RAM   | VRAM (GPU)                                                     | Disk   |
| ------------------------- | ----- | -------------------------------------------------------------- | ------ |
| Gemma 4 E2B (2B)          | 8 GB  | 4 GB (NVIDIA CUDA, AMD HIP/ROCm or Apple Metal), or CPU-only   | ~3 GB  |
| Gemma 4 E4B (4B)          | 16 GB | 7 GB                                                           | ~5 GB  |
| Gemma 4 12B Unified (12B) | 24 GB | 10 GB                                                          | ~8 GB  |
| Gemma 4 26B-A4B (26B MoE) | 32 GB | 16 GB+                                                         | ~15 GB |

> Without sufficient VRAM, llama.cpp will offload layers to system RAM (CPU mode), resulting in significantly slower inference. If you encounter **OOM errors**, try a smaller model or reduce `N_CTX` in `data/system/.env`.

### Installation

#### Prerequisites

- **Python 3.11+** (on Linux/macOS, the `venv` module — `python3-venv` on Ubuntu/Debian)

- **Node.js 20+**

- **GPU** with up-to-date drivers — NVIDIA (CUDA), AMD (HIP on Windows, ROCm on Linux) or Apple Silicon (Metal) — recommended; runs on CPU otherwise

- **Windows 10/11 (64-bit)**, modern Linux, or macOS (arm64/x64)

- On Linux/macOS: `curl`, `unzip` and `tar`

#### Windows — graphical installer

A signed `.exe` installer (built with [Inno Setup](https://jrsoftware.org/isinfo.php)) is available on the [releases page](https://github.com/xBrasil/myAIplayground/releases). The wizard copies files, creates Start Menu and Desktop shortcuts, and optionally runs the dependency setup at the end.

> **Note:** Python 3.11+ and Node.js 20+ are installed automatically via `winget` when missing. If you chose the per-user install (without running as Administrator), a UAC prompt will appear at that moment to elevate just that step — the rest of the setup keeps running under your user account. Requires `winget` (App Installer) available on Windows; otherwise install Python/Node manually beforehand.

#### Windows — via scripts

```powershell
install.cmd
```

The installer:

- Detects and installs Python and Node.js automatically via `winget` (prompting for UAC elevation if needed)

- Creates the `.venv` virtual environment and installs backend dependencies

- Installs frontend npm dependencies

- Downloads the latest `llama-server` binary (CUDA, HIP, ROCm, Metal or CPU, according to your GPU)

- Creates `data/system/.env` from `backend/.env.example`

Launch with:

```powershell
run.cmd
```

#### Linux / macOS

```bash
chmod +x install.sh run.sh
./install.sh
./run.sh
```

The installer performs the same steps as the Windows version: creates `.venv`, installs dependencies, downloads `llama-server` and prepares `.env`. The script auto-detects macOS (arm64/x64) and Linux to pick the correct `llama-server` binary. Press `Ctrl+C` to stop.

#### What the launcher does

- Starts backend (FastAPI on port 8000) and frontend (Vite on port 5173)

- Waits for both to be ready and opens the UI in your browser

- Reuses already-running services — safe to run more than once

- Logs saved to `data/system/logs/backend.log` and `data/system/logs/frontend.log`

> **Tip:** the first use of each model triggers a GGUF download from Hugging Face. Models are cached in `data/system/model-cache/`.

### Privacy

- Conversations are stored in a local SQLite database (`data/user/app.db`).

- Uploaded files stay in `data/user/uploads/`.

- App settings (language, voice, custom instructions, allowed folders, etc.) are stored in `data/user/settings.json`.

- Initial model downloads come from Hugging Face. After that, everything runs offline.

- **Web search**: when enabled in Settings, the model can query DuckDuckGo and fetch pages. Those requests leave your machine — disable it for fully offline mode.

- **Local filesystem access**: when enabled, the model can read files **only** from folders you explicitly allowed. Access is read-only and protected against directory traversal.

- Text-to-Speech uses the browser's `speechSynthesis` API; local vs. online behavior depends on the selected voice and system settings.

- No analytics or telemetry.

### Key points

- **100% local inference** — all AI processing runs on your hardware via [llama.cpp](https://github.com/ggml-org/llama.cpp) (GGUF format). No data is sent to cloud services during normal chat use.

- **Gemma 4 models are not included** — they are downloaded from [Hugging Face](https://huggingface.co/) at the user's request and are subject to [Gemma 4 Terms of Use - Apache 2.0](https://ai.google.dev/gemma/apache_2).

- **Stack**: React 19 + TypeScript + Vite (frontend), FastAPI + SQLAlchemy (backend), llama.cpp server (inference), faster-whisper (speech-to-text).

- **License**: [Apache License 2.0](LICENSE) — Copyright 2026 Rodolfo Motta Saraiva.

### Third-party components

| Component | License |
| --- | --- |
| [llama.cpp](https://github.com/ggml-org/llama.cpp) | MIT |
| [FastAPI](https://github.com/tiangolo/fastapi) | MIT |
| [Uvicorn](https://github.com/encode/uvicorn) | BSD-3-Clause |
| [SQLAlchemy](https://github.com/sqlalchemy/sqlalchemy) | MIT |
| [Pydantic](https://github.com/pydantic/pydantic) | MIT |
| [React](https://github.com/facebook/react) | MIT |
| [Vite](https://github.com/vitejs/vite) | MIT |
| [react-markdown](https://github.com/remarkjs/react-markdown) | MIT |
| [KaTeX](https://github.com/KaTeX/KaTeX) (via rehype-katex) | MIT |
| [Hugging Face Hub](https://github.com/huggingface/huggingface_hub) | Apache 2.0 |
| [faster-whisper](https://github.com/SYSTRAN/faster-whisper) | MIT |
| [Pillow](https://github.com/python-pillow/Pillow) | HPND |
| [pillow-heif](https://github.com/bigcat88/pillow_heif) | BSD-3-Clause |
| [PyMuPDF](https://github.com/pymupdf/PyMuPDF) | AGPL-3.0 |
| [python-docx](https://github.com/python-openxml/python-docx) | MIT |
| [openpyxl](https://foss.heptapod.net/openpyxl/openpyxl) | MIT |
| [python-pptx](https://github.com/scanny/python-pptx) | MIT |
| [duckduckgo-search](https://github.com/deedy5/duckduckgo_search) | MIT |
| [beautifulsoup4](https://www.crummy.com/software/BeautifulSoup/) | MIT |
| [svglib](https://github.com/deeplook/svglib) | LGPL-3.0 |
| [reportlab](https://www.reportlab.com/dev/opensource/) | BSD-3-Clause |
| [pystray](https://github.com/moses-palmer/pystray) | LGPL-3.0 |
| [Inno Setup](https://jrsoftware.org/isinfo.php) (Windows installer) | [Inno Setup License](https://jrsoftware.org/files/is/license.txt) |
| [Google Gemma 4](https://ai.google.dev/gemma) (AI models — not distributed) | [Gemma 4 Terms of Use - Apache 2.0](https://ai.google.dev/gemma/apache_2) |