https://github.com/yossifibrahem/llama-cpp-gui
- Host: GitHub
- URL: https://github.com/yossifibrahem/llama-cpp-gui
- Owner: yossifibrahem
- Created: 2025-09-01T14:53:31.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-09-10T14:26:04.000Z (27 days ago)
- Last Synced: 2025-09-10T18:25:09.791Z (26 days ago)
- Language: HTML
- Size: 63.1 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# LLaMA Server GUI Manager
A comprehensive graphical user interface for managing and configuring the `llama-server` executable from the llama.cpp project. This application provides an intuitive way to configure, start, stop, and monitor your local LLaMA server without dealing with complex command-line arguments.
## Features
### 🎛️ Complete Configuration Management
- **Model Configuration**: Load GGUF models, LoRA adapters, and multimodal projectors
- **Chat Templates**: Support for various chat templates (llama3, chatml, mistral, etc.)
- **Generation Parameters**: Fine-tune sampling settings (temperature, top-k, top-p, etc.)
- **Performance Tuning**: Configure GPU layers, context size, batch sizes, and threading
- **Advanced Features**: Speculative decoding, continuous batching, flash attention

### 🖥️ User-Friendly Interface
- **Tabbed Layout**: Organized sections for different configuration areas
- **Interactive Controls**: Sliders, dropdowns, and input fields with tooltips
- **Real-time Validation**: Input validation and helpful error messages
- **Custom Arguments**: Add any additional llama-server arguments not covered by the GUI

### 💾 Configuration Management
- **Save/Load Configs**: Persistent JSON-based configuration storage
- **Portable Settings**: Config files stored alongside the executable
- **Backup-Friendly**: Human-readable JSON configuration format

### 🔧 Server Management
- **One-Click Start/Stop**: Easy server process management (see the process sketch after this list)
- **Live Output Monitoring**: Real-time server log display
- **Browser Integration**: Quick access to the web UI
- **System Tray Support**: Minimize to tray and run in background (optional)
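Under the hood, start/stop and live log display come down to managing a `llama-server` child process and streaming its output. The snippet below is a minimal sketch of that pattern, assuming the caller supplies the full command line and a line callback; the function names are illustrative, not taken from the GUI's source.

```python
import subprocess
import threading

def start_server(cmd, on_line):
    """Launch llama-server and stream each output line to a callback."""
    proc = subprocess.Popen(
        cmd,                       # e.g. ["llama-server", "-m", "model.gguf", "--port", "8080"]
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into one log stream
        text=True,
    )

    def pump():
        for line in proc.stdout:
            on_line(line.rstrip())

    threading.Thread(target=pump, daemon=True).start()
    return proc

def stop_server(proc):
    """Ask the server to exit; force-kill if it does not comply."""
    proc.terminate()
    try:
        proc.wait(timeout=5)
    except subprocess.TimeoutExpired:
        proc.kill()
```

In a Tk application, the `on_line` callback would typically hand lines to the Server Output widget through the main loop rather than touching widgets directly from the reader thread.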
## Requirements
- Python 3.7 or higher
- `llama-server` executable (from llama.cpp) in your PATH or application directory (see the lookup sketch after this list)
- Required Python packages (automatically installed):
- `ttkbootstrap`
- `Pillow` (for system tray icon support)
- `pystray` (optional, for system tray functionality)
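As a rough illustration of the PATH-or-application-directory rule above, the executable can be located with `shutil.which` plus a local fallback. This is only a sketch of the idea, not the GUI's actual lookup code:

```python
import shutil
import sys
from pathlib import Path

def find_llama_server():
    """Return a path to llama-server, or None if it cannot be found."""
    name = "llama-server.exe" if sys.platform == "win32" else "llama-server"

    on_path = shutil.which(name)        # 1. anywhere on PATH
    if on_path:
        return on_path

    local = Path(sys.argv[0]).resolve().parent / name  # 2. next to the GUI
    return str(local) if local.exists() else None
```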
## Installation

### Option 1: Download Pre-built Executable
1. Download the latest release from the GitHub releases page
2. Extract to your desired location
3. Ensure `llama-server` executable is in the same directory or your PATH
4. Run `LLaMA-Server-GUI.exe` (Windows) or `LLaMA-Server-GUI` (Linux/Mac)

### Option 2: Run from Source
1. Clone this repository:
```bash
git clone https://github.com/yossifibrahem/llama-cpp-gui.git
cd llama-cpp-gui
```
2. Install dependencies:
```bash
pip install ttkbootstrap pillow pystray
```
3. Run the application:
```bash
python llama-server_gui_new.py
```

### Option 3: Build Your Own Executable
1. Install PyInstaller:
```bash
pip install pyinstaller ttkbootstrap pillow pystray
```
2. Run the build script:
```bash
python build_exe.py
```
3. Find your executable in the `dist` folder
## Usage
### Quick Start
1. **Load a Model**: Use the "Browse" button in the Models tab to select your GGUF model file
2. **Configure Settings**: Adjust parameters across the different tabs as needed
3. **Start Server**: Click the "Start Server" button
4. **Access Web UI**: Click "Open Browser" to access the llama.cpp web interface

### Configuration Tabs
#### 📁 Models
- **Primary Model**: Select your main GGUF model file
- **Model Extensions**: Add LoRA adapters or multimodal projectors
- **Chat Behavior**: Configure chat templates and reasoning settings

#### ⚙️ Generation
- **Output Control**: Set token limits and generation behavior
- **Sampling Parameters**: Fine-tune creativity and randomness

#### 🚀 Performance
- **Core Performance**: Context size, GPU layers, CPU threads
- **Advanced Throughput**: Parallel processing and continuous batching

#### 🔬 Advanced
- **Memory Optimizations**: Flash attention, memory locking, NUMA settings
- **Speculative Decoding**: Use draft models for faster inference

#### 🌐 Server & API
- **Network Configuration**: Host, port, and API key settings
- **Custom Arguments**: Add any additional llama-server flags
- **Access Control**: Configure web UI and API access

#### 📊 Server Output
- **Live Monitoring**: Real-time server log output
- **Log Management**: Clear output and monitor server status

### Configuration Management
- **Save Config**: Preserve your current settings to `llama_server_config.json`
- **Load Config**: Restore previously saved configurations
- **Portable**: Config file is stored in the application directory

### System Tray (Optional)
When pystray is installed, the application can minimize to the system tray:
- **Minimize to Tray**: Close the window to hide in the system tray
- **Tray Menu**: Right-click the tray icon for quick actions
- **Background Operation**: Keep the server running while the GUI is hidden (a minimal tray sketch follows this list)
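For reference, a bare-bones pystray setup looks roughly like the sketch below; the icon image and menu actions are placeholders, and the real application would wire them to its Tk window and server process:

```python
import pystray
from PIL import Image, ImageDraw

def make_icon_image():
    """Draw a simple placeholder icon (the app ships its own artwork)."""
    img = Image.new("RGB", (64, 64), "black")
    ImageDraw.Draw(img).rectangle((16, 16, 48, 48), fill="white")
    return img

def show_window(icon, item):
    pass  # placeholder: deiconify the Tk window here

def quit_app(icon, item):
    icon.stop()  # placeholder: also stop the server before exiting

tray = pystray.Icon(
    "llama-server-gui",
    make_icon_image(),
    "LLaMA Server GUI",
    menu=pystray.Menu(
        pystray.MenuItem("Show", show_window),
        pystray.MenuItem("Quit", quit_app),
    ),
)
tray.run_detached()  # run the tray loop in a background thread alongside Tk
```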
## Configuration File

The application saves settings in JSON format. Example structure:
```json
{
  "model_path": "/path/to/your/model.gguf",
  "ctx_size": 4096,
  "gpu_layers": 33,
  "host": "127.0.0.1",
  "port": "8080",
  "custom_arguments_list": [
    {
      "value": "--custom-flag value",
      "enabled": true
    }
  ]
}
```
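For illustration, a short script like the one below could read this file and assemble the equivalent `llama-server` command line. The key names follow the example above; the flags are standard llama.cpp server options, but verify them against your build (`llama-server --help`):

```python
import json
from pathlib import Path

def build_command(config_path="llama_server_config.json"):
    """Translate the saved JSON settings into llama-server arguments."""
    cfg = json.loads(Path(config_path).read_text())

    cmd = [
        "llama-server",
        "-m", cfg["model_path"],
        "--ctx-size", str(cfg.get("ctx_size", 4096)),
        "--n-gpu-layers", str(cfg.get("gpu_layers", 0)),
        "--host", cfg.get("host", "127.0.0.1"),
        "--port", str(cfg.get("port", 8080)),
    ]

    # Enabled custom arguments are appended verbatim
    for extra in cfg.get("custom_arguments_list", []):
        if extra.get("enabled"):
            cmd.extend(extra["value"].split())

    return cmd

if __name__ == "__main__":
    print(" ".join(build_command()))
```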
## Troubleshooting

### Common Issues
**"llama-server executable not found"**
- Ensure `llama-server` is in your PATH or the same directory as the GUI
- Download llama.cpp binaries from the official releases

**"Module not found" errors**
- Install missing dependencies: `pip install ttkbootstrap pillow pystray`
- For source installation, ensure all requirements are met

**Server won't start**
- Check that your model path is correct and the file exists
- Verify you have sufficient GPU memory for the selected GPU layers
- Review the Server Output tab for detailed error messages

**Performance issues**
- Reduce context size for lower memory usage
- Adjust GPU layers based on your hardware capabilities
- Use smaller batch sizes for memory-constrained systems

### Getting Help
- Check the Server Output tab for detailed error messages
- Review the generated command using the "Generate Command" button
- Consult the llama.cpp documentation for parameter details

## Building from Source
### Development Setup
```bash
git clone https://github.com/yossifibrahem/llama-cpp-gui.git
cd llama-cpp-gui
pip install -r requirements.txt
python llama-server_gui_new.py
```

### Building Executable
The included `build_exe.py` script uses PyInstaller to create a standalone executable:

```bash
python build_exe.py
```

This creates a single executable file in the `dist` directory that includes all dependencies.
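The exact options `build_exe.py` passes are not shown here, but a comparable one-file build driven from Python would look something like this (names and flags are illustrative):

```python
# build_sketch.py - an illustrative stand-in for build_exe.py
import PyInstaller.__main__

PyInstaller.__main__.run([
    "llama-server_gui_new.py",   # entry point named in the Usage section
    "--onefile",                 # bundle everything into a single executable
    "--windowed",                # suppress the console window for the GUI
    "--name", "LLaMA-Server-GUI",
])
```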
## Contributing
Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.
### Development Guidelines
- Follow Python PEP 8 style guidelines
- Add tooltips for new UI elements
- Update configuration saving/loading for new parameters
- Test on multiple platforms when possible

## License
This project is released under the MIT License. See the LICENSE file for details.
## Acknowledgments
- Built for the [llama.cpp](https://github.com/ggerganov/llama.cpp) project
- Uses [ttkbootstrap](https://github.com/israel-dryer/ttkbootstrap) for modern UI theming
- System tray support provided by [pystray](https://github.com/moses-palmer/pystray)

## Changelog
### v1.0.0
- Initial release
- Complete GUI for llama-server configuration
- Configuration save/load functionality
- Real-time server output monitoring
- System tray support
- Cross-platform executable building

---
**Note**: This GUI is a third-party tool for llama.cpp and is not officially affiliated with the llama.cpp project.