https://github.com/paladini/echo-transcribe
An open-source desktop application for audio transcription using local AI. Private, secure and efficient.
ai free open-source speach-to-text srt srt-subtitles transcribe transcriber whisper whisper-ai
- Host: GitHub
- URL: https://github.com/paladini/echo-transcribe
- Owner: paladini
- License: mit
- Created: 2025-08-24T03:06:37.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2025-09-03T02:47:41.000Z (about 1 month ago)
- Last Synced: 2025-09-03T04:22:35.142Z (about 1 month ago)
- Topics: ai, free, open-source, speach-to-text, srt, srt-subtitles, transcribe, transcriber, whisper, whisper-ai
- Language: TypeScript
- Homepage:
- Size: 2.87 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Authors: AUTHORS.md
# EchoTranscribe 🎙️
An open-source desktop application for audio transcription (Speech-To-Text) using local AI. Private, secure and efficient.
## ✨ Features
- 🔒 **Completely Local**: Your audio files never leave your computer
- 🤖 **Advanced AI**: Uses Whisper models for high-quality transcription
- 🎨 **Modern Interface**: Clean and intuitive design with dark theme support
- 📁 **Multiple Formats**: Support for MP3, WAV, FLAC, M4A, OGG and WebM
- 🔄 **Batch Transcription**: Process multiple files simultaneously
- 🌍 **Automatic Detection**: Automatically identifies audio language
- ⏱️ **Precise Timestamps**: Word-level timestamps for detailed navigation
- 💾 **Flexible Export**: Export to TXT, SRT or JSON
- ⚙️ **Persistent Settings**: Dark/light theme and language saved between sessions
- 🌐 **Multilingual**: Interface in English, Portuguese and Spanish (expandable)
- ⚡ **Performance**: Optimized for speed and efficiency
- 🖥️ **Cross-Platform**: Works on Windows, macOS and Linux
## 🚀 Quick Start
### Prerequisites
- **Node.js** (version 18 or higher)
- **Python** (version 3.8 or higher)
- **Rust** (for Tauri compilation)
#### Linux (Ubuntu/Debian)
```bash
sudo apt update
sudo apt install libwebkit2gtk-4.0-dev libssl-dev libgtk-3-dev libayatana-appindicator3-dev librsvg2-dev libjavascriptcoregtk-4.0-dev
```
#### macOS
```bash
# Install the Xcode Command Line Tools
xcode-select --install
```
#### Windows
On Windows, you'll need Microsoft Visual Studio C++ Build Tools.
### Development Installation
1. **Clone the repository**
```bash
git clone https://github.com/paladini/echo-transcribe.git
cd echo-transcribe
```
2. **Install Node.js dependencies**
```bash
npm install
```
3. **Set up the Python environment and backend**
```bash
# The startup script will create venv and install dependencies automatically
chmod +x start-backend.sh # Linux/macOS only
```
4. **Run in development mode**
**Option A: Quick Start (Recommended)**
```bash
# Terminal 1 - Start backend
./start-backend.sh # Linux/macOS
# or
./start-backend.bat # Windows

# Terminal 2 - Start frontend (Tauri v2)
npm run tauri dev
```
**Option B: Manual setup**
```bash
# Terminal 1 - Start backend
cd src-tauri/backend
python main.py

# Terminal 2 - Start frontend
npm run tauri dev
```
5. **Verify setup**
- Backend API: http://localhost:8000/docs
- Frontend: Opens automatically in Tauri window

> 📖 **For detailed development setup**, see [DEVELOPMENT.md](DEVELOPMENT.md)
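
The backend check from the "Verify setup" step can also be scripted. The following is a minimal sketch, assuming the backend listens on `localhost:8000` as stated above; the helper name `backend_is_up` is hypothetical and not part of EchoTranscribe.

```python
import urllib.error
import urllib.request


def backend_is_up(url: str = "http://localhost:8000/docs", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `backend_is_up()` with no arguments probes the docs URL listed above. A `False` result most likely means the backend script has not been started yet or is bound to a different port.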
### Production Installation
Download the latest version from [Releases](https://github.com/paladini/echo-transcribe/releases) for your operating system.
## 🎯 How to Use
1. **Select audio file(s)**
   - Drag and drop one or multiple files to the designated area
   - Or click to select files (maximum 10 at once)
2. **Choose AI model**
   - **Tiny/Base**: Fast, ideal for testing
   - **Small**: Better quality, medium speed
   - **Medium**: High quality, slower
3. **Configure options**
   - Leave automatic language detection enabled (recommended)
   - Or manually specify the audio language
4. **Start transcription**
   - Click "Start Transcription"
   - Track progress in real-time
   - For batches, see progress for each file
5. **View and edit results**
   - See transcribed text for each file
   - Navigate through word timestamps
   - Edit text if necessary
6. **Export results**
   - Export individually or in batch
   - Available formats: TXT, SRT, JSON
7. **Configure application**
   - Access settings to customize theme and language
   - Your preferences are automatically saved for future sessions
## 🛠️ Technologies
- **Frontend**: React + TypeScript + Tailwind CSS
- **Desktop**: Tauri (Rust)
- **Backend**: FastAPI (Python)
- **AI**: faster-whisper (OpenAI Whisper)
- **UI Components**: Radix UI + shadcn/ui
## 📋 Available Commands
```bash
# Development (Tauri v2)
npm run tauri dev # Start Tauri v2 application in development mode
npm run dev # Start frontend development server only (Vite)

# Production (Tauri v2)
npm run build # Build frontend
npm run tauri build # Build complete application (generates executable)

# Backend (Python)
cd src-tauri/backend
python main.py # Start standalone backend server

# Other useful commands
npm run preview # Preview built frontend
npm run tauri -- --version # Check Tauri CLI version (the extra -- passes the flag through npm)
```
### 🏗️ **Building and Running Executable**
After running `npm run tauri build`, you can find and run the generated files. The paths below are for a Linux build:
```bash
# Direct executable
./src-tauri/target/release/echo-transcribe

# AppImage (Recommended for distribution)
chmod +x src-tauri/target/release/bundle/appimage/EchoTranscribe_0.1.0_amd64.AppImage
./src-tauri/target/release/bundle/appimage/EchoTranscribe_0.1.0_amd64.AppImage

# Install .deb package (Ubuntu/Debian)
sudo dpkg -i src-tauri/target/release/bundle/deb/EchoTranscribe_0.1.0_amd64.deb
echo-transcribe # Run from anywhere after installation

# Install .rpm package (Red Hat/Fedora)
sudo rpm -i src-tauri/target/release/bundle/rpm/EchoTranscribe-0.1.0-1.x86_64.rpm
```
## 🔧 Configuration
### AI Models
EchoTranscribe automatically downloads AI models as needed. Models are stored in:
- **Linux/macOS**: `~/.echo-transcribe/models/`
- **Windows**: `%USERPROFILE%\.echo-transcribe\models\`
### Supported Formats
| Format | Extension | Max Size |
|--------|-----------|----------|
| MP3 | .mp3 | 500MB |
| WAV | .wav | 500MB |
| FLAC | .flac | 500MB |
| M4A | .m4a | 500MB |
| OGG | .ogg | 500MB |
| WebM | .webm | 500MB |

## 🐛 Troubleshooting
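
Some problems can be caught before a file is ever loaded. The following minimal sketch, which is not part of EchoTranscribe, checks a file against the extensions and the 500MB limit listed under Supported Formats. The function name `check_audio_file` is hypothetical, and reading 500MB as 500 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path
from typing import List

# Values copied from the Supported Formats table. Interpreting "500MB" as
# 500 * 1024 * 1024 bytes is an assumption, not something the README states.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".ogg", ".webm"}
MAX_SIZE_BYTES = 500 * 1024 * 1024


def check_audio_file(path: str) -> List[str]:
    """Return a list of problems found; an empty list means no problem was detected."""
    problems = []
    file = Path(path)
    # Compare the extension case-insensitively, so "A.MP3" is accepted.
    if file.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {file.suffix or '(none)'}")
    if not file.is_file():
        problems.append("file does not exist")
    elif file.stat().st_size > MAX_SIZE_BYTES:
        problems.append("file is larger than 500MB")
    return problems
```

An empty result means the file passes both checks. It does not guarantee that the audio itself decodes correctly.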
### Common Issues
**Error: "Load Failed"**
- This usually means the Python backend isn't running
- Make sure Python 3.8+ is installed on your system
- The application will automatically install Python dependencies on first run
- If the problem persists, try:
  1. Close and reopen the application
  2. Check if port 8000 is available
  3. Install dependencies manually: `cd src-tauri/backend && pip install -r requirements.txt`

**Error: "Model not found"**
- The model will be downloaded automatically on first run
- Check your internet connection

**Error: "Unsupported file format"**
- Check if the file is in one of the supported formats
- Try converting the file to MP3 or WAV

**Application won't open on Linux**
- Check if all system dependencies are installed
- Run: `sudo apt install libwebkit2gtk-4.0-37`
### Debug Logs
Application logs are located at:
- **Linux/macOS**: `~/.echo-transcribe/logs/`
- **Windows**: `%USERPROFILE%\.echo-transcribe\logs\`
## 🤝 Contributing
Contributions are very welcome! Please read our [Contributing Guide](CONTRIBUTING.md) to get started.
### Local Development
1. Fork the project
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## 📝 Roadmap
- [x] **v0.1.0** ✅ **COMPLETED**
  - [x] Batch transcription support
  - [x] Automatic language detection
  - [x] Precise word-level timestamps
  - [x] Export to multiple formats (TXT, SRT, JSON)
  - [x] Settings screen with persistence
  - [x] Theme support (light/dark)
  - [x] Localization system (EN/PT/ES)
- [ ] **v0.2.0**
  - [ ] Support for more AI models
  - [ ] Timestamp interface improvements
  - [ ] Community language support
- [ ] **Future Versions**
  - [ ] Custom model training interface
  - [ ] Complete REST API
  - [ ] Audio streaming support
  - [ ] Plugin marketplace
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- [OpenAI](https://openai.com/) for the Whisper model
- [Tauri](https://tauri.app/) for the desktop framework
- [FastAPI](https://fastapi.tiangolo.com/) for the backend framework
- [shadcn/ui](https://ui.shadcn.com/) for UI components
## 📞 Support
- Issues: [GitHub Issues](https://github.com/paladini/echo-transcribe/issues)
- 💬 Discussions: [GitHub Discussions](https://github.com/paladini/echo-transcribe/discussions)
- 👤 Author: [github.com/paladini](https://github.com/paladini)

---
**EchoTranscribe** - Transforming audio to text with privacy and quality. 🎙️✨