{"id":24505911,"url":"https://github.com/richardr1126/openreader-webui","last_synced_at":"2025-04-09T06:12:51.682Z","repository":{"id":273066984,"uuid":"918541572","full_name":"richardr1126/OpenReader-WebUI","owner":"richardr1126","description":"Web EPUB and PDF text to speech document reader. Read documents in realtime with high-quality TTS; or extract audiobooks. Use your own Kokoro TTS API or Open AI API endpoint.","archived":false,"fork":false,"pushed_at":"2025-03-31T03:26:59.000Z","size":3023,"stargazers_count":110,"open_issues_count":1,"forks_count":16,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-03-31T04:24:07.682Z","etag":null,"topics":["document","epub","epub-reader","epubjs","gpt-4o-mini-tts","kokoro-tts","open-ai","open-ai-tts","openai-api","pdf","pdf-reader","pdfjs","text-to-speech","tts"],"latest_commit_sha":null,"homepage":"https://openreader.richardr.dev","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/richardr1126.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":null,"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"lfx_crowdfunding":null,"polar":null,"buy_me_a_coffee":"rich2556","thanks_dev":null,"custom":null}},"created_at":"2025-01-18T07:36:14.000Z","updated_at":"2025-03-31T03:27:03.000Z","dependencies_parsed_at":"2025-02-04T11:22:13.043Z","dependency_job_id":"ce6c2598-3a7a-43e0-8576-6981ce3d5e7e","html_url":"https://github.com/richardr1126/OpenReader-WebUI","commit_stat
s":null,"previous_names":["richardr1126/openreader-webui"],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardr1126%2FOpenReader-WebUI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardr1126%2FOpenReader-WebUI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardr1126%2FOpenReader-WebUI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardr1126%2FOpenReader-WebUI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/richardr1126","download_url":"https://codeload.github.com/richardr1126/OpenReader-WebUI/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247987285,"owners_count":21028895,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["document","epub","epub-reader","epubjs","gpt-4o-mini-tts","kokoro-tts","open-ai","open-ai-tts","openai-api","pdf","pdf-reader","pdfjs","text-to-speech","tts"],"created_at":"2025-01-21T23:32:18.913Z","updated_at":"2025-04-09T06:12:51.663Z","avatar_url":"https://github.com/richardr1126.png","language":"TypeScript","funding_links":["https://buymeacoffee.com/rich2556"],"categories":[],"sub_categories":[],"readme":"[![GitHub Stars](https://img.shields.io/github/stars/richardr1126/OpenReader-WebUI)](../../stargazers)\n[![GitHub Forks](https://img.shields.io/github/forks/richardr1126/OpenReader-WebUI)](../../network/members)\n[![GitHub 
Watchers](https://img.shields.io/github/watchers/richardr1126/OpenReader-WebUI)](../../watchers)\n[![GitHub Issues](https://img.shields.io/github/issues/richardr1126/OpenReader-WebUI)](../../issues)\n[![GitHub Last Commit](https://img.shields.io/github/last-commit/richardr1126/OpenReader-WebUI)](../../commits)\n[![GitHub Release](https://img.shields.io/github/v/release/richardr1126/OpenReader-WebUI)](../../releases)\n\n[![Discussions](https://img.shields.io/badge/Discussions-Ask%20a%20Question-blue)](../../discussions)\n\n# OpenReader WebUI 📄🔊\n\nOpenReader WebUI is a document reader with Text-to-Speech capabilities, offering a TTS read-along experience with narration for both PDF and EPUB documents. It can use any OpenAI-compatible TTS endpoint, including [Kokoro-FastAPI](https://github.com/remsky/Kokoro-FastAPI) and [Orpheus-FastAPI](https://github.com/Lex-au/Orpheus-FastAPI).\n\n\u003e A highly available demo is hosted at [https://openreader.richardr.dev/](https://openreader.richardr.dev/)\n\n- 🎯 **TTS API Integration**: \n  - Compatible with the OpenAI text-to-speech API and GPT-4o Mini TTS, Kokoro-FastAPI TTS, Orpheus-FastAPI, or any other compatible service\n  - Support for TTS models (tts-1, tts-1-hd, gpt-4o-mini-tts, kokoro, and custom)\n- 💾 **Local-First Architecture**: Uses IndexedDB browser storage for documents\n- 🛜 **Optional Server-side documents**: Manually upload documents to the Next.js backend for all users to download\n- 📖 **Read Along Experience**: Follow along with highlighted text as the TTS narrates\n- 📄 **Document formats**: EPUB, PDF, DOCX (with libreoffice installed)\n- 🎧 **Audiobook Creation**: Create and export audiobooks from PDF and EPUB files **(in m4b format with ffmpeg and aac TTS output)**\n- 📲 **Mobile Support**: Works on mobile devices, and can be added as a PWA web app\n- 🎨 **Customizable Experience**: \n  - 🔑 Set TTS API base URL (and optional API key)\n  - 🎯 Set model-specific instructions for GPT-4o Mini TTS\n  - 🏎️ Adjustable 
playback speed\n  - 📐 Customize PDF text extraction margins\n  - 🗣️ Multiple voice options (checks `/v1/audio/voices` endpoint)\n  - 🎨 Multiple app theme options\n\n\u003e Orpheus-FastAPI will only work through a [fork of Orpheus-FastAPI](https://github.com/richardr1126/LlamaCpp-Orpheus-FastAPI)\n\n### 🛠️ Work in progress\n- [x] **Audiobook creation and download** (m4b format)\n- [x] **Support for GPT-4o Mini TTS with instructions**\n- [x] **Initial e2e testing**: More Playwright tests (in progress)\n- [x] **Orpheus-FastAPI support**: (in progress, submitted PR to Orpheus-FastAPI)\n- [ ] **More document formats**: .txt, .md, native .docx support\n- [ ] **Support non-OpenAI TTS APIs**: ElevenLabs, etc.\n- [ ] **Accessibility Improvements**\n\n## 🐳 Docker Quick Start\n\n### Prerequisites\n- Recent version of Docker installed on your machine\n- A TTS API server (Kokoro-FastAPI, Orpheus-FastAPI, or OpenAI API)\n\n```bash\ndocker run --name openreader-webui \\\n  -p 3003:3003 \\\n  -v openreader_docstore:/app/docstore \\\n  ghcr.io/richardr1126/openreader-webui:latest\n```\n\n(Optional) Set the TTS `API_BASE` URL and/or `API_KEY` as the default for all devices:\n```bash\ndocker run --name openreader-webui \\\n  -e API_BASE=http://host.docker.internal:8880/v1 \\\n  -p 3003:3003 \\\n  -v openreader_docstore:/app/docstore \\\n  ghcr.io/richardr1126/openreader-webui:latest\n```\n\n\u003e Audio is requested from the TTS API by the Next.js server, not by the client, so the TTS API base URL must be reachable from the Next.js server. If the server runs in a Docker container, you may need to use `host.docker.internal` instead of `localhost` to reach the host machine.\n\nVisit [http://localhost:3003](http://localhost:3003) to open the app and configure your settings.\n\n\u003e **Note:** The `openreader_docstore` volume is used to store server-side documents. You can mount a local directory instead. 
Or remove it if you don't need server-side documents.\n\n### ⬆️ Update Docker Image\n```bash\ndocker stop openreader-webui \u0026\u0026 \\\ndocker rm openreader-webui \u0026\u0026 \\\ndocker pull ghcr.io/richardr1126/openreader-webui:latest\n```\n\n### Adding to Docker Compose (e.g. with open-webui or Kokoro-FastAPI)\n\n\u003e Note: This is an example of how to add OpenReader WebUI to a docker-compose file. You can add it to your existing docker-compose file or create a new one in this directory. Then run `docker-compose up --build` to start the services.\n\nCreate or add to a `docker-compose.yml`:\n```yaml\nvolumes:\n  docstore:\n\nservices:\n  openreader-webui:\n    container_name: openreader-webui\n    image: ghcr.io/richardr1126/openreader-webui:latest\n    environment:\n      - API_BASE=http://host.docker.internal:8880/v1\n    ports:\n      - \"3003:3003\"\n    volumes:\n      - docstore:/app/docstore\n    restart: unless-stopped\n```\n\n## Dev Installation\n\n### Prerequisites\n- Node.js \u0026 npm (recommended: use [nvm](https://github.com/nvm-sh/nvm))\n\nOptional, depending on the features you need:\n- [FFmpeg](https://ffmpeg.org) (required for audiobook m4b creation only)\n- [libreoffice](https://www.libreoffice.org) (required for DOCX files)\n  - On Linux: `sudo apt install ffmpeg libreoffice`\n  - On macOS: `brew install ffmpeg libreoffice`\n\n### Steps\n\n1. Clone the repository:\n   ```bash\n   git clone https://github.com/richardr1126/OpenReader-WebUI.git\n   cd OpenReader-WebUI\n   ```\n\n2. Install dependencies:\n   ```bash\n   npm install\n   ```\n\n3. Configure the environment:\n   ```bash\n   cp template.env .env\n   # Edit .env with your configuration settings\n   ```\n   \u003e Note: The base URL for the TTS API must be reachable from the Next.js server.\n\n4. 
Start the development server:\n   ```bash\n   npm run dev\n   ```\n\n   or build and run the production server:\n   ```bash\n   npm run build\n   npm start\n   ```\n\n   Visit [http://localhost:3003](http://localhost:3003) to run the app.\n\n\n## 💡 Feature requests\n\nFor feature requests or ideas you have for the project, please use the [Discussions](https://github.com/richardr1126/OpenReader-WebUI/discussions) tab.\n\n## 🙋‍♂️ Support and issues\n\nIf you encounter issues, please open an issue on GitHub following the template (which is very light).\n\n## 👥 Contributing\n\nContributions are welcome! Fork the repository and submit a pull request with your changes.\n\n## ❤️ Acknowledgements\n\n- [Kokoro-FastAPI](https://github.com/remsky/Kokoro-FastAPI) for the API wrapper\n- [react-pdf](https://github.com/wojtekmaj/react-pdf)\n- [react-reader](https://github.com/happyr/react-reader)\n- [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) for text-to-speech\n\n## Docker Supported Architectures\n- linux/amd64 (x86_64)\n- linux/arm64 (Apple Silicon)\n\n## Stack\n\n- **Framework:** Next.js (React)\n- **Containerization:** Docker\n- **Storage:** IndexedDB (in browser db store)\n- **PDF:** \n  - [react-pdf](https://github.com/wojtekmaj/react-pdf)\n  - [pdf.js](https://mozilla.github.io/pdf.js/)\n- **EPUB:**\n  - [react-reader](https://github.com/gerhardsletten/react-reader)\n  - [epubjs](https://github.com/futurepress/epub.js/)\n- **UI:** \n  - [Tailwind CSS](https://tailwindcss.com)\n  - [Headless UI](https://headlessui.com)\n- **TTS:** (tested on)\n  - [OpenAI API](https://platform.openai.com/docs/api-reference/text-to-speech)\n  - [Kokoro FastAPI TTS](https://github.com/remsky/Kokoro-FastAPI/tree/v0.0.5post1-stable)\n- **NLP:** [compromise](https://github.com/spencermountain/compromise) NLP library for sentence splitting\n\n## License\n\nThis project is licensed under the MIT 
License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frichardr1126%2Fopenreader-webui","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frichardr1126%2Fopenreader-webui","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frichardr1126%2Fopenreader-webui/lists"}