{"id":18836724,"url":"https://github.com/bedirt/narratorx","last_synced_at":"2026-02-08T21:32:16.520Z","repository":{"id":261375183,"uuid":"883903344","full_name":"BedirT/NarratorX","owner":"BedirT","description":"📖 NarratorX: Turn your PDFs into captivating audiobooks in 16 languages, powered by OCR, LLMs, and expressive TTS. Listen your way! 🎧✨","archived":false,"fork":false,"pushed_at":"2024-11-06T07:25:26.000Z","size":1526,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-11-06T08:27:18.673Z","etag":null,"topics":["audiobooks","automation","llm","multilingual","nlp","ocr","streamlit","text-to-speech","tts"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BedirT.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-05T19:24:53.000Z","updated_at":"2024-11-06T07:25:29.000Z","dependencies_parsed_at":"2024-11-06T08:29:49.024Z","dependency_job_id":"61e6f38f-3abc-4b46-8aa6-a382ca173c28","html_url":"https://github.com/BedirT/NarratorX","commit_stats":null,"previous_names":["bedirt/narratorx"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BedirT%2FNarratorX","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BedirT%2FNarratorX/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BedirT%2FNarratorX/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/BedirT%2FNarratorX/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BedirT","download_url":"https://codeload.github.com/BedirT/NarratorX/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223622154,"owners_count":17174842,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audiobooks","automation","llm","multilingual","nlp","ocr","streamlit","text-to-speech","tts"],"created_at":"2024-11-08T02:31:38.436Z","updated_at":"2026-02-08T21:32:16.483Z","avatar_url":"https://github.com/BedirT.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"img/banner.png\" alt=\"NarratorX Banner\" /\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003cem\u003eTransform your PDFs into engaging audiobooks with the power of OCR, LLMs, and TTS in 16 languages.\u003c/em\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/License-GPLv3-blue.svg\" alt=\"License\" /\u003e\u003c/a\u003e\n    \u003ca href=\"https://www.python.org/downloads/\"\u003e\u003cimg src=\"https://img.shields.io/badge/python-3.8%2B-blue\" alt=\"Python Version\" /\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\n## Table of Contents\n\n- [Introduction](#introduction)\n- [Features](#features)\n- [Demo](#demo)\n- [Supported Languages](#supported-languages-)\n- [Roadmap](#roadmap)\n- [Installation](#installation)\n- [Usage](#usage)\n  - [Command-Line 
Interface](#command-line-interface)\n  - [Streamlit Web Application](#streamlit-web-application)\n- [Contributing](#contributing)\n  - [Style Guidelines](#style-guidelines)\n  - [Testing](#testing)\n- [License](#license)\n\n## Introduction\n\n🎉 Welcome to NarratorX! 🎉 For the longest time, I found myself saying, \"Ah shucks, there's no audiobook for this one… guess I'll have to wait or squeeze in some reading when I have free time.\" If you’re like me and love listening to audiobooks while doing chores 🧹, commuting 🚶, or just relaxing 😌, then I guess this tool was made for you!\n\n📚 With NarratorX, you can freely convert the PDF of that book that’s been lingering on your to-read list into a beautifully narrated, non-robotic audiobook. Say hello to NarratorX – savior of the busy and the lazy! By harnessing advanced OCR 🔍, powerful LLMs 🤖, and state-of-the-art TTS 🎙️, NarratorX transforms your PDFs into immersive audiobooks, ready to accompany you wherever you go. And all for free, my favorite kinda payment method...\n\nNarratorX is a passion project, specially crafted for my book club 📖💬. With support for 16 languages 🌍, it’s perfect for readers worldwide. I hope it brings you joy too! So go on, read away and listen away! 🎧✨\n\n## Features\n\n| 🚀 **Feature**                           | ✨ **Description** |\n|-----------------------------------------|--------------------|\n| **Seamless PDF to Audiobook Conversion** 🎧 | Effortlessly convert your PDFs into high-quality audiobooks using OCR 🔍, LLMs 🤖, and TTS 🎙️, fully automating the process from text extraction to speech generation. |\n| **Multi-Language Support** 🌍             | Supports 16 languages, including English, Spanish, French, German, Turkish, and more, making NarratorX accessible to a global audience. |\n| **Open-Source LLM Integration** 🔓        | Leverage the Ollama library for local, open-source LLMs. OpenAI is optional, offering flexibility for users with different needs. 
|\n| **User-Friendly Interfaces** 🖥️           | Use either the command-line interface for quick, scriptable conversions or the Streamlit web app for a visual, interactive experience. |\n| **Customizable Settings** 🎛️              | Fine-tune your experience by adjusting chunk sizes, selecting models, and choosing language preferences to suit each document. |\n| **Robust Error Handling and Logging** 🛠️  | With comprehensive error handling and logging, NarratorX ensures smooth operation and easy troubleshooting if issues arise. |\n\n\n---\n\n## Demo\n\n\n\nhttps://github.com/user-attachments/assets/eeb3157d-8d2f-423f-97dd-da30439c7bf2\n\n\n\n\n🎧 **Sample Audiobook Conversions**\nHere are some sample audiobook conversions created with NarratorX! (Not cherry-picked – just pure NarratorX magic ✨)\n\n| 🌐 **Language** | 📚 **Book Title** | 🔊 **Sample Audio** |\n|----------------|--------------------|---------------------|\n| 🇺🇸 **English** | *\"The Little Prince\"* by Antoine de Saint-Exupéry | [Listen 🎶](https://github.com/user-attachments/assets/59179257-19aa-431f-b7ba-2419a234e3bc) |\n| 🇪🇸 **Spanish** | *\"Don Quijote\"* by Miguel de Cervantes | [Escuchar 🎶](https://github.com/user-attachments/assets/3605936c-2644-4ddf-888f-6128f9adab10) |\n| 🇫🇷 **French** | *\"Les Misérables\"* by Victor Hugo | [Écouter 🎶](https://github.com/user-attachments/assets/c5aacc01-333b-4d47-94a3-ab4d53fa83f7) |\n| 🇹🇷 **Turkish** | *\"Tüfek, Mikrop ve Çelik\"* by Jared Diamond | [Dinle 🎶](https://github.com/user-attachments/assets/7b0dfcf0-8c16-4f5a-b925-d965d6c3b033) |\n\n---\n\n## Supported Languages 🌍\n\nNarratorX supports a wide array of languages, making it accessible for many readers around the world! 
🌐 The currently supported languages are:\n\n- 🇺🇸 **English** (en)\n- 🇪🇸 **Spanish** (es)\n- 🇫🇷 **French** (fr)\n- 🇩🇪 **German** (de)\n- 🇮🇹 **Italian** (it)\n- 🇵🇹 **Portuguese** (pt)\n- 🇵🇱 **Polish** (pl)\n- 🇹🇷 **Turkish** (tr)\n- 🇷🇺 **Russian** (ru)\n- 🇳🇱 **Dutch** (nl)\n- 🇨🇿 **Czech** (cs)\n- 🇸🇦 **Arabic** (ar)\n- 🇨🇳 **Chinese** (zh-cn)\n- 🇯🇵 **Japanese** (ja)\n- 🇭🇺 **Hungarian** (hu)\n- 🇰🇷 **Korean** (ko)\n\nWith NarratorX, the world of audiobooks is just a language selection away! 🌎📖🎶\n\n## Roadmap\n\nHere's what's on the horizon:\n\n- [ ] **Enhanced LLM Options**: Integration with additional open-source LLMs and libraries.\n- [ ] **Custom TTS Voices**: Support for custom TTS voices and accents.\n- [ ] **Enhanced Chunking and OCR**: I am planning to move to [open-parse](https://github.com/Filimoa/open-parse) for better OCR results.\n- [ ] **Support for additional file formats**: EPUB, DOCX, Image, and plain text support.\n- [ ] **Automated Chunk Size Optimization**: Automatically adjust chunk sizes based on model capabilities and language requirements.\n- [ ] **Improved UI/UX**: Enhancements to the Streamlit app for a more intuitive user experience.\n  - Add a stopping option.\n  - Add step-wise processing - OCR, LLM, TTS.\n\nI welcome all contributions! If you have ideas, suggestions, or want to contribute, feel free to reach out.\n\n---\n\n## Installation\n\n### Prerequisites\n\nBefore you begin, ensure you have the following installed:\n\n- **Python 3.8 or higher**: [Download Python](https://www.python.org/downloads/)\n- **Poetry**: A tool for dependency management.\n  ```bash\n  pip install poetry\n  ```\n- **CUDA (Optional)**: For GPU acceleration with CUDA-compatible GPUs. This can significantly speed up processing times.\n\n### Steps\n\n1. **Clone the Repository**\n\n   Start by cloning the NarratorX repository to your local machine:\n\n   ```bash\n   git clone https://github.com/bedirt/narratorx.git\n   cd narratorx\n   ```\n\n2. 
**Install Dependencies**\n\n   Use Poetry to install all required dependencies:\n\n   ```bash\n   poetry install\n   ```\n\n3. **Set Up Environment Variables**\n\n   NarratorX can work with both OpenAI's GPT models and local open-source LLMs via the Ollama library.\n\n   - **If you want to use OpenAI Models**:\n\n     Obtain your OpenAI API key and set it as an environment variable:\n\n     ```bash\n     export OPENAI_API_KEY='your-openai-api-key'\n     ```\n\n4. **(Optional - Just for Contributing) Install Pre-commit Hooks**\n\n   If you plan to contribute to NarratorX, set up pre-commit hooks to maintain code quality:\n\n   ```bash\n   pre-commit install\n   ```\n\n---\n\n## Usage\n\nI implemented two ways to use NarratorX. You can choose between the command-line interface for quick operations or the Streamlit web app for a more interactive experience.\n\n### Command-Line Interface\n\nConvert your PDF to an audiobook directly from the terminal:\n\n```bash\nnarratorx path/to/yourfile.pdf --output output.wav --language en --model gpt-4o --log-level INFO\n```\n\n**Parameters Explained:**\n\n- `path/to/yourfile.pdf`: The path to the PDF you wish to convert.\n- `--output, -o`: (Optional) The path where the output audio file will be saved. Defaults to `output.wav`.\n- `--language, -l`: (Optional) The language code of your PDF content (e.g., `en` for English, `tr` for Turkish).\n- `--model, -m`: (Optional) The LLM to use. Options include `gpt-4o`, `gpt-4o-mini`, or any model supported by the Ollama library, such as `llama3.1`. You can browse the available Ollama models [here](https://ollama.com/library). Remember to add the `ollama/` prefix for Ollama models (for example, `ollama/llama3.1`).\n- `--max-characters-llm`: (Optional) Maximum characters per LLM chunk. Adjust based on model capabilities. 2,000-4,000 characters is a good starting point.\n- `--max-tokens`: (Optional) Maximum tokens per LLM call. Adjust based on model capabilities. 
2-4k is a good starting point again.\n- `--max-characters-tts`: (Optional) Maximum characters per TTS chunk. Adjust this value for the language you are using. If you get a warning such as `Warning: The text length exceeds the character limit of 239 for language 'es', this might cause truncated audio.`, decrease this value to the character limit stated in the warning. (I will automate this soon, lazy at the moment :))\n- `--log-level`: (Optional) Set the logging level (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`).\n\n**Example:**\n\nUsing an open-source LLM with Ollama:\n\n```bash\nnarratorx path/to/yourfile.pdf --output output.wav --language en --model ollama/llama3.1 --log-level INFO\n```\n\n### Streamlit Web Application\n\nFor a more user-friendly interface, use the Streamlit app:\n\n```bash\nstreamlit run streamlit_app.py\n```\n\nOpen your web browser and navigate to `http://localhost:8501`. Here, you can:\n\n- Upload your PDF file (up to 200MB).\n- Select the language and preferred LLM model.\n- Adjust advanced settings such as chunk sizes (optional).\n- Monitor the progress of each processing step.\n- Listen to a preview and download the final audiobook.\n\n---\n\n## Contributing\n\nI love contributions! Whether it's bug fixes, new features, or documentation improvements, your help is invaluable, and I appreciate it.\n\n### Style Guidelines\n\nTo maintain code quality and consistency, please adhere to the following guidelines:\n\n- **Code Formatting**: Use `black` for code formatting with a line length of 100 characters. 
This ensures that all code follows a consistent style.\n\n  ```bash\n  black --line-length 100 .\n  ```\n\n- **Import Sorting**: Use `isort` with the `black` profile to sort imports correctly.\n\n  ```bash\n  isort --profile black .\n  ```\n\n- **Linting**: Use `flake8` to lint your code and catch potential issues.\n\n  ```bash\n  flake8\n  ```\n\n- **Pre-commit Hooks**: Pre-commit hooks are set up to automate these checks before each commit. Install them with:\n\n  ```bash\n  pre-commit install\n  ```\n\n  **Pre-commit Configuration:**\n\n  ```yaml\n  repos:\n    - repo: local\n      hooks:\n        - id: black\n          name: black\n          entry: poetry run black --line-length 100 --exclude '.venv/*'\n          language: system\n          types: [python]\n        - id: flake8\n          name: flake8\n          entry: poetry run flake8 --exclude .venv --config .flake8\n          language: system\n          types: [python]\n        - id: isort\n          name: isort\n          entry: poetry run isort --profile black --skip .venv\n          language: system\n          types: [python]\n  ```\n\n  You can skip running the individual tools by hand: install the pre-commit hooks with the command above and, since the configuration file is already in the repository, run `pre-commit run --all-files` to run all checks.\n\n### Testing\n\nEnsure your changes don't break existing functionality by running tests:\n\n1. **Install Test Dependencies**\n\n    ```bash\n    poetry install --with dev\n    ```\n\n2. **Run Tests**\n\n    ```bash\n    python -m unittest discover tests/\n    ```\n\n3. **Test Specific Modules**\n\n    ```bash\n    python -m unittest tests/test_llm.py\n    ```\n\n### How to Contribute\n\n1. **Fork the Repository**\n\n   Click the \"Fork\" button at the top right corner of the repository page.\n\n2. **Clone Your Fork**\n\n   ```bash\n   git clone https://github.com/yourusername/narratorx.git\n   cd narratorx\n   ```\n\n3. 
**Create a New Branch**\n\n   ```bash\n   git checkout -b feature/your-feature-name\n   ```\n\n4. **Make Your Changes**\n\n   Write code, fix bugs, or improve documentation.\n\n5. **Format and Lint Your Code**\n\n   ```bash\n   black --line-length 100 .\n   isort --profile black .\n   flake8\n   ```\n\n   or simply:\n\n   ```bash\n   pre-commit run --all-files\n   ```\n\n6. **Commit Your Changes**\n\n   ```bash\n   git add .\n   git commit -m \"Add your descriptive commit message here\"\n   ```\n\n7. **Push to Your Fork**\n\n   ```bash\n   git push origin feature/your-feature-name\n   ```\n\n8. **Open a Pull Request**\n\n   Go to the original repository and open a pull request. Provide a clear description of your changes.\n\n---\n\n## License\n\nNarratorX is licensed under the **GNU General Public License v3.0**. By contributing to this project, you agree that your contributions will also be licensed under the GPLv3.\n\nSee the [LICENSE](LICENSE) file for more details.\n\n---\n\n## Acknowledgements\n\nI used the following libraries and tools to build NarratorX and am grateful to their authors and contributors:\n\n- [OpenAI](https://openai.com)\n- [Ollama](https://ollama.com)\n- [Coqui/TTS](https://github.com/coqui-ai/TTS)\n- [Streamlit](https://streamlit.io)\n- [Surya-OCR](https://github.com/VikParuchuri/surya)\n- [Unstructured](https://github.com/Unstructured-IO/unstructured)\n- [LiteLLM](https://github.com/BerriAI/litellm)\n\n---\nThank you for using NarratorX! I am excited to see how you transform your reading experience. If you have any questions, suggestions, or need assistance, feel free to open an issue or reach out.\n\nYou can add me on Goodreads [here](https://www.goodreads.com/bedirt).\n\nHappy listening! 
🎧📖\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbedirt%2Fnarratorx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbedirt%2Fnarratorx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbedirt%2Fnarratorx/lists"}