{"id":27320126,"url":"https://github.com/brevdev/pdf2podcast","last_synced_at":"2025-06-30T01:35:36.376Z","repository":{"id":278698042,"uuid":"883502782","full_name":"brevdev/pdf2podcast","owner":"brevdev","description":null,"archived":false,"fork":false,"pushed_at":"2024-12-15T19:14:18.000Z","size":98979,"stargazers_count":2,"open_issues_count":3,"forks_count":1,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-04-12T09:12:51.399Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/brevdev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-05T04:31:56.000Z","updated_at":"2025-02-22T03:39:23.000Z","dependencies_parsed_at":"2025-02-21T07:35:01.368Z","dependency_job_id":"b71a3065-16b7-4e93-a09d-edda8e93f804","html_url":"https://github.com/brevdev/pdf2podcast","commit_stats":null,"previous_names":["brevdev/pdf2podcast"],"tags_count":26,"template":true,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brevdev%2Fpdf2podcast","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brevdev%2Fpdf2podcast/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brevdev%2Fpdf2podcast/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brevdev%2Fpdf2podcast/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/brevdev","download_url":"https://codeload.github.com/brevdev/pdf2podcast/tar.gz/refs/head
s/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248543842,"owners_count":21121838,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-12T09:12:59.446Z","updated_at":"2025-04-12T09:13:00.627Z","avatar_url":"https://github.com/brevdev.png","language":"Python","readme":"# AI Research Assistant\n\n## Overview\nA microservice-driven implementation for transforming PDFs into engaging audio content.\n\nFor a deeper dive into the system architecture, you can view a mermaid diagram of our system [here](docs/README.md).\n\n## Quick Start Guide\n\n1. **Environment Variables**:\n   We require the following environment variables to be set:\n   ```bash\n   # Create .env file with required variables\n   echo \"ELEVENLABS_API_KEY=your_key\" \u003e .env\n   echo \"NIM_KEY=your_key\" \u003e\u003e .env\n   echo \"MAX_CONCURRENT_REQUESTS=1\" \u003e\u003e .env\n   ```\n\n   Note that in production we use the NVIDIA Eleven Labs API key, which can handle concurrent requests. For local development, you may want to set `MAX_CONCURRENT_REQUESTS=1` to avoid rate-limiting issues. You can generate your own testing API key for free [here](https://elevenlabs.io/).\n\n2. 
**Install Dependencies**:\n   We use UV to manage Python dependencies.\n   \n   ```bash\n   make uv\n   ```\n   This will:\n   - Install UV if not present\n   - Create virtual environment\n   - Install project dependencies\n\n   If you open up a new terminal window and want to quickly re-use the same environment, you can run `make uv` again.\n\n3. **Start Development Server**:\n   You can start the entire stack with:\n   ```bash\n   make all-services\n   ```\n\n   This command will:\n   - Verify environment variables are set\n   - Create necessary directories\n   - Start all services using Docker Compose in `--build` mode. \n\n   \u003e **Note:** The first time you run `make all-services`, the `docling` service may take 10-15 minutes to pull and build. Subsequent runs will be much faster.\n\n   You can also set `DETACH=1` to run the services in detached mode, which allows you to continue using your terminal while the services are running.\n\n4. **Run Podcast Generation**:\n   ```bash\n   source .venv/bin/activate\n   python tests/test.py --target \u003cpdf1.pdf\u003e --context \u003cpdf2.pdf\u003e\n   ```\n\n   This will generate a 2-person podcast. In order to generate a 1-person monologue, you can add the `--monologue` flag. Check out the test file for more examples. If you are not on a GPU machine, the PDF service might take a while to run.\n\n## Hosting the PDF service on a separate machine\n\nAs stated above, we use [docling](https://github.com/DS4SD/docling) as our default PDF service. When you spin up the stack, docling will be built and run automatically.\n\nIf you would like to run the PDF service on a separate machine, you can add the following to your `.env` file:\n```bash\necho \"MODEL_API_URL=\u003cpdf-model-service-url\u003e\" \u003e\u003e .env\n```\n\n### Using `nv-ingest`\n\nWe also support using a fork of NVIDIA's [NV-Ingest](https://github.com/NVIDIA/NV-Ingest) as our PDF service. This requires 2 A100-SXM machines. 
See the [repo](https://github.com/jdye64/nv-ingest/tree/brev-dev-convert-endpoint) for more information. If you would like to use this, you can add the following to your `.env` file:\n```bash\necho \"MODEL_API_URL=\u003cnv-ingest-url\u003e/v1\" \u003e\u003e .env\n```\n**Note the use of `v1` in the URL.**    \n\nHere is the workflow that we use for running this in production (disaggregated core services and PDF model service):\n```bash\n# On an L40s machine\nmake model-prod\n\n# On a different machine\nmake prod\n```\n\n## Selecting LLMs \n\nWe currently use an ensemble of 3 LLMs to generate these podcasts. Out of the box, we recommend using the Llama 3.1-70B NIM. If you would like to use a different model, you can update the `models.json` file with the desired model. The default `models.json` calls a NIM that I currently host. Feel free to use it as you develop locally. When you deploy, please use our NIM API Catalog endpoints.\n\n## Optimizing for GPU usage\n\nDue to our design, it is relatively easy to swap out different pieces of our stack to optimize for GPU usage and available hardware. For example, you could swap each model with the smaller Llama 3.1-8B NIM and disable GPU usage for `docling` in `docker-compose.yaml`.\n\n## Development Tools\n\n### Tracing\nWe expose a Jaeger instance at `http://localhost:16686/` for tracing. This is useful for debugging and monitoring the system.\n\n### Code Quality\nThe project uses `ruff` for linting and formatting. You must run `make ruff` before your PR can be merged:\n```bash\nmake ruff  # Runs both lint and format\n```\n\n## CI/CD\nWe use GitHub Actions for CI/CD. We run the following actions:\n- `ruff`: Runs linting and formatting\n- `pr-test`: Runs an e2e podcast test on the PR\n- `build-and-push`: Builds and pushes a new container image to the remote repo. 
This is used to update production deployments\n\n## Production Deployment\nFor production deployment:\n```bash\nmake prod\n```\n\nThis uses the remote Docker Compose configuration and pulls pre-built images from the registry.\n\n## Contributing\n\n1. Fork the repository\n2. Create a feature branch\n3. Make your changes\n4. Run tests: `python tests/test.py --target \u003cpdf1.pdf\u003e --context \u003cpdf2.pdf\u003e`\n5. Run linting: `make ruff`\n6. Submit a pull request","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrevdev%2Fpdf2podcast","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrevdev%2Fpdf2podcast","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrevdev%2Fpdf2podcast/lists"}