{"id":17694883,"url":"https://github.com/hbmartin/podcast-transcript-convert","last_synced_at":"2026-01-19T16:32:51.968Z","repository":{"id":248009931,"uuid":"827543439","full_name":"hbmartin/podcast-transcript-convert","owner":"hbmartin","description":"Convert podcast transcripts from HTML, SRT, WebVtt, Podlove etc into PodcastIndex JSON.","archived":false,"fork":false,"pushed_at":"2024-08-12T21:56:37.000Z","size":341,"stargazers_count":2,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-11-27T19:17:55.558Z","etag":null,"topics":["file-conversion","file-converter","podcast","podcastindex","podlove","srt","srt-subtitles","transcript","webvtt","webvtt-subtitles"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hbmartin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-11T21:27:59.000Z","updated_at":"2025-11-19T14:30:25.000Z","dependencies_parsed_at":"2024-07-11T22:18:25.406Z","dependency_job_id":"20cdb121-dd07-49d6-976e-cbc454f2663d","html_url":"https://github.com/hbmartin/podcast-transcript-convert","commit_stats":null,"previous_names":["hbmartin/podcast-transcript-convert"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/hbmartin/podcast-transcript-convert","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hbmartin%2Fpodcast-transcript-convert","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hbmartin%2Fpodc
ast-transcript-convert/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hbmartin%2Fpodcast-transcript-convert/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hbmartin%2Fpodcast-transcript-convert/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hbmartin","download_url":"https://codeload.github.com/hbmartin/podcast-transcript-convert/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hbmartin%2Fpodcast-transcript-convert/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28574363,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-19T16:29:19.148Z","status":"ssl_error","status_checked_at":"2026-01-19T16:29:17.772Z","response_time":67,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["file-conversion","file-converter","podcast","podcastindex","podlove","srt","srt-subtitles","transcript","webvtt","webvtt-subtitles"],"created_at":"2024-10-24T13:50:17.529Z","updated_at":"2026-01-19T16:32:51.953Z","avatar_url":"https://github.com/hbmartin.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# podcast-transcript-convert\n\n[![PyPI](https://img.shields.io/pypi/v/podcast-transcript-convert.svg)](https://pypi.org/project/podcast-transcript-convert/)\n[![Lint and 
Test](https://github.com/hbmartin/podcast-transcript-convert/actions/workflows/lint.yml/badge.svg)](https://github.com/hbmartin/podcast-transcript-tools/actions/workflows/lint.yml)\n[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)\n[![Code style: black](https://img.shields.io/badge/🐧️-black-000000.svg)](https://github.com/psf/black)\n[![Checked with pytype](https://img.shields.io/badge/🦆-pytype-437f30.svg)](https://google.github.io/pytype/)\n[![twitter](https://img.shields.io/badge/@hmartin-00aced.svg?logo=twitter\u0026logoColor=black)](https://twitter.com/hmartin)\n\n\u003cimg src=\".idea/icon.svg\" width=\"100\" align=\"right\"\u003e\n\nConvert podcast transcripts from HTML, SRT, WebVtt, Podlove etc into [PodcastIndex JSON](https://github.com/Podcastindex-org/podcast-namespace/blob/main/transcripts/transcripts.md).\n\n## Installation\n\nIt is recommended to use [pipx](https://pipx.pypa.io/stable/) to install and run the CLI tool. If you wish to use the library, you can install with `pip` instead.\n\n```bash\nbrew install pipx\npipx install podcast-transcript-convert\n```\n\nIf you've already installed the package and wish to upgrade:\n\n```bash\npipx upgrade podcast-transcript-convert\n```\n\n## Usage\nRun the conversion app on your transcripts directory.\n\n```bash\ntranscript2json transcripts/ converted/\n```\nYou can then inspect the output JSON files in the `converted/` directory.\n\n## Library Usage\n```python\nfrom podcast_transcript_convert.convert import bulk_convert\n\nbulk_convert(\"transcripts_dir/\", \"converted_dir/\")\n```\n\nIndividual file type converters are in the `converters` package. You can use them directly if you know the file type.\n\nYou can use `file_typing.identify_file_type(file)` to determine the file type of a transcript file.\n\n\n## Development\n\nPull requests are very welcome! 
For major changes, please open an issue first to discuss what you would like to change.\n\n```bash\ngit clone git@github.com:hbmartin/podcast-transcript-convert.git\ncd podcast-transcript-convert\npython3 -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\n# Replace with the actual path to your transcript files\npython -m podcast_transcript_convert ~/Downloads/overcast-to-sqlite/archive/transcripts converted/\n```\n\n### Code Formatting\n\nThis project is linted with [ruff](https://docs.astral.sh/ruff/) and uses [Black](https://github.com/ambv/black) code formatting.\n\n\n## Authors\n- [Harold Martin](https://www.linkedin.com/in/harold-martin-98526971/) - harold.martin at gmail\n- Icon courtesy of [Vecteezy.com](https://www.vecteezy.com)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhbmartin%2Fpodcast-transcript-convert","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhbmartin%2Fpodcast-transcript-convert","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhbmartin%2Fpodcast-transcript-convert/lists"}