{"id":32464826,"url":"https://github.com/xsa-dev/video_to_text","last_synced_at":"2026-05-17T19:35:11.471Z","repository":{"id":319480775,"uuid":"1078685026","full_name":"xsa-dev/video_to_text","owner":"xsa-dev","description":null,"archived":false,"fork":false,"pushed_at":"2025-10-20T20:24:23.000Z","size":7,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-26T13:29:24.788Z","etag":null,"topics":["assembly","converter","util","video-to-text","whisper-cpp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xsa-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-18T07:42:01.000Z","updated_at":"2025-10-18T14:11:21.000Z","dependencies_parsed_at":"2025-10-19T08:08:44.440Z","dependency_job_id":"4b96f9b3-08e1-40c8-a975-010c6846af89","html_url":"https://github.com/xsa-dev/video_to_text","commit_stats":null,"previous_names":["xsa-dev/video_to_text"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/xsa-dev/video_to_text","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xsa-dev%2Fvideo_to_text","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xsa-dev%2Fvideo_to_text/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xsa-dev%2Fvideo_to_text/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xsa-dev%2Fvideo_to_text/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xsa-dev","download_url":"https://codeload.github.com/xsa-dev/video_to_text/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xsa-dev%2Fvideo_to_text/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33151876,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-17T09:28:26.183Z","status":"ssl_error","status_checked_at":"2026-05-17T09:27:52.702Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["assembly","converter","util","video-to-text","whisper-cpp"],"created_at":"2025-10-26T13:21:55.423Z","updated_at":"2026-05-17T19:35:11.466Z","avatar_url":"https://github.com/xsa-dev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Video to Text Converter\n\nA simple utility that extracts audio tracks from video files and transcribes them using either AssemblyAI API or whisper-cli, saving the resulting text alongside each source video.\n\n## Installation\n\n1. Install the Python dependencies:\n```bash\npip install -r requirements.txt\n```\n\n2. Install ffmpeg (required for audio extraction):\n```bash\n# macOS\nbrew install ffmpeg\n\n# Ubuntu/Debian\nsudo apt update\nsudo apt install ffmpeg\n\n# Windows\n# Download from https://ffmpeg.org/download.html\n```\n\n3. Choose your transcription method:\n\n### Option A: Using whisper-cli (Recommended)\n1. Install whisper-cli:\n```bash\n# macOS\nbrew install whisper-cli\n\n# Or build from source: https://github.com/ggerganov/whisper.cpp\n```\n\n2. Download a whisper model (e.g., ggml-large-v3-turbo.bin) and place it in the project directory.\n\n### Option B: Using AssemblyAI API\n1. Provide your AssemblyAI API key using one of the following methods:\n```bash\n# Preferred: environment variable\nexport ASSEMBLYAI_API_KEY=\"your_api_key_here\"\n\n# Optional: add api_key to the [assemblyai] section in config.toml\n```\n\n## Configuration\n\nAll settings live in `config.toml`:\n\n- `video.paths` — list of absolute or relative paths to the video files to process\n- `transcription.method` — choose between \"whisper\" or \"assemblyai\"\n- `whisper` — whisper-cli related options (model path, language, threads, etc.)\n- `assemblyai` — AssemblyAI related options (API key, speech model, language, formatting)\n- `audio` — audio extraction parameters (sample rate, channels, codec)\n\n### Example configuration\n\n```toml\n[video]\npaths = [\n    \"/path/to/video1.mp4\",\n    \"/path/to/video2.mp4\"\n]\n\n[transcription]\n# Method: \"whisper\" or \"assemblyai\"\nmethod = \"whisper\"\n\n[whisper]\n# Path to the whisper model file\nmodel_path = \"ggml-large-v3-turbo.bin\"\n# Language code (e.g., \"ru\", \"en\", \"auto\")\nlanguage = \"ru\"\n# Number of threads\nthreads = 16\n# Suppress non-speech tokens\nsuppress_nst = true\n# Max context\nmax_context = 128\n# No prompt\nno_prompt = true\n# Best of N\nbest_of = 7\n\n[assemblyai]\n# Leave blank to rely on ASSEMBLYAI_API_KEY\napi_key = \"\"\nspeech_model = \"universal\"\nlanguage = \"en\"\npunctuate = true\nformat_text = true\n\n[audio]\nsample_rate = 16000\nchannels = 1\ncodec = \"pcm_s16le\"\n```\n\n## Usage\n\n1. Update `config.toml` with video paths and your preferred transcription method.\n2. If using AssemblyAI, obtain an API key from [AssemblyAI](https://www.assemblyai.com/) and set it as an environment variable or in the config file.\n3. If using whisper-cli, ensure the model file is in the project directory.\n4. Run the script:\n```bash\npython main.py\n```\n\n## What the script does\n\n1. Loads configuration from `config.toml`.\n2. Extracts audio from each video file to WAV (16 kHz, mono).\n3. Transcribes the audio using the selected method (whisper-cli or AssemblyAI).\n4. Saves the transcript to a `.txt` file next to the source video.\n5. Removes temporary audio files once transcription completes.\n\n## Output format\n\nFor each video file `video.mp4`, the script creates a paired transcript file `video.txt`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxsa-dev%2Fvideo_to_text","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxsa-dev%2Fvideo_to_text","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxsa-dev%2Fvideo_to_text/lists"}