{"id":30066681,"url":"https://github.com/djleamen/renamer","last_synced_at":"2026-04-12T15:52:17.057Z","repository":{"id":301788946,"uuid":"1006367389","full_name":"djleamen/renamer","owner":"djleamen","description":"Utility to rename mp3 files based on speech content","archived":false,"fork":false,"pushed_at":"2025-07-26T04:46:32.000Z","size":18,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-26T10:56:01.318Z","etag":null,"topics":["ffmpeg","google-speech-recognition","googlespeechapi","mp3","openai","pyaudio","pydub","python","speech-recognition","speech-to-text","torch","util","utility","wav","whisper","whisper-ai"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/djleamen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-22T05:25:00.000Z","updated_at":"2025-07-26T04:46:35.000Z","dependencies_parsed_at":"2025-06-28T21:19:35.731Z","dependency_job_id":"06a52679-f2c3-465c-b7ce-0dbb8610281a","html_url":"https://github.com/djleamen/renamer","commit_stats":null,"previous_names":["djleamen/renamer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/djleamen/renamer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/djleamen%2Frenamer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/djleamen%2Frenamer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/djleamen%2Frenamer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/djleamen%2Frenamer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/djleamen","download_url":"https://codeload.github.com/djleamen/renamer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/djleamen%2Frenamer/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269385801,"owners_count":24408432,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-08T02:00:09.200Z","response_time":72,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ffmpeg","google-speech-recognition","googlespeechapi","mp3","openai","pyaudio","pydub","python","speech-recognition","speech-to-text","torch","util","utility","wav","whisper","whisper-ai"],"created_at":"2025-08-08T08:00:46.723Z","updated_at":"2026-04-12T15:52:17.033Z","avatar_url":"https://github.com/djleamen.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MP3 Renamer\n\n[![Python Version](https://img.shields.io/badge/python-3.6%2B-blue)](https://www.python.org/downloads/)\n[![License](https://img.shields.io/github/license/djleamen/renamer)](LICENSE)\n![Status](https://img.shields.io/badge/status-active-brightgreen)\n[![Last Commit](https://img.shields.io/github/last-commit/djleamen/renamer)](https://github.com/djleamen/renamer/commits)\n\nA Python tool that automatically renames MP3 files based on their speech content using AI-powered Speech-to-Text technology.\n\n## Features\n\n- Scans a directory for MP3 files\n- Converts each MP3 to WAV format (temporarily)\n- Uses OpenAI's Whisper model for high-quality transcription (with Google Speech API as fallback)\n- Extracts the first sentence or a specified number of words from the transcription\n- Renames the MP3 file based on this extracted text\n- Cleans filenames to ensure they are valid\n\n## Requirements\n\n- Python 3.6 or later\n- Required Python packages (install with `pip install -r requirements.txt`):\n  - openai-whisper (for state-of-the-art speech recognition)\n  - torch (required for Whisper)\n  - SpeechRecognition (fallback recognition)\n  - pydub (audio file manipulation)\n  - PyAudio (for audio processing)\n- FFmpeg (for MP3 conversion, used by pydub)\n\n## Installation\n\n1. Clone or download this repository\n2. Install required packages:\n\n   ```bash\n   pip install -r requirements.txt\n   ```\n3. Install FFmpeg (if not already installed):\n   - macOS: `brew install ffmpeg`\n   - Linux: `apt-get install ffmpeg`\n   - Windows: Download from [ffmpeg.org](https://ffmpeg.org/download.html)\n\n## Usage\n\n### Basic Usage\n\nRun the script with the path to the directory containing MP3 files:\n\n```bash\npython mp3_renamer.py /path/to/mp3/folder\n```\n\n### Advanced Options\n\n```bash\npython mp3_renamer.py /path/to/mp3/folder [options]\n```\n\nAvailable options:\n\n- `-d, --duration SECONDS` - Process only a specific duration (default: 10 seconds)\n- `-s, --start SECONDS` - Start processing from this time (default: 0 seconds)\n- `-f, --first N` - Use the first N words for the filename instead of a sentence\n- `-v, --verbose` - Enable verbose output for debugging\n- `-e, --engine {whisper,google,both}` - Choose the speech recognition engine (default: whisper)\n- `-m, --model {tiny,base,small,medium,large}` - Select Whisper model size (default: base)\n\n### Examples\n\nUse Whisper with a larger model for better accuracy:\n\n```bash\npython mp3_renamer.py /path/to/mp3/folder --model medium\n```\n\nProcess only the first 5 seconds of each file:\n\n```bash\npython mp3_renamer.py /path/to/mp3/folder --duration 5\n```\n\nUse the first 10 words instead of trying to detect a sentence:\n\n```bash\npython mp3_renamer.py /path/to/mp3/folder --first 10\n```\n\n## How It Works\n\n1. The script scans the specified directory for MP3 files\n2. For each MP3 file:\n   - Converts it to WAV format (needed for speech recognition)\n   - Uses OpenAI's Whisper model to transcribe the audio\n   - Extracts the first sentence or specified number of words\n   - Cleans the text to make it a valid filename\n   - Renames the MP3 file with the new name\n   - Removes the temporary WAV file\n\n## Choosing a Whisper Model\n\n- `tiny`: Fastest, lowest accuracy, minimal resource usage\n- `base`: Good balance of speed and accuracy (default)\n- `small`: Better accuracy, moderate resource usage\n- `medium`: High accuracy, higher resource usage\n- `large`: Best accuracy, significant resource usage\n\n## Troubleshooting\n\nIf you encounter issues:\n\n1. Try running with `--verbose` to see detailed logs\n2. Use `--model small` or `--model medium` for better transcription\n3. Adjust the `--start` and `--duration` parameters to capture the correct part of audio\n4. Use `--first N` to bypass sentence detection if it's not working well\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdjleamen%2Frenamer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdjleamen%2Frenamer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdjleamen%2Frenamer/lists"}