{"id":16607358,"url":"https://github.com/tristan-mcinnis/simultaneous-interpretation","last_synced_at":"2026-04-10T12:05:19.754Z","repository":{"id":247364432,"uuid":"825653400","full_name":"tristan-mcinnis/Simultaneous-Interpretation","owner":"tristan-mcinnis","description":"￼Simultaneous-Interpretation is an advanced tool for real-time simultaneous interpretation. It transcribes and translates spoken language from a microphone input instantaneously, continually refining translations for accuracy. Ideal for business meetings, educational settings, and live events, it enhances multilingual communication effortlessly.","archived":false,"fork":false,"pushed_at":"2024-07-08T10:39:17.000Z","size":11,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-08T14:34:15.975Z","etag":null,"topics":["agents","asr","faster-whisper","openai","pyaudio","simultaneous-intepreting","simultaneous-translation","speech-recognition","speech-to-text","transcription","translation","whisper"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tristan-mcinnis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-08T08:48:22.000Z","updated_at":"2024-07-08T10:41:07.000Z","dependencies_parsed_at":"2024-07-08T11:11:43.614Z","dependency_job_id":null,"html_url":"https://github.com/tristan-mcinnis/Simultaneous-Interpretation","commit_stats":null,"previous_names":["nexuslux/simultaneous-interpretation","tristan-mcinnis/simultaneous-
interpretation"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tristan-mcinnis%2FSimultaneous-Interpretation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tristan-mcinnis%2FSimultaneous-Interpretation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tristan-mcinnis%2FSimultaneous-Interpretation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tristan-mcinnis%2FSimultaneous-Interpretation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tristan-mcinnis","download_url":"https://codeload.github.com/tristan-mcinnis/Simultaneous-Interpretation/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242774393,"owners_count":20183073,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agents","asr","faster-whisper","openai","pyaudio","simultaneous-intepreting","simultaneous-translation","speech-recognition","speech-to-text","transcription","translation","whisper"],"created_at":"2024-10-12T01:13:23.196Z","updated_at":"2025-12-31T00:53:24.789Z","avatar_url":"https://github.com/tristan-mcinnis.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Simultaneous-Interpretation\n\nSimultaneous-Interpretation is a command-line toolkit for building real-time interpreting workflows. 
The project couples\nmicrophone capture, rapid transcription, neural translation, and optional speech synthesis into a modular pipeline so you can\nmix and match components that fit your environment.\n\n## Highlights\n- **Real-time transcription with whisper.cpp** – default backend uses the `whispercpp` Python bindings so you can run highly\n  optimized Whisper models locally on CPU (perfect for macOS laptops). A `faster-whisper` fallback is available for systems that\n  already rely on those weights.\n- **Flexible OpenAI translation** – select from the latest OpenAI releases (including `gpt-5`, `gpt-5-mini`, `gpt-5-nano`, `gpt-4o`, and\n  `gpt-4o-mini`) via a command-line argument. The translation step respects conversation topic hints and remembers recent context for\n  smoother phrasing.\n- **Optional text-to-speech playback** – stream translations back through your speakers using the OpenAI text-to-speech API with your\n  preferred voice, speed, and any of the current models (`gpt-4o-mini-tts`, `gpt-4.1-tts`, or `gpt-4.1-mini-tts`).\n- **Domain dictionaries and logging** – import custom terminology mappings, review transcripts in the terminal, and export a tidy log\n  to your Downloads folder after each session.\n\n\u003e **Note:** The CLI enforces the documented OpenAI model catalog. Update `src/siminterp/openai_models.py` when OpenAI publishes additional\n\u003e identifiers.\n\n## Prerequisites\n- Python 3.10 or newer.\n- [PortAudio](http://www.portaudio.com/) runtime for PyAudio. On macOS you can install it with `brew install portaudio`.\n- An OpenAI API key stored in your environment (set `OPENAI_API_KEY` or create a `.env` file).\n- Whisper model weights for whisper.cpp. 
Download a `.bin` model (for example `ggml-base.en.bin`) from the official\n  whisper.cpp repository and place it somewhere accessible.\n\n## Installation\n\n### Option 1 – quick setup with [uv](https://github.com/astral-sh/uv)\n[`uv`](https://github.com/astral-sh/uv) is a drop-in replacement for `pip` and `virtualenv` that installs\ndependencies in parallel. It works great on macOS (our primary development platform) and dramatically speeds up\nenvironment creation.\n\nInstall `uv` (macOS/Linux shell script):\n```bash\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n```\n\nThen bootstrap the project:\n```bash\ngit clone https://github.com/yourusername/simultaneous-interpretation.git\ncd simultaneous-interpretation\nuv venv\nsource .venv/bin/activate  # On Windows use `.venv\\\\Scripts\\\\activate`\nuv pip install -r requirements.txt\nexport PYTHONPATH=\"$PWD/src\"\n```\n\n### Option 2 – standard `venv` + `pip`\n```bash\ngit clone https://github.com/yourusername/simultaneous-interpretation.git\ncd simultaneous-interpretation\npython -m venv .venv\nsource .venv/bin/activate  # On Windows use `.venv\\\\Scripts\\\\activate`\npip install -r requirements.txt\nexport PYTHONPATH=\"$PWD/src\"\n```\n\n## Configuration\nCreate an `.env` file or export environment variables before running the CLI:\n```bash\nOPENAI_API_KEY=sk-...\n```\nOptional flags:\n- `--dictionary /path/to/file.txt` to inject custom term mappings.\n- `--topic \"Quarterly finance review\"` to bias translations toward your subject.\n- `--log-file ~/Documents/interpreter.log` to change where the rolling log is written.\n\n## Usage\nList available microphones and speakers:\n```bash\npython -m siminterp --list-devices\n```\n\nStart an interpreting session (example translates from English to French, plays back audio, and uses a custom whisper.cpp model file):\n```bash\npython -m siminterp \\\n  --input-language en \\\n  --target-language fr \\\n  --input-device 1 \\\n  --output-device 3 \\\n  --translate \\\n  
--tts \\\n  --model gpt-5-mini \\\n  --tts-model gpt-4.1-mini-tts \\\n  --voice alloy \\\n  --whisper-model ~/Models/ggml-base.en.bin\n```\nPress `CTRL+C` to stop. The application gracefully shuts down background workers and stores a timestamped transcript in your\nDownloads folder.\n\n### Choosing a transcription backend\nThe CLI defaults to `--transcriber whispercpp`. If you prefer the legacy `faster-whisper` workflow you can swap with:\n```bash\npython -m siminterp --transcriber faster-whisper --whisper-model medium\n```\n\n### CLI reference\nRun `python -m siminterp --help` to view the full list of options, including:\n- `--temperature` to adjust translation creativity.\n- `--history` to control how many previous translations are shared as context.\n- `--tts-speed` to fine-tune playback speed.\n- `--phrase-time-limit` and `--ambient-duration` to tailor microphone capture windows.\n\n### Supported OpenAI models\n\nTranslation (`--model`):\n- `gpt-5` – full capability flagship model for the highest quality output.\n- `gpt-5-mini` – balanced performance and cost; ideal default for real-time interpreting.\n- `gpt-5-nano` – lightweight variant when you need ultra-low latency or lower usage costs.\n- `gpt-4o` – previous generation flagship still compatible with the workflow.\n- `gpt-4o-mini` – economical GPT-4o option.\n\nText-to-speech (`--tts-model`):\n- `gpt-4o-mini-tts` – versatile streaming synthesis with broad voice coverage.\n- `gpt-4.1-tts` – highest quality neural voice rendering.\n- `gpt-4.1-mini-tts` – faster, cost-effective voice synthesis for extended sessions.\n\nWhen OpenAI introduces additional identifiers you can extend these lists in `src/siminterp/openai_models.py` so they appear automatically in the CLI and configuration defaults.\n\n## Custom dictionary format\n```\nterm1=translation1\nterm2=translation2\n...\n```\nThese replacements occur before translation, allowing you to enforce company-specific nomenclature or names.\n\n## Logging \u0026 
exports\nSession transcripts and translations stream to the console via Rich and are appended to `logfile.txt` (configurable with `--log-file`). When you\nexit, the combined transcript is written to your Downloads directory with separate sections for source and translated text.\n\n## Troubleshooting tips\n- Ensure the whisper.cpp model path matches the actual filename (e.g., `ggml-base.en.bin`).\n- If PyAudio cannot find devices on macOS, open *System Preferences → Security \u0026 Privacy → Microphone* and grant your terminal app microphone access.\n- For the fastest start-up, pre-download the Whisper model weights you plan to use. OpenAI translation and text-to-speech models are hosted by the API, so switching between them only requires changing the `--model` or `--tts-model` argument.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftristan-mcinnis%2Fsimultaneous-interpretation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftristan-mcinnis%2Fsimultaneous-interpretation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftristan-mcinnis%2Fsimultaneous-interpretation/lists"}