{"id":23084139,"url":"https://github.com/echogarden-project/echogarden","last_synced_at":"2025-05-15T01:08:01.595Z","repository":{"id":154096790,"uuid":"630265903","full_name":"echogarden-project/echogarden","owner":"echogarden-project","description":"Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.","archived":false,"fork":false,"pushed_at":"2025-05-04T06:15:22.000Z","size":1799,"stargazers_count":359,"open_issues_count":36,"forks_count":39,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-05-04T07:19:17.083Z","etag":null,"topics":["command-line","forced-alignment","language-detection","language-identification","node-js","source-separation","speech","speech-alignment","speech-recognition","speech-synthesis","speech-to-text","speech-translation","text-to-speech","voice-isolation"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/echogarden-project.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"docs/Contributing.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-04-20T02:41:47.000Z","updated_at":"2025-05-04T06:06:30.000Z","dependencies_parsed_at":"2024-10-04T12:53:51.655Z","dependency_job_id":"2fb31658-6f62-4579-8bb0-8aa4ac4f992b","html_url":"https://github.com/echogarden-project/echogarden","commit_stats":{"total_commits":801,"total_committers":2,"mean_commits":400.5,"dds":"0.0012484394506
866447","last_synced_commit":"d9c9c40d29e4a9b87bfb8b027fc93cf667077dd8"},"previous_names":[],"tags_count":63,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echogarden-project%2Fechogarden","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echogarden-project%2Fechogarden/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echogarden-project%2Fechogarden/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echogarden-project%2Fechogarden/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/echogarden-project","download_url":"https://codeload.github.com/echogarden-project/echogarden/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253127939,"owners_count":21858351,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["command-line","forced-alignment","language-detection","language-identification","node-js","source-separation","speech","speech-alignment","speech-recognition","speech-synthesis","speech-to-text","speech-translation","text-to-speech","voice-isolation"],"created_at":"2024-12-16T15:49:15.090Z","updated_at":"2025-05-15T01:07:56.579Z","avatar_url":"https://github.com/echogarden-project.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Echogarden\n\nEchogarden is an easy-to-use speech toolset that includes a variety of speech processing tools.\n\n* Easy to install, run, and update\n* Written 
in TypeScript, for the Node.js runtime\n* Can be used either as a command-line utility, or imported as a standard `npm` package\n* Runs on Windows (x64, ARM64), macOS (x64, ARM64) and Linux (x64, ARM64)\n* Doesn't require Python, Docker, or other system-level dependencies\n* Doesn't rely on essential platform-specific binaries. Engines are either written in pure TypeScript, ported via WebAssembly, or imported using the [ONNX runtime](https://onnxruntime.ai/)\n* Fully open-source (GPL v3)\n\n## Features\n\n* **Text-to-speech** using high-quality [Kokoro](https://github.com/hexgrad/kokoro) and [VITS](https://github.com/jaywalnut310/vits) offline models for many languages and dialects, and [16 other offline and online engines](docs/Engines.md), including cloud services by [Google](https://cloud.google.com/text-to-speech), [Microsoft](https://azure.microsoft.com/en-us/products/ai-services/text-to-speech/), [Amazon](https://aws.amazon.com/polly/), [OpenAI](https://platform.openai.com/) and [ElevenLabs](https://elevenlabs.io/)\n* **Speech-to-text** using a custom TypeScript/ONNX port of the [OpenAI Whisper](https://openai.com/research/whisper) speech recognition architecture, [whisper.cpp](https://github.com/ggerganov/whisper.cpp), and [several other engines](docs/Engines.md), including cloud services by [Google](https://cloud.google.com/speech-to-text), [Microsoft](https://azure.microsoft.com/en-us/products/ai-services/speech-to-text/), [Amazon](https://aws.amazon.com/transcribe/) and [OpenAI](https://platform.openai.com/)\n* **Speech-to-transcript alignment** using several variants of [dynamic time warping](https://en.wikipedia.org/wiki/Dynamic_time_warping) (DTW, DTW-RA), including support for multi-pass (hierarchical) processing, or via guided decoding using Whisper recognition models. 
Supports 100+ languages\n* **Speech-to-text translation**, translates speech in any of the [98 languages](https://platform.openai.com/docs/guides/speech-to-text/supported-languages) supported by Whisper, to English, with near word-level timing for the translated transcript\n* **Speech-to-translated-transcript alignment** synchronizes spoken audio in one language, to a provided English-translated transcript, using the Whisper engine\n* **Speech-to-transcript-and-translation alignment** synchronizes spoken audio in one language, to a translation in a variety of other languages, given both a transcript and its translation\n* **Text-to-text translation**, translates text between various languages. Supports cloud-based Google Translate engine\n* **Language detection** identifies the language of a given audio or text. Includes Whisper or [Silero](https://github.com/snakers4/silero-vad/wiki/Other-Models) engines for spoken audio, and [TinyLD](https://www.npmjs.com/package/tinyld) or [FastText](https://github.com/facebookresearch/fastText) for text\n* **Voice activity detection** attempts to identify segments of audio where voice is active or inactive. Includes [WebRTC VAD](https://github.com/dpirch/libfvad), [Silero VAD](https://github.com/snakers4/silero-vad), [RNNoise-based VAD](https://github.com/xiph/rnnoise) and a built-in Adaptive Gate algorithm\n* **Speech denoising** attenuates background noise from spoken audio. Includes the [RNNoise](https://github.com/xiph/rnnoise) and [NSNet2](https://github.com/NeonGeckoCom/nsnet2-denoiser) engines\n* **Source separation** isolates voice from any music or background ambience. 
Includes the [MDX-NET](https://github.com/kuielab/mdx-net) deep learning architecture\n* **Word-level timestamps** for all recognition, synthesis, alignment and translation outputs\n* Advanced **subtitle generation**, accounting for sentence and phrase boundaries\n* For the Kokoro, VITS and eSpeak-NG synthesis engines, includes **enhancements to improve TTS pronunciation accuracy**: adds text normalization (e.g. idiomatic date and currency pronunciation), English heteronym disambiguation (based on a simple rule-based model), various pronunciation corrections, and accepts user-provided pronunciation lexicons\n* **Internal package system** that auto-downloads and installs voices, models and other resources, as needed\n\n## Installation\n\nEnsure you have [Node.js](https://nodejs.org/) `v18` or later installed (`v22` or later is recommended).\n\nthen:\n```bash\nnpm install -g echogarden@latest\n```\n\n## Update\n\nSimple, but may not always update to the very latest major version:\n```\nnpm update -g echogarden\n```\n\nYou can also use [`npm-check-updates`](https://www.npmjs.com/package/npm-check-updates) to check for a newer version:\n```bash\nnpm install -g npm-check-updates\nncu -g echogarden\n```\nThen, if an updated version is available, use the command line `ncu` provides to make the update.\n\n## Using the command-line interface\n\nA small sample of command lines:\n```bash\nechogarden speak \"Hello World!\"\nechogarden speak-file story.txt --engine=kokoro\nechogarden transcribe speech.mp3\nechogarden translate-speech speech.webm subtitles.srt\nechogarden align speech.opus transcript.txt\nechogarden isolate speech.wav\n```\n\nSee the [command-line interface guide](docs/CLI.md) for more details on the operations supported, and the [configuration options reference](docs/Options.md) for a comprehensive list of all options supported.\n\n## Using the Node.js API\n\nIf you are a developer, you can also [directly import the package as a dependency](docs/API.md) in your 
code. The API operations and options closely mirror the CLI.\n\n## Documentation and guides\n\n* [Quick guide to the command-line interface](docs/CLI.md)\n* [Options reference](docs/Options.md)\n* [Full list of all available engines](docs/Engines.md)\n* [Node.js API reference](docs/API.md)\n* [Enabling the CUDA ONNX execution provider](docs/CUDA.md)\n* [Technical overview and Q\u0026A](docs/Technical.md)\n* [How to help](docs/Contributing.md)\n* [Setting up a development environment](docs/Development.md)\n* [Developer's task list](docs/Tasklist.md)\n* [Release notes (for releases up to `1.0.0`)](docs/Releases.md)\n\n## Credits\n\nThis project consolidates and builds upon the efforts of many different individuals and companies, and also contributes a number of original works.\n\nDeveloped by Rotem Dan (IPA: /ˈʁɒːtem ˈdän/).\n\n## License\n\nGNU General Public License v3\n\nLicenses for components, models and other dependencies are detailed on [this page](docs/Licenses.md).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fechogarden-project%2Fechogarden","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fechogarden-project%2Fechogarden","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fechogarden-project%2Fechogarden/lists"}