{"id":44571370,"url":"https://github.com/kingjethro999/silero-test","last_synced_at":"2026-02-14T03:00:20.585Z","repository":{"id":303530008,"uuid":"1015783590","full_name":"kingjethro999/silero-test","owner":"kingjethro999","description":"Made Silero Hostable for api requests","archived":false,"fork":false,"pushed_at":"2025-07-08T03:39:55.000Z","size":65,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-08T06:08:53.176Z","etag":null,"topics":["api","fastapi","llm","stt","tts","tts-api"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kingjethro999.png","metadata":{"files":{"readme":"README.md","changelog":"changelog.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"open_collective":"open_stt"}},"created_at":"2025-07-08T03:21:52.000Z","updated_at":"2025-07-08T03:39:58.000Z","dependencies_parsed_at":"2025-07-08T06:08:57.138Z","dependency_job_id":"5e531f9a-fb94-4962-80dd-0c98d88c0f1f","html_url":"https://github.com/kingjethro999/silero-test","commit_stats":null,"previous_names":["kingjethro999/silero-test"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/kingjethro999/silero-test","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingjethro999%2Fsilero-test","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingjethro999%2Fsilero-test/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingjethro99
9%2Fsilero-test/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingjethro999%2Fsilero-test/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kingjethro999","download_url":"https://codeload.github.com/kingjethro999/silero-test/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingjethro999%2Fsilero-test/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29433297,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-14T02:20:56.896Z","status":"ssl_error","status_checked_at":"2026-02-14T02:11:29.478Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","fastapi","llm","stt","tts","tts-api"],"created_at":"2026-02-14T03:00:15.699Z","updated_at":"2026-02-14T03:00:20.570Z","avatar_url":"https://github.com/kingjethro999.png","language":"Jupyter Notebook","funding_links":["https://opencollective.com/open_stt"],"categories":[],"sub_categories":[],"readme":" [![Mailing list : test](http://img.shields.io/badge/Email-gray.svg?style=for-the-badge\u0026logo=gmail)](mailto:hello@silero.ai) [![Mailing list : test](http://img.shields.io/badge/Telegram-blue.svg?style=for-the-badge\u0026logo=telegram)](https://t.me/silero_speech) [![License: CC BY-NC 
4.0](https://img.shields.io/badge/License-CC%20BY--NC%204.0-lightgrey.svg?style=for-the-badge)](https://github.com/snakers4/silero-models/blob/master/LICENSE)\n\n[![Donations](https://opencollective.com/open_stt/tiers/donation/badge.svg?label=donations\u0026color=brightgreen)](https://opencollective.com/open_stt)\n[![Backers](https://opencollective.com/open_stt/tiers/backer/badge.svg?label=backers\u0026color=brightgreen)](https://opencollective.com/open_stt)\n[![Sponsors](https://opencollective.com/open_stt/tiers/sponsor/badge.svg?label=sponsors\u0026color=brightgreen)](https://opencollective.com/open_stt)\n\n[![Build and Deploy to PyPI](https://github.com/snakers4/silero-models/actions/workflows/build_deploy.yml/badge.svg)](https://github.com/snakers4/silero-models/actions/workflows/build_deploy.yml) [![PyPI version](https://badge.fury.io/py/silero.svg)](https://badge.fury.io/py/silero)\n\n![header](https://user-images.githubusercontent.com/12515440/89997349-b3523080-dc94-11ea-9906-ca2e8bc50535.png)\n\n- [Silero Models](#silero-models)\n  - [Installation and Basics](#installation-and-basics)\n  - [Speech-To-Text](#speech-to-text)\n    - [Dependencies](#dependencies)\n    - [PyTorch](#pytorch)\n    - [ONNX](#onnx)\n    - [TensorFlow](#tensorflow)\n  - [Text-To-Speech](#text-to-speech)\n    - [Models and Speakers](#models-and-speakers)\n    - [Dependencies](#dependencies-1)\n    - [PyTorch](#pytorch-1)\n    - [Standalone Use](#standalone-use)\n    - [SSML](#ssml)\n    - [Cyrillic languages](#cyrillic-languages)\n    - [Indic languages](#indic-languages)\n  - [Text-Enhancement](#text-enhancement)\n    - [Dependencies](#dependencies-2)\n    - [Standalone Use](#standalone-use-1)\n  - [Denoise](#denoise)\n    - [Models](#models)\n    - [Dependencies](#dependencies-3)\n    - [PyTorch](#pytorch-2)\n    - [Standalone Use](#standalone-use-2)\n  - [FAQ](#faq)\n    - [Wiki](#wiki)\n    - [Performance and Quality](#performance-and-quality)\n    - [Adding new 
Languages](#adding-new-languages)\n  - [Contact](#contact)\n    - [Get in Touch](#get-in-touch)\n    - [Commercial Inquiries](#commercial-inquiries)\n  - [Citations](#citations)\n  - [Further reading](#further-reading)\n    - [English](#english)\n    - [Chinese](#chinese)\n    - [Russian](#russian)\n  - [Donations](#donations)\n\n# Silero Models\n\nSilero Models: pre-trained enterprise-grade STT / TTS models and benchmarks.\n\nEnterprise-grade STT made refreshingly simple (seriously, see [benchmarks](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks)).\nWe provide quality comparable to Google's STT (and sometimes even better) and we are not Google.\n\nAs a bonus:\n\n- No Kaldi;\n- No compilation;\n- No 20-step instructions;\n\nWe have also published TTS models that satisfy the following criteria:\n\n- One-line usage;\n- A large library of voices;\n- A fully end-to-end pipeline;\n- Natural-sounding speech;\n- No GPU or training required;\n- Minimalism and lack of dependencies;\n- Faster than real-time on one CPU thread (!!!);\n- Support for 16kHz and 8kHz out of the box;\n\nWe have also published a model for text repunctuation and recapitalization that:\n\n- Inserts capital letters and basic punctuation marks, e.g., periods, commas, hyphens, question marks, exclamation points, and dashes (for Russian);\n- Works for 4 languages (Russian, English, German, and Spanish) and can be extended;\n- Is domain-agnostic by design and not based on any hard-coded rules;\n- Has non-trivial metrics and succeeds in the task of improving text readability;\n\n## Installation and Basics\n\nYou can use our models in three ways:\n\n- Via PyTorch Hub: `torch.hub.load()`;\n- Via pip: `pip install silero` and then `import silero`;\n- Via caching the required models and utils manually and modifying them if necessary;\n\nModels are downloaded on demand by both the pip package and PyTorch Hub. 
If you need caching, do it manually or invoke the required model once (it will be downloaded to a cache folder). Please see these [docs](https://pytorch.org/docs/stable/hub.html#loading-models-from-hub) for more information.\n\nPyTorch Hub and the pip package are based on the same code. All of the `torch.hub.load` examples can be used with the pip package via this basic change:\n\n```python3\n# before\ntorch.hub.load(repo_or_dir='snakers4/silero-models',\n               model='silero_stt',  # or silero_tts or silero_te\n               **kwargs)\n\n# after\nfrom silero import silero_stt, silero_tts, silero_te\nsilero_stt(**kwargs)\n```\n\n## Speech-To-Text\n\nAll of the provided models are listed in the [models.yml](https://github.com/snakers4/silero-models/blob/master/models.yml) file.\nAny metadata and newer versions will be added there.\n\n![Screenshot_1](https://user-images.githubusercontent.com/36505480/132320823-f0c5b774-44f7-4375-9c46-3acbcc548b76.png)\n\nCurrently we provide the following checkpoints:\n\n|                     | PyTorch            | ONNX               | Quantization       | Quality                                                                         | Colab                                                                                                                                                                    |\n| ------------------- | ------------------ | ------------------ | ------------------ | ------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |\n| English (`en_v6`)   | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | [link](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks#en-v6) | [![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb) |\n| English (`en_v5`)   | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | [link](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks#en-v5) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb) |\n| German (`de_v4`)    | :heavy_check_mark: | :heavy_check_mark: | :hourglass:        | [link](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks#de-v4) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb) |\n| English (`en_v3`)   | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | [link](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks#en-v3) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb) |\n| German (`de_v3`)    | :heavy_check_mark: | :hourglass:        | :hourglass:        | [link](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks#de-v3) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb) |\n| German (`de_v1`)    | :heavy_check_mark: | :heavy_check_mark: | :hourglass:        | [link](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks#de-v1) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb) |\n| Spanish (`es_v1`)   | :heavy_check_mark: | :heavy_check_mark: | :hourglass:        | 
[link](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks#es-v1) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb) |\n| Ukrainian (`ua_v3`) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A                                                                             | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb) |\n\nModel flavours:\n\n|                   | jit                | jit                | jit                | jit                | jit_q              | jit_q              | onnx               | onnx               | onnx               | onnx               |\n| ----------------- | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ |\n|                   | xsmall             | small              | large              | xlarge             | xsmall             | small              | xsmall             | small              | large              | xlarge             |\n| English `en_v6`   |                    | :heavy_check_mark: |                    | :heavy_check_mark: |                    | :heavy_check_mark: |                    | :heavy_check_mark: |                    | :heavy_check_mark: |\n| English `en_v5`   |                    | :heavy_check_mark: |                    | :heavy_check_mark: |                    | :heavy_check_mark: |                    | :heavy_check_mark: |                    | :heavy_check_mark: |\n| English `en_v4_0` |                    |                    | :heavy_check_mark: |                    |                    |                    |                    |                    | :heavy_check_mark: |         
           |\n| English `en_v3`   | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |\n| German `de_v4`    |                    |                    | :heavy_check_mark: |                    |                    |                    |                    |                    | :heavy_check_mark: |                    |\n| German `de_v3`    |                    |                    | :heavy_check_mark: |                    |                    |                    |                    |                    |                    |                    |\n| German `de_v1`    |                    | :heavy_check_mark: |                    |                    |                    |                    | :heavy_check_mark: |                    |                    |                    |\n| Spanish `es_v1`   |                    | :heavy_check_mark: |                    |                    |                    |                    | :heavy_check_mark: |                    |                    |                    |\n| Ukrainian `ua_v3` |                    | :heavy_check_mark: |                    |                    | :heavy_check_mark: |                    | :heavy_check_mark: |                    |                    |                    |\n\n### Dependencies\n\n- All examples:\n  - `torch`, 1.8+ (used to clone the repo in TensorFlow and ONNX examples), breaking changes for versions older than 1.6\n  - `torchaudio`, latest version bound to PyTorch should just work\n  - `omegaconf`, latest should just work\n- Additional dependencies for ONNX examples:\n  - `onnx`, latest should just work\n  - `onnxruntime`, latest should just work\n- Additional for TensorFlow examples:\n  - `tensorflow`, latest should just work\n  - `tensorflow_hub`, latest should just work\n\nPlease see the provided Colab for details for each example 
below. All examples are maintained to work with the latest major packaged versions of the installed libraries.\n\n### PyTorch\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb)\n\n[![Open on Torch Hub](https://img.shields.io/badge/Torch-Hub-red?logo=pytorch\u0026style=for-the-badge)](https://pytorch.org/hub/snakers4_silero-models_stt/)\n\n```python\nimport torch\nimport zipfile\nimport torchaudio\nfrom glob import glob\n\ndevice = torch.device('cpu')  # gpu also works, but our models are fast enough for CPU\nmodel, decoder, utils = torch.hub.load(repo_or_dir='snakers4/silero-models',\n                                       model='silero_stt',\n                                       language='en', # also available 'de', 'es'\n                                       device=device)\n(read_batch, split_into_batches,\n read_audio, prepare_model_input) = utils  # see function signature for details\n\n# download a single file in any format compatible with TorchAudio\ntorch.hub.download_url_to_file('https://opus-codec.org/static/examples/samples/speech_orig.wav',\n                               dst ='speech_orig.wav', progress=True)\ntest_files = glob('speech_orig.wav')\nbatches = split_into_batches(test_files, batch_size=10)\ninput = prepare_model_input(read_batch(batches[0]),\n                            device=device)\n\noutput = model(input)\nfor example in output:\n    print(decoder(example.cpu()))\n```\n\n### ONNX\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb)\n\nOur model will run anywhere that can import the ONNX model or that supports the ONNX runtime.\n\n```python\nimport onnx\nimport torch\nimport onnxruntime\nfrom omegaconf import OmegaConf\n\nlanguage = 'en' # also available 'de', 'es'\n\n# load provided utils\n_, 
decoder, utils = torch.hub.load(repo_or_dir='snakers4/silero-models', model='silero_stt', language=language)\n(read_batch, split_into_batches,\n read_audio, prepare_model_input) = utils\n\n# see available models\ntorch.hub.download_url_to_file('https://raw.githubusercontent.com/snakers4/silero-models/master/models.yml', 'models.yml')\nmodels = OmegaConf.load('models.yml')\navailable_languages = list(models.stt_models.keys())\nassert language in available_languages\n\n# load the actual ONNX model for the selected language\ntorch.hub.download_url_to_file(models.stt_models[language].latest.onnx, 'model.onnx', progress=True)\nonnx_model = onnx.load('model.onnx')\nonnx.checker.check_model(onnx_model)\nort_session = onnxruntime.InferenceSession('model.onnx')\n\n# download a single file in any format compatible with TorchAudio\ntorch.hub.download_url_to_file('https://opus-codec.org/static/examples/samples/speech_orig.wav', dst='speech_orig.wav', progress=True)\ntest_files = ['speech_orig.wav']\nbatches = split_into_batches(test_files, batch_size=10)\ninput = prepare_model_input(read_batch(batches[0]))\n\n# actual ONNX inference and decoding\nonnx_input = input.detach().cpu().numpy()\nort_inputs = {'input': onnx_input}\nort_outs = ort_session.run(None, ort_inputs)\ndecoded = decoder(torch.Tensor(ort_outs[0])[0])\nprint(decoded)\n```\n\n### TensorFlow\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb)\n\n**SavedModel example**\n\n```python\nimport os\nimport torch\nimport subprocess\nimport tensorflow as tf\nimport tensorflow_hub as tf_hub\nfrom omegaconf import OmegaConf\n\nlanguage = 'en' # also available 'de', 'es'\n\n# load provided utils using torch.hub for brevity\n_, decoder, utils = torch.hub.load(repo_or_dir='snakers4/silero-models', model='silero_stt', language=language)\n(read_batch, split_into_batches,\n read_audio, prepare_model_input) = utils\n\n# see available 
models\ntorch.hub.download_url_to_file('https://raw.githubusercontent.com/snakers4/silero-models/master/models.yml', 'models.yml')\nmodels = OmegaConf.load('models.yml')\navailable_languages = list(models.stt_models.keys())\nassert language in available_languages\n\n# load the actual tf model\ntorch.hub.download_url_to_file(models.stt_models.en.latest.tf, 'tf_model.tar.gz')\nsubprocess.run('rm -rf tf_model \u0026\u0026 mkdir tf_model \u0026\u0026 tar xzfv tf_model.tar.gz -C tf_model',  shell=True, check=True)\ntf_model = tf.saved_model.load('tf_model')\n\n# download a single file in any format compatible with TorchAudio\ntorch.hub.download_url_to_file('https://opus-codec.org/static/examples/samples/speech_orig.wav', dst ='speech_orig.wav', progress=True)\ntest_files = ['speech_orig.wav']\nbatches = split_into_batches(test_files, batch_size=10)\ninput = prepare_model_input(read_batch(batches[0]))\n\n# tf inference\nres = tf_model.signatures[\"serving_default\"](tf.constant(input.numpy()))['output_0']\nprint(decoder(torch.Tensor(res.numpy())[0]))\n```\n\n## Text-To-Speech\n\n### Models and Speakers\n\nAll of the provided models are listed in the [models.yml](https://github.com/snakers4/silero-models/blob/master/models.yml) file. Any metadata and newer versions will be added there.\n\n#### V4\n\nV4 models support [SSML](https://github.com/snakers4/silero-models/wiki/SSML). 
Also see Colab examples for main SSML tag usage.\n\n| ID       | Speakers |Auto-stress | Language                           | SR              | Colab                                                                                                                                                                        |\n| ------------- | ----------- | ----------- |---------------------------------- | --------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `v4_ru`    | `aidar`, `baya`, `kseniya`, `xenia`, `eugene`, `random` | yes  | `ru` (Russian)   | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n| [`v4_cyrillic`](#cyrillic-languages)   | `b_ava`, `marat_tt`, `kalmyk_erdni`...             | no   | `cyrillic` [(Avar, Tatar, Kalmyk, ...)](#cyrillic-languages)   | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n| `v4_ua`    | `mykyta`, `random`                                        | no   | `ua` (Ukrainian) | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n| `v4_uz`    | `dilnavoz`                                                | no   | `uz` (Uzbek)     | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n| [`v4_indic`](#indic-languages)   | `hindi_male`, `hindi_female`, ..., `random`             | no   | `indic` [(Hindi, 
Telugu, ...)](#indic-languages)   | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n\n#### V3\n\nV3 models support [SSML](https://github.com/snakers4/silero-models/wiki/SSML). Also see Colab examples for main SSML tag usage.\n\n| ID       | Speakers |Auto-stress | Language                           | SR              | Colab                                                                                                                                                                        |\n| ------------- | ----------- | ----------- |---------------------------------- | --------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `v3_en`    | `en_0`, `en_1`, ..., `en_117`, `random`                   | no   | `en` (English)   | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n| `v3_en_indic`   | `tamil_female`, ..., `assamese_male`, `random`       | no   | `en` (English)   | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n| `v3_de`    | `eva_k`, ..., `karlsson`, `random`                        | no   | `de` (German)    | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n| `v3_es`    | `es_0`, `es_1`, `es_2`, `random`                          | no   | `es` (Spanish)   | `8000`, `24000`, `48000` | [![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n| `v3_fr`    | `fr_0`, ..., `fr_5`, `random`                             | no   | `fr` (French)    | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n| [`v3_indic`](#indic-languages)   | `hindi_male`, `hindi_female`, ..., `random`             | no   | `indic` [(Hindi, Telugu, ...)](#indic-languages)   | `8000`, `24000`, `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |\n\n### Dependencies\n\nBasic dependencies for Colab examples:\n\n- `torch`, 1.10+ for v3 models/ 2.0+ for v4 models;\n- `torchaudio`, latest version bound to PyTorch should work (required only because models are hosted together with STT, not required for work);\n- `omegaconf`,  latest (can be removed as well, if you do not load all of the configs);\n\n### PyTorch\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb)\n\n[![Open on Torch Hub](https://img.shields.io/badge/Torch-Hub-red?logo=pytorch\u0026style=for-the-badge)](https://pytorch.org/hub/snakers4_silero-models_tts/)\n\n```python\n# V4\nimport torch\n\nlanguage = 'ru'\nmodel_id = 'v4_ru'\nsample_rate = 48000\nspeaker = 'xenia'\ndevice = torch.device('cpu')\n\nmodel, example_text = torch.hub.load(repo_or_dir='snakers4/silero-models',\n                                     model='silero_tts',\n                                     language=language,\n                                     speaker=model_id)\nmodel.to(device)  # gpu or cpu\n\naudio = model.apply_tts(text=example_text,\n 
                       speaker=speaker,\n                        sample_rate=sample_rate)\n```\n\n### Standalone Use\n\n- Standalone usage only requires PyTorch 1.10+ and the Python Standard Library;\n- Please see the detailed examples in Colab;\n\n```python\n# V4\nimport os\nimport torch\n\ndevice = torch.device('cpu')\ntorch.set_num_threads(4)\nlocal_file = 'model.pt'\n\nif not os.path.isfile(local_file):\n    torch.hub.download_url_to_file('https://models.silero.ai/models/tts/ru/v4_ru.pt',\n                                   local_file)  \n\nmodel = torch.package.PackageImporter(local_file).load_pickle(\"tts_models\", \"model\")\nmodel.to(device)\n\nexample_text = 'В недрах тундры выдры в г+етрах т+ырят в вёдра ядра кедров.'\nsample_rate = 48000\nspeaker='baya'\n\naudio_paths = model.save_wav(text=example_text,\n                             speaker=speaker,\n                             sample_rate=sample_rate)\n```\n\n### SSML\n\nCheck out our [TTS Wiki page.](https://github.com/snakers4/silero-models/wiki/SSML)\n\n### Cyrillic languages\n\nSupported tokenset:\n`!,-.:?iµöабвгдежзийклмнопрстуфхцчшщъыьэюяёђѓєіјњћќўѳғҕҗҙқҡңҥҫүұҳҷһӏӑӓӕӗәӝӟӥӧөӱӳӵӹ `\n\n| Speaker_ID   | Language        | Gender |\n| ------------ | --------------- | ------ |\n| b_ava        | Avar            | F      |\n| b_bashkir    | Bashkir         | M      |\n| b_bulb       | Bulgarian       | M      |\n| b_bulc       | Bulgarian       | M      |\n| b_che        | Chechen         | M      |\n| b_cv         | Chuvash         | M      |\n| cv_ekaterina | Chuvash         | F      |\n| b_myv        | Erzya           | M      |\n| b_kalmyk     | Kalmyk          | M      |\n| b_krc        | Karachay-Balkar | M      |\n| kz_M1        | Kazakh          | M      |\n| kz_M2        | Kazakh          | M      |\n| kz_F3        | Kazakh          | F      |\n| kz_F1        | Kazakh          | F      |\n| kz_F2        | Kazakh          | F      |\n| b_kjh        | Khakas          | F      |\n| b_kpv        | 
Komi-Zyrian     | M      |\n| b_lez        | Lezghian        | M      |\n| b_mhr        | Mari            | F      |\n| b_mrj        | Mari High       | M      |\n| b_nog        | Nogai           | F      |\n| b_oss        | Ossetic         | M      |\n| b_ru         | Russian         | M      |\n| b_tat        | Tatar           | M      |\n| marat_tt     | Tatar           | M      |\n| b_tyv        | Tuvinian        | M      |\n| b_udm        | Udmurt          | M      |\n| b_uzb        | Uzbek           | M      |\n| b_sah        | Yakut           | M      |\n| kalmyk_erdni | Kalmyk          | M      |\n| kalmyk_delghir | Kalmyk        | F      |\n\n### Indic languages\n\n#### Example\n\n(!!!) All input sentences should be romanized to ISO format using [`aksharamukha`](https://aksharamukha.appspot.com/python). An example for `hindi`:\n\n```python\n# V4\nimport torch\nfrom aksharamukha import transliterate\n\n# Loading model\nmodel, example_text = torch.hub.load(repo_or_dir='snakers4/silero-models',\n                                     model='silero_tts',\n                                     language='indic',\n                                     speaker='v4_indic')\n\norig_text = \"प्रसिद्द कबीर अध्येता, पुरुषोत्तम अग्रवाल का यह शोध आलेख, उस रामानंद की खोज करता है\"\nroman_text = transliterate.process('Devanagari', 'ISO', orig_text)\nprint(roman_text)\n\naudio = model.apply_tts(roman_text,\n                        speaker='hindi_male')\n```\n\n#### Supported languages\n\n| Language | Speakers | Romanization function\n-- | -- | --\nhindi      | `hindi_female`, `hindi_male`             | `transliterate.process('Devanagari', 'ISO', orig_text)`\nmalayalam  | `malayalam_female`, `malayalam_male`     |`transliterate.process('Malayalam', 'ISO', orig_text)`\nmanipuri   | `manipuri_female`                        |`transliterate.process('Bengali', 'ISO', orig_text)`\nbengali    | `bengali_female`, `bengali_male`         | `transliterate.process('Bengali', 'ISO', 
orig_text)`\nrajasthani | `rajasthani_female`, `rajasthani_male`   | `transliterate.process('Devanagari', 'ISO', orig_text)`\ntamil      | `tamil_female`, `tamil_male`             |`transliterate.process('Tamil', 'ISO', orig_text, pre_options=['TamilTranscribe'])`\ntelugu     | `telugu_female`, `telugu_male`           | `transliterate.process('Telugu', 'ISO', orig_text)`\ngujarati   | `gujarati_female`, `gujarati_male`       | `transliterate.process('Gujarati', 'ISO', orig_text)`\nkannada    | `kannada_female`, `kannada_male`         |`transliterate.process('Kannada', 'ISO', orig_text)`\n\n## Text-Enhancement\n\n| Languages | Quantization  | Quality | Colab |\n| --------- | ------------- | ------- | ----- |\n| 'en', 'de', 'ru', 'es' | :heavy_check_mark: | [link](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks#te-models) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_te.ipynb) |\n\n### Dependencies\n\nBasic dependencies for Colab examples:\n\n- `torch`, 1.9+;\n- `pyyaml` (installed together with torch itself);\n\n### Standalone Use\n\n- Standalone usage only requires PyTorch 1.9+ and the Python Standard Library;\n- Please see the detailed examples in [Colab](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_te.ipynb);\n\n```python\nimport torch\n\nmodel, example_texts, languages, punct, apply_te = torch.hub.load(repo_or_dir='snakers4/silero-models',\n                                                                  model='silero_te')\n\ninput_text = input('Enter input text\\n')\n# print the repunctuated and recapitalized text\nprint(apply_te(input_text, lan='en'))\n```\n\n## Denoise\n\nDenoise models attempt to reduce background noise along with various artefacts such as reverb, clipping, and high-pass/low-pass filtering, while trying to preserve and/or enhance speech. 
They also attempt to enhance audio quality and increase sampling rate of the input up to 48kHz.\n\n### Models\n\nAll of the provided models are listed in the [models.yml](https://github.com/snakers4/silero-models/blob/master/models.yml) file.\n\n| Model | JIT | Real Input SR | Input SR | Output SR | Colab |\n| ----- | --- | ------------- | -------- | --------- | ----- |\n| `small_slow` | :heavy_check_mark: | `8000`, `16000`, `24000`, `44100`, `48000`  | `24000` | `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_denoise.ipynb) |\n| `large_fast` | :heavy_check_mark: | `8000`, `16000`, `24000`, `44100`, `48000`  | `24000` | `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_denoise.ipynb) |\n| `small_fast` | :heavy_check_mark: | `8000`, `16000`, `24000`, `44100`, `48000`  | `24000` | `48000` | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_denoise.ipynb) |\n\n### Dependencies\n\nBasic dependencies for Colab examples:\n\n- `torch`, 2.0+;\n- `torchaudio`, latest version bound to PyTorch should work;\n- `omegaconf`,  latest (can be removed as well, if you do not load all of the configs).\n\n### PyTorch\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_denoise.ipynb)\n\n```python\n\nimport torch\n\nname = 'small_slow'\ndevice = torch.device('cpu')\nmodel, samples, utils = torch.hub.load(\n  repo_or_dir='snakers4/silero-models',\n  model='silero_denoise',\n  name=name,\n  device=device)\n(read_audio, save_audio, denoise) = utils\n\ni = 0\ntorch.hub.download_url_to_file(\n  samples[i],\n  
dst=f'sample{i}.wav',\n  progress=True\n)\naudio_path = f'sample{i}.wav'\naudio = read_audio(audio_path).to(device)\noutput = model(audio)\nsave_audio(f'result{i}.wav', output.squeeze(1).cpu())\n\ni = 1\ntorch.hub.download_url_to_file(\n  samples[i],\n  dst=f'sample{i}.wav',\n  progress=True\n)\noutput, sr = denoise(model, f'sample{i}.wav', f'result{i}.wav', device='cpu')\n```\n\n### Standalone Use\n\n```python\nimport os\nimport torch\n\ndevice = torch.device('cpu')\ntorch.set_num_threads(4)\nlocal_file = 'model.pt'\n\nif not os.path.isfile(local_file):\n    torch.hub.download_url_to_file('https://models.silero.ai/denoise_models/sns_latest.jit',\n                                   local_file)  \n\nmodel = torch.jit.load(local_file)\ntorch._C._jit_set_profiling_mode(False) \ntorch.set_grad_enabled(False)\nmodel.to(device)\n\na = torch.rand((1, 48000))\na = a.to(device)\nout = model(a)\n```\n\n## FAQ\n\n### Wiki\n\nAlso check out our [wiki](https://github.com/snakers4/silero-models/wiki).\n\n### Performance and Quality\n\nPlease refer to these wiki sections:\n\n- [Quality Benchmarks](https://github.com/snakers4/silero-models/wiki/Quality-Benchmarks)\n- [Performance Benchmarks](https://github.com/snakers4/silero-models/wiki/Performance-Benchmarks)\n\n### Adding new Languages\n\nPlease refer [here](https://github.com/snakers4/silero-models/wiki/Adding-New-Languages).\n\n## Contact\n\n### Get in Touch\n\nTry our models, create an [issue](https://github.com/snakers4/silero-models/issues/new), join our [chat](https://t.me/silero_speech), [email](mailto:hello@silero.ai) us, and read the latest [news](https://t.me/silero_news).\n\n### Commercial Inquiries\n\nPlease refer to our [wiki](https://github.com/snakers4/silero-models/wiki) and the [Licensing and Tiers](https://github.com/snakers4/silero-models/wiki/Licensing-and-Tiers) page for relevant information, and [email](mailto:hello@silero.ai) us.\n\n## Citations\n\n```bibtex\n@misc{Silero Models,\n  author = {Silero 
Team},\n  title = {Silero Models: pre-trained enterprise-grade STT / TTS models and benchmarks},\n  year = {2021},\n  publisher = {GitHub},\n  journal = {GitHub repository},\n  howpublished = {\\url{https://github.com/snakers4/silero-models}},\n  commit = {insert_some_commit_here},\n  email = {hello@silero.ai}\n}\n```\n\n## Further reading\n\n### English\n\n- STT:\n  - Towards an Imagenet Moment For Speech-To-Text - [link](https://thegradient.pub/towards-an-imagenet-moment-for-speech-to-text/)\n  - A Speech-To-Text Practitioners Criticisms of Industry and Academia - [link](https://thegradient.pub/a-speech-to-text-practitioners-criticisms-of-industry-and-academia/)\n  - Modern Google-level STT Models Released - [link](https://habr.com/ru/post/519562/)\n\n- TTS:\n  - Multilingual Text-to-Speech Models for Indic Languages - [link](https://www.analyticsvidhya.com/blog/2022/06/multilingual-text-to-speech-models-for-indic-languages/)\n  - Our new public speech synthesis in super-high quality, 10x faster and more stable - [link](https://habr.com/ru/post/660571/)\n  - High-Quality Text-to-Speech Made Accessible, Simple and Fast - [link](https://habr.com/ru/post/549482/)\n\n- VAD:\n  - One Voice Detector to Rule Them All - [link](https://thegradient.pub/one-voice-detector-to-rule-them-all/)\n  - Modern Portable Voice Activity Detector Released - [link](https://habr.com/ru/post/537276/)\n\n- Text Enhancement:\n  - We have published a model for text repunctuation and recapitalization for four languages - [link](https://habr.com/ru/post/581960/)\n\n### Chinese\n\n- STT:\n  - 迈向语音识别领域的 ImageNet 时刻 - [link](https://www.infoq.cn/article/4u58WcFCs0RdpoXev1E2)\n  - 语音领域学术界和工业界的七宗罪 - [link](https://www.infoq.cn/article/lEe6GCRjF1CNToVITvNw)\n\n### Russian\n\n- STT\n  - OpenAI решили распознавание речи! 
Разбираемся так ли это … - [link](https://habr.com/ru/post/689572/)\n  - Наши сервисы для бесплатного распознавания речи стали лучше и удобнее - [link](https://habr.com/ru/post/654227/)\n  - Telegram-бот Silero бесплатно переводит речь в текст - [link](https://habr.com/ru/post/591563/)\n  - Бесплатное распознавание речи для всех желающих - [link](https://habr.com/ru/post/587512/)\n  - Последние обновления моделей распознавания речи из Silero Models - [link](https://habr.com/ru/post/577630/)\n  - Сжимаем трансформеры: простые, универсальные и прикладные способы cделать их компактными и быстрыми - [link](https://habr.com/ru/post/563778/)\n  - Ультимативное сравнение систем распознавания речи: Ashmanov, Google, Sber, Silero, Tinkoff, Yandex - [link](https://habr.com/ru/post/559640/)\n  - Мы опубликовали современные STT модели сравнимые по качеству с Google - [link](https://habr.com/ru/post/519564/)\n  - Понижаем барьеры на вход в распознавание речи - [link](https://habr.com/ru/post/494006/)\n  - Огромный открытый датасет русской речи версия 1.0 - [link](https://habr.com/ru/post/474462/)\n  - Насколько Быстрой Можно Сделать Систему STT? - [link](https://habr.com/ru/post/531524/)\n  - Наша система Speech-To-Text - [link](https://www.silero.ai/tag/our-speech-to-text/)\n  - Speech-To-Text - [link](https://www.silero.ai/tag/speech-to-text/)\n\n- TTS:\n  - Теперь наш синтез также доступен в виде бота в Телеграме - [link](https://habr.com/ru/post/682188/)\n  - Может ли синтез речи обмануть систему биометрической идентификации? 
- [link](https://habr.com/ru/post/673996/)\n  - Теперь наш синтез на 20 языках - [link](https://habr.com/ru/post/669910/)\n  - Теперь наш публичный синтез в супер-высоком качестве, в 10 раз быстрее и без детских болячек - [link](https://habr.com/ru/post/660565/)\n  - Синтезируем голос бабушки, дедушки и Ленина + новости нашего публичного синтеза - [link](https://habr.com/ru/post/584750/)\n  - Мы сделали наш публичный синтез речи еще лучше - [link](https://habr.com/ru/post/563484/)\n  - Мы Опубликовали Качественный, Простой, Доступный и Быстрый Синтез Речи - [link](https://habr.com/ru/post/549480/)\n\n- VAD:\n  - Наш публичный детектор голоса стал лучше - [link](https://habr.com/ru/post/695738/)\n  - А ты используешь VAD? Что это такое и зачем он нужен - [link](https://habr.com/ru/post/594745/)\n  - Модели для Детекции Речи, Чисел и Распознавания Языков - [link](https://www.silero.ai/vad-lang-classifier-number-detector/)\n  - Мы опубликовали современный Voice Activity Detector и не только - [link](https://habr.com/ru/post/537274/)\n\n- Text Enhancement:\n  - Восстановление знаков пунктуации и заглавных букв — теперь и на длинных текстах - [link](https://habr.com/ru/post/594565/)\n  - Мы опубликовали модель, расставляющую знаки препинания и заглавные буквы в тексте на четырех языках - [link](https://habr.com/ru/post/581946/)\n\n## Donations\n\nPlease use the \"sponsor\" button.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingjethro999%2Fsilero-test","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkingjethro999%2Fsilero-test","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingjethro999%2Fsilero-test/lists"}