{"id":15033488,"url":"https://github.com/bootphon/phonemizer","last_synced_at":"2025-05-14T07:10:46.121Z","repository":{"id":7980918,"uuid":"56728069","full_name":"bootphon/phonemizer","owner":"bootphon","description":"Simple text to phones converter for multiple languages","archived":false,"fork":false,"pushed_at":"2024-09-26T10:00:05.000Z","size":3626,"stargazers_count":1376,"open_issues_count":44,"forks_count":186,"subscribers_count":23,"default_branch":"master","last_synced_at":"2025-05-14T03:15:16.670Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://bootphon.github.io/phonemizer/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bootphon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-04-20T23:43:08.000Z","updated_at":"2025-05-13T12:32:11.000Z","dependencies_parsed_at":"2024-12-31T10:07:50.930Z","dependency_job_id":"20fb5633-2131-4a44-be26-279c9bf2ca17","html_url":"https://github.com/bootphon/phonemizer","commit_stats":{"total_commits":314,"total_committers":18,"mean_commits":"17.444444444444443","dds":0.6019108280254777,"last_synced_commit":"f7ce79bbb6288a823c096df450dff23d5b6eed2f"},"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bootphon%2Fphonemizer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bootphon%2Fphonemizer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boot
phon%2Fphonemizer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bootphon%2Fphonemizer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bootphon","download_url":"https://codeload.github.com/bootphon/phonemizer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254092798,"owners_count":22013292,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-09-24T20:21:25.899Z","updated_at":"2025-05-14T07:10:41.100Z","avatar_url":"https://github.com/bootphon.png","language":"Python","readme":"|         **Tests** | [![Linux][badge-test-linux]](https://github.com/bootphon/phonemizer/actions/workflows/linux.yaml) [![MacOS][badge-test-macos]](https://github.com/bootphon/phonemizer/actions/workflows/macos.yaml) [![Windows][badge-test-windows]](https://github.com/bootphon/phonemizer/actions/workflows/windows.yaml) [![Codecov][badge-codecov]](https://codecov.io/gh/bootphon/phonemizer) |\n|------------------:| --- |\n| **Documentation** | [![Doc](https://github.com/bootphon/phonemizer/actions/workflows/doc.yaml/badge.svg)](https://bootphon.github.io/phonemizer/) |\n|       **Release** | [![GitHub release (latest SemVer)][badge-github-version]](https://github.com/bootphon/phonemizer/releases/latest) [![PyPI][badge-pypi-version]](https://pypi.python.org/pypi/phonemizer) [![downloads][badge-pypi-downloads]](https://pypi.python.org/pypi/phonemizer) |\n|      **Citation** | 
[![status][badge-joss]](https://joss.theoj.org/papers/08d1ffc14f233f56942f78f3742b266e) [![DOI][badge-zenodo]](https://doi.org/10.5281/zenodo.1045825) |\n\n---\n\n# Phonemizer -- *foʊnmaɪzɚ*\n\n* The phonemizer allows simple phonemization of words and texts in many languages.\n\n* Provides both the `phonemize` command-line tool and the Python function\n  `phonemizer.phonemize`. See [the package's documentation](https://bootphon.github.io/phonemizer/).\n\n* It is based on four backends: **espeak**, **espeak-mbrola**, **festival** and\n  **segments**. The backends have different properties and capabilities, summarized\n  in the table below. The choice of backend is left to the user.\n\n  * [espeak-ng](https://github.com/espeak-ng/espeak-ng) is a Text-to-Speech\n    software supporting a lot of languages and IPA (International Phonetic\n    Alphabet) output.\n\n  * [espeak-ng-mbrola](https://github.com/espeak-ng/espeak-ng/blob/master/docs/mbrola.md)\n    uses the SAMPA phonetic alphabet instead of IPA but does not preserve word\n    boundaries.\n\n  * [festival](http://www.cstr.ed.ac.uk/projects/festival) is another\n    Text-to-Speech engine. Its phonemizer backend currently supports only\n    American English. 
It uses a [custom phoneset][festival-phoneset], but it\n    allows tokenization at the syllable level.\n\n  * [segments](https://github.com/cldf/segments) is a Unicode tokenizer that\n    builds a phonemization from a grapheme-to-phoneme mapping provided as a file\n    by the user.\n\n  |                              | espeak                   | espeak-mbrola           | festival                    | segments           |\n  | ---:                         | ---                      | ---                     | ---                         | ---                |\n  | **phone set**                | [IPA]                    | [SAMPA]                 | [custom][festival-phoneset] | user defined       |\n  | **supported languages**      | [100+][espeak-languages] | [35][mbrola-languages]  | US English                  | user defined       |\n  | **processing speed**         | fast                     | slow                    | very slow                   | fast               |\n  | **phone tokens**             | :heavy_check_mark:       | :heavy_check_mark:      | :heavy_check_mark:          | :heavy_check_mark: |\n  | **syllable tokens**          | :x:                      | :x:                     | :heavy_check_mark:          | :x:                |\n  | **word tokens**              | :heavy_check_mark:       | :x:                     | :heavy_check_mark:          | :heavy_check_mark: |\n  | **punctuation preservation** | :heavy_check_mark:       | :x:                     | :heavy_check_mark:          | :heavy_check_mark: |\n  | **stressed phones**          | :heavy_check_mark:       | :x:                     | :x:                         | :x:                |\n  | [**tie**][tie-IPA]           | :heavy_check_mark:       | :x:                     | :x:                         | :x:                |\n\n\n\n## Citation\n\nTo reference the `phonemizer` in your own work, please cite the following 
[JOSS\npaper](https://joss.theoj.org/papers/10.21105/joss.03958).\n\n```bibtex\n@article{Bernard2021,\n  doi = {10.21105/joss.03958},\n  url = {https://doi.org/10.21105/joss.03958},\n  year = {2021},\n  publisher = {The Open Journal},\n  volume = {6},\n  number = {68},\n  pages = {3958},\n  author = {Mathieu Bernard and Hadrien Titeux},\n  title = {Phonemizer: Text to Phones Transcription for Multiple Languages in Python},\n  journal = {Journal of Open Source Software}\n}\n```\n\n\n## Licence\n\n**Copyright 2015-2021 Mathieu Bernard**\n\nThis program is free software: you can redistribute it and/or modify\nit under the terms of the GNU General Public License as published by\nthe Free Software Foundation, either version 3 of the License, or\n(at your option) any later version.\n\nThis program is distributed in the hope that it will be useful,\nbut WITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\nGNU General Public License for more details.\n\nYou should have received a copy of the GNU General Public License\nalong with this program. 
If not, see \u003chttp://www.gnu.org/licenses/\u003e.\n\n\n[badge-test-linux]: https://github.com/bootphon/phonemizer/actions/workflows/linux.yaml/badge.svg?branch=master\n[badge-test-macos]: https://github.com/bootphon/phonemizer/actions/workflows/macos.yaml/badge.svg?branch=master\n[badge-test-windows]: https://github.com/bootphon/phonemizer/actions/workflows/windows.yaml/badge.svg?branch=master\n[badge-codecov]: https://img.shields.io/codecov/c/github/bootphon/phonemizer\n[badge-github-version]: https://img.shields.io/github/v/release/bootphon/phonemizer\n[badge-pypi-version]: https://img.shields.io/pypi/v/phonemizer\n[badge-pypi-downloads]: https://img.shields.io/pypi/dm/phonemizer\n[badge-joss]: https://joss.theoj.org/papers/08d1ffc14f233f56942f78f3742b266e/status.svg\n[badge-zenodo]: https://zenodo.org/badge/56728069.svg\n[phonemizer-1.0]: https://github.com/bootphon/phonemizer/releases/tag/v1.0\n[festival-phoneset]: http://www.festvox.org/bsv/c4711.html\n[IPA]: https://en.wikipedia.org/wiki/International_Phonetic_Alphabet\n[SAMPA]: https://en.wikipedia.org/wiki/SAMPA\n[phonemize-function]: https://github.com/bootphon/phonemizer/blob/c5e2f3878d6db391ec7253173f44e4a85cfe41e3/phonemizer/phonemize.py#L33-L156\n[tie-IPA]: https://en.wikipedia.org/wiki/Tie_(typography)#International_Phonetic_Alphabet\n[espeak-languages]: https://github.com/espeak-ng/espeak-ng/blob/master/docs/languages.md\n[mbrola-languages]: https://github.com/numediart/MBROLA-voices\n","funding_links":[],"categories":["Natural Language Processing","**Tools, Libraries, Models**","Python"],"sub_categories":["Others","NLP","Transliteration"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbootphon%2Fphonemizer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbootphon%2Fphonemizer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbootphon%2Fphonemizer/lists"}