{"id":13441018,"url":"https://github.com/mozillazg/python-pinyin","last_synced_at":"2025-05-13T15:11:16.627Z","repository":{"id":37734516,"uuid":"12830126","full_name":"mozillazg/python-pinyin","owner":"mozillazg","description":"汉字转拼音(pypinyin)","archived":false,"fork":false,"pushed_at":"2025-03-30T12:15:26.000Z","size":7527,"stargazers_count":5044,"open_issues_count":30,"forks_count":624,"subscribers_count":98,"default_branch":"master","last_synced_at":"2025-04-23T18:56:13.024Z","etag":null,"topics":["chinese","hanzi","hanzi-pinyin","pinyin","pypinyin","python","python2","python3"],"latest_commit_sha":null,"homepage":"https://pypinyin.readthedocs.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mozillazg.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGELOG.rst","contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"custom":["https://mozillazg.com/wechat_donate.jpeg"],"patreon":"mozillazg"}},"created_at":"2013-09-14T14:01:40.000Z","updated_at":"2025-04-23T08:19:58.000Z","dependencies_parsed_at":"2023-02-09T21:30:34.405Z","dependency_job_id":"495f3065-f554-4743-8a09-b31c0f244804","html_url":"https://github.com/mozillazg/python-pinyin","commit_stats":{"total_commits":655,"total_committers":27,"mean_commits":24.25925925925926,"dds":0.4885496183206107,"last_synced_commit":"e42dede51abbc40e225da9a8ec8e5bd0043eed21"},"previous_names":[],"tags_count":97,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mozillazg%2Fpython-pinyin","tags_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/mozillazg%2Fpython-pinyin/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mozillazg%2Fpython-pinyin/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mozillazg%2Fpython-pinyin/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mozillazg","download_url":"https://codeload.github.com/mozillazg/python-pinyin/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253969259,"owners_count":21992263,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chinese","hanzi","hanzi-pinyin","pinyin","pypinyin","python","python2","python3"],"created_at":"2024-07-31T03:01:28.949Z","updated_at":"2025-05-13T15:11:11.607Z","avatar_url":"https://github.com/mozillazg.png","language":"Python","readme":"Chinese character to pinyin converter (Python version)\n========================================================\n\n|Build| |GitHubAction| |Coverage| |PyPI version| |PyPI downloads| |DOI|\n\n\nConverts Chinese characters to pinyin. Useful for phonetic annotation, sorting, and searching of Chinese characters (`Russian translation`_).\n\nThe initial version of the code was based on the implementation of `hotoo/pinyin \u003chttps://github.com/hotoo/pinyin\u003e`__.\n\n* Documentation: https://pypinyin.readthedocs.io/\n* GitHub: https://github.com/mozillazg/python-pinyin\n* License: MIT license\n* PyPI: https://pypi.org/project/pypinyin\n* Python version: 2.7, pypy, pypy3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10, 3.11, 3.12, 3.13\n\n.. contents::\n\n\nFeatures\n--------\n\n* Picks the most appropriate pinyin by matching against phrases.\n* Supports heteronyms (characters with multiple readings).\n* Basic support for Traditional Chinese, Bopomofo (Zhuyin), and Wade-Giles romanization.\n* Supports many different pinyin and Bopomofo output styles.\n\n\nInstallation\n------------\n\n.. code-block:: bash\n\n    pip install pypinyin\n\n\nUsage\n-----\n\n.. 
code-block:: python\n\n    \u003e\u003e\u003e from pypinyin import pinyin, lazy_pinyin, Style\n    \u003e\u003e\u003e pinyin('中心')  # or pinyin(['中心']); a list argument means the input is already segmented into words\n    [['zhōng'], ['xīn']]\n    \u003e\u003e\u003e pinyin('中心', heteronym=True)  # enable heteronym mode\n    [['zhōng', 'zhòng'], ['xīn']]\n    \u003e\u003e\u003e pinyin('中心', style=Style.FIRST_LETTER)  # set the pinyin style\n    [['z'], ['x']]\n    \u003e\u003e\u003e pinyin('中心', style=Style.TONE2, heteronym=True)\n    [['zho1ng', 'zho4ng'], ['xi1n']]\n    \u003e\u003e\u003e pinyin('中心', style=Style.TONE3, heteronym=True)\n    [['zhong1', 'zhong4'], ['xin1']]\n    \u003e\u003e\u003e pinyin('中心', style=Style.BOPOMOFO)  # Bopomofo style\n    [['ㄓㄨㄥ'], ['ㄒㄧㄣ']]\n    \u003e\u003e\u003e lazy_pinyin('威妥玛拼音', style=Style.WADEGILES)\n    ['wei', \"t'o\", 'ma', \"p'in\", 'yin']\n    \u003e\u003e\u003e lazy_pinyin('中心')  # ignore heteronyms\n    ['zhong', 'xin']\n    \u003e\u003e\u003e lazy_pinyin('战略', v_to_u=True)  # use ü instead of v\n    ['zhan', 'lüe']\n    # mark the neutral tone with 5\n    \u003e\u003e\u003e lazy_pinyin('衣裳', style=Style.TONE3, neutral_tone_with_five=True)\n    ['yi1', 'shang5']\n    # tone sandhi: nǐ hǎo -\u003e ní hǎo\n    \u003e\u003e\u003e lazy_pinyin('你好', style=Style.TONE2, tone_sandhi=True)\n    ['ni2', 'ha3o']\n\n**Notes**:\n\n* By default, the result does not indicate which finals carry the neutral tone: neutral-tone finals have no tone mark or digit (pass ``neutral_tone_with_five=True`` to mark the neutral tone with ``5``).\n* By default, styles without tone information use ``v`` to represent ``ü`` (pass ``v_to_u=True`` to output ``ü`` instead of ``v``).\n* By default, characters that have no pinyin are output unchanged (see the `documentation \u003chttps://pypinyin.readthedocs.io/zh_CN/master/usage.html#handle-no-pinyin\u003e`__ for how to customize the handling of such characters).\n* The pinyin of ``嗯`` is not ``en`` as most people assume, and some pinyin have neither an initial nor a final; see the FAQ below for details.\n\nCommand-line tools:\n\n.. 
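code-block:: console\n\n    $ pypinyin 音乐\n    yīn yuè\n\n    $ python -m pypinyin.tools.toneconvert to-tone 'zhong4 xin1'\n    zhòng xīn\n\n\nDocumentation\n-------------\n\nSee https://pypinyin.readthedocs.io/ for the full documentation.\n\nFor questions about developing the project itself, see the `development documentation \u003chttps://pypinyin.readthedocs.io/zh_CN/develop/develop.html\u003e`__.\n\n\nFAQ\n---------\n\nIncorrect pinyin?\n++++++++++++++++++++++++++++++\n\nThe following approaches can improve pinyin accuracy:\n\n* You can correct results by loading a custom phrase dictionary or single-character dictionary;\n  see the `documentation \u003chttps://pypinyin.readthedocs.io/zh_CN/master/usage.html#custom-dict\u003e`__ for details.\n\n.. code-block:: python\n\n    \u003e\u003e\u003e from pypinyin import load_phrases_dict, load_single_dict\n\n    \u003e\u003e\u003e load_phrases_dict({'桔子': [['jú'], ['zǐ']]})  # add the phrase \"桔子\"\n\n    \u003e\u003e\u003e load_single_dict({ord('还'): 'hái,huán'})  # reorder or override the default pinyin of \"还\"\n\n* You can also use the custom pinyin dictionaries provided by the `pypinyin-dict \u003chttps://github.com/mozillazg/pypinyin-dict\u003e`__ project to correct results.\n\n.. code-block:: python\n\n    # Improve results using the pinyin data in cc_cedict.txt from the phrase-pinyin-data project\n    \u003e\u003e\u003e from pypinyin_dict.phrase_pinyin_data import cc_cedict\n    \u003e\u003e\u003e cc_cedict.load()\n\n    # Improve results using the pinyin data in kXHC1983.txt from the pinyin-data project\n    \u003e\u003e\u003e from pypinyin_dict.pinyin_data import kxhc1983\n    \u003e\u003e\u003e kxhc1983.load()\n\n* If the incorrect pinyin is caused by word segmentation, segment the text with another word segmentation module first,\n  then pass the resulting list of words to the function:\n\n.. code-block:: python\n\n    \u003e\u003e\u003e import jieba\n    \u003e\u003e\u003e from pypinyin import pinyin\n    \u003e\u003e\u003e # Segment with another module such as jieba,\n    \u003e\u003e\u003e # or apply another segmentation algorithm to the phrase data in phrases_dict.py\n    \u003e\u003e\u003e words = list(jieba.cut('每股24.67美元的确定性协议'))\n    \u003e\u003e\u003e pinyin(words)\n\n* If you would like to improve accuracy with a trained model, take a look at the `pypinyin-g2pW \u003chttps://github.com/mozillazg/pypinyin-g2pW\u003e`__ project.\n\n\nWhy are y, w, and yu not initials?\n++++++++++++++++++++++++++++++++++++++++++++\n\n.. 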
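code-block:: python\n\n    \u003e\u003e\u003e from pypinyin import Style, pinyin\n    \u003e\u003e\u003e pinyin('下雨天', style=Style.INITIALS)\n    [['x'], [''], ['t']]\n\nBecause according to the `Scheme for the Chinese Phonetic Alphabet (汉语拼音方案) \u003chttp://www.moe.gov.cn/jyb_sjzl/ziliao/A19/195802/t19580201_186000.html\u003e`__,\ny, w, and ü (yu) are not initials.\n\n    In the initials style (INITIALS), characters such as “雨”, “我”, and “圆” return an empty string, because according to the\n    `Scheme for the Chinese Phonetic Alphabet (汉语拼音方案) \u003chttp://www.moe.gov.cn/jyb_sjzl/ziliao/A19/195802/t19580201_186000.html\u003e`__,\n    y, w, and ü (yu) are not initials; y or w is only added when certain finals have no initial, and ü has its own specific rules.    -- @hotoo\n\n    **If this causes you trouble, also watch out for characters that have no initial at all (such as “啊”, “饿”, “按”, and “昂”).\n    In that case you may want the first-letter style (FIRST_LETTER) instead**.    -- @hotoo\n\n    References: `hotoo/pinyin#57 \u003chttps://github.com/hotoo/pinyin/issues/57\u003e`__,\n    `#22 \u003chttps://github.com/mozillazg/python-pinyin/pull/22\u003e`__,\n    `#27 \u003chttps://github.com/mozillazg/python-pinyin/issues/27\u003e`__,\n    `#44 \u003chttps://github.com/mozillazg/python-pinyin/issues/44\u003e`__\n\nIf this is not the behavior you want and you would like y to be treated as an initial, pass ``strict=False``,\nwhich may match your expectations:\n\n.. code-block:: python\n\n    \u003e\u003e\u003e from pypinyin import Style, pinyin\n    \u003e\u003e\u003e pinyin('下雨天', style=Style.INITIALS)\n    [['x'], [''], ['t']]\n    \u003e\u003e\u003e pinyin('下雨天', style=Style.INITIALS, strict=False)\n    [['x'], ['y'], ['t']]\n\nSee `the effect of the strict parameter \u003chttps://pypinyin.readthedocs.io/zh_CN/master/usage.html#strict\u003e`__ for details.\n\nPinyin with neither an initial nor a final?\n++++++++++++++++++++++++++++++++++++++++++++\n\nYes. In ``strict=True`` mode a very small number of pinyin have neither an initial nor a final,\nfor example the following (from the characters ``嗯``, ``呒``, ``呣``, and ``唔``)::\n\n    ń ńg ňg ǹg ň ǹ m̄ ḿ m̀\n\nNote in particular that every pinyin of ``嗯`` has neither an initial nor a final, and the default pinyin of ``呣`` has neither an initial nor a final.\nSee `#109`_, `#259`_, and `#284`_ for details.\n\n\nHow to convert pinyin between styles?\n++++++++++++++++++++++++++++++++++++++++++++\n\nThe helper functions in the ``pypinyin.contrib.tone_convert`` module convert standard pinyin into other styles,\nfor example converting ``zhōng`` to ``zhong``, or extracting the initial or final of a pinyin:\n\n.. 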
code-block:: python\n\n    \u003e\u003e\u003e from pypinyin.contrib.tone_convert import to_normal, to_tone, to_initials, to_finals\n    \u003e\u003e\u003e to_normal('zhōng')\n    'zhong'\n    \u003e\u003e\u003e to_tone('zhong1')\n    'zhōng'\n    \u003e\u003e\u003e to_initials('zhōng')\n    'zh'\n    \u003e\u003e\u003e to_finals('zhōng')\n    'ong'\n\nFor more pinyin conversion helpers, see the ``pypinyin.contrib.tone_convert`` module\n`documentation \u003chttps://pypinyin.readthedocs.io/zh_CN/master/contrib.html#tone-convert\u003e`__.\n\n\nHow to reduce memory usage?\n++++++++++++++++++++++++++++++\n\nIf pinyin accuracy is not a major concern, you can save memory by setting the environment variables ``PYPINYIN_NO_PHRASES``\nand ``PYPINYIN_NO_DICT_COPY``.\nSee the `documentation \u003chttps://pypinyin.readthedocs.io/zh_CN/master/faq.html#no-phrases\u003e`__ for details.\n\n\nFor more FAQ entries, see the\n`FAQ \u003chttps://pypinyin.readthedocs.io/zh_CN/master/faq.html\u003e`__ section of the documentation.\n\n\n.. _#13 : https://github.com/mozillazg/python-pinyin/issues/113\n.. _strict 参数的影响: https://pypinyin.readthedocs.io/zh_CN/master/usage.html#strict\n\n\nPinyin data\n-----------\n\n* Pinyin for single characters comes from `pinyin-data`_\n* Pinyin for phrases comes from `phrase-pinyin-data`_\n* Initials and finals follow the `Scheme for the Chinese Phonetic Alphabet (汉语拼音方案) \u003chttp://www.moe.gov.cn/jyb_sjzl/ziliao/A19/195802/t19580201_186000.html\u003e`__\n\n\nRelated Projects\n-----------------\n\n* `hotoo/pinyin`__: Chinese character to pinyin converter, Node.js/JavaScript version.\n* `mozillazg/go-pinyin`__: Chinese character to pinyin converter, Go version.\n* `mozillazg/rust-pinyin`__: Chinese character to pinyin converter, Rust version.\n* `wolfgitpr/cpp-pinyin`__: Chinese character to pinyin converter, C++ version.\n* `wolfgitpr/csharp-pinyin`__: Chinese character to pinyin converter, C# version.\n\n\n__ https://github.com/hotoo/pinyin\n__ https://github.com/mozillazg/go-pinyin\n__ https://github.com/mozillazg/rust-pinyin\n__ https://github.com/wolfgitpr/cpp-pinyin\n__ https://github.com/wolfgitpr/csharp-pinyin\n\n\n.. |Build| image:: https://img.shields.io/circleci/project/github/mozillazg/python-pinyin/master.svg\n   :target: https://circleci.com/gh/mozillazg/python-pinyin\n.. |GitHubAction| image:: https://github.com/mozillazg/python-pinyin/workflows/CI/badge.svg\n   :target: https://github.com/mozillazg/python-pinyin/actions\n.. 
|Coverage| image:: https://img.shields.io/coveralls/github/mozillazg/python-pinyin/master.svg\n   :target: https://coveralls.io/github/mozillazg/python-pinyin\n.. |PyPI version| image:: https://img.shields.io/pypi/v/pypinyin.svg\n   :target: https://pypi.org/project/pypinyin/\n.. |DOI| image:: https://zenodo.org/badge/12830126.svg\n   :target: https://zenodo.org/badge/latestdoi/12830126\n.. |PyPI downloads| image:: https://img.shields.io/pypi/dm/pypinyin.svg\n   :target: https://pypi.org/project/pypinyin/\n\n\n\n.. _Russian translation: https://github.com/mozillazg/python-pinyin/blob/master/README_ru.rst\n.. _pinyin-data: https://github.com/mozillazg/pinyin-data\n.. _phrase-pinyin-data: https://github.com/mozillazg/phrase-pinyin-data\n.. _开发文档: https://pypinyin.readthedocs.io/zh_CN/develop/develop.html\n.. _#109: https://github.com/mozillazg/python-pinyin/issues/109\n.. _#259: https://github.com/mozillazg/python-pinyin/issues/259\n.. _#284: https://github.com/mozillazg/python-pinyin/issues/284\n","funding_links":["https://mozillazg.com/wechat_donate.jpeg","https://patreon.com/mozillazg"],"categories":["HarmonyOS","Text Processing","Python","资源列表","文本处理","Text Processing [🔝](#readme)","Awesome Python"],"sub_categories":["Windows Manager","文本处理","Text Processing"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmozillazg%2Fpython-pinyin","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmozillazg%2Fpython-pinyin","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmozillazg%2Fpython-pinyin/lists"}