{"id":16284086,"url":"https://github.com/wichert/lingua","last_synced_at":"2026-02-27T15:13:22.293Z","repository":{"id":1234947,"uuid":"1170713","full_name":"wichert/lingua","owner":"wichert","description":"Translation toolkit for Python","archived":false,"fork":false,"pushed_at":"2023-10-10T17:08:11.000Z","size":461,"stargazers_count":44,"open_issues_count":12,"forks_count":32,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-05-22T09:43:02.780Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wichert.png","metadata":{"files":{"readme":"README.rst","changelog":"changes.rst","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2010-12-15T09:58:43.000Z","updated_at":"2024-03-11T22:22:33.000Z","dependencies_parsed_at":"2024-04-10T05:38:33.499Z","dependency_job_id":"0584259a-f383-424b-817d-03d290c2ce26","html_url":"https://github.com/wichert/lingua","commit_stats":null,"previous_names":[],"tags_count":53,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wichert%2Flingua","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wichert%2Flingua/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wichert%2Flingua/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wichert%2Flingua/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wichert","download_url":"https://codeload.github.com/wichert/lingua/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://gi
thub.com","kind":"github","repositories_count":247492515,"owners_count":20947545,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-10T19:17:07.989Z","updated_at":"2026-02-27T15:13:22.261Z","avatar_url":"https://github.com/wichert.png","language":"Python","funding_links":[],"categories":["Translations"],"sub_categories":[],"readme":"What is lingua?\n===============\n\nLingua is a package with tools to extract translatable texts from\nyour code, and to check existing translations. It replaces the use\nof the ``xgettext`` command from gettext, or ``pybabel`` from Babel.\n\n\nMessage extraction\n==================\n\nThe simplest way to extract all translatable messages is to point the\n``pot-create`` tool at the root of your source tree.\n\n::\n\n     $ pot-create src\n\nThis will create a ``messages.pot`` file containing all found messages.\n\n\nSpecifying input files\n----------------------\n\nThere are three ways to tell lingua which files you want it to scan:\n\n1. Specify filenames directly on the command line. For example::\n\n   $ pot-create main.py utils.py\n\n2. Specify a directory on the command line. Lingua will recursively scan that\n   directory for all files it knows how to handle.\n\n   ::\n\n       $ pot-create src\n\n3. Use the ``--files-from`` parameter to point to a file with a list of\n   files to scan. 
Lines starting with ``#`` and empty lines will be ignored.\n\n   ::\n\n       $ pot-create --files-from=POTFILES.in\n\nYou can also use the ``--directory=PATH`` parameter to add the given path to the\nlist of directories to check for files. This may sound confusing, but can be\nuseful. For example, this command will look for ``main.py`` and ``utils.py`` in\nthe current directory and, if they are not found there, in the ``../src``\ndirectory::\n\n\n    $ pot-create --directory=../src main.py utils.py\n\n\nConfiguration\n-------------\n\nIn its default configuration lingua will use its Python extractor for ``.py``\nfiles, its XML extractor for ``.pt`` and ``.zpt`` files and its ZCML extractor\nfor ``.zcml`` files. If you use different extensions you can set up a configuration\nfile that tells lingua how to process files. This file uses a simple ini-style\nformat.\n\nThere are two types of configuration that can be set in the configuration file:\nwhich extractor to use for a file extension, and the configuration for a single\nextractor.\n\nFile extensions are configured in the ``extensions`` section. Each entry in\nthis section maps a file extension to an extractor name. For example, to\ntell lingua to use its XML extractor for files with a ``.html`` extension\nyou can use this configuration::\n\n    [extensions]\n    .html = xml\n\nTo find out which extractors are available use the ``--list-extractors`` option.\n\n::\n\n    $ bin/pot-create --list-extractors\n    chameleon         Chameleon templates (defaults to Python expressions)\n    python            Python sources\n    xml               Chameleon templates (defaults to Python expressions)\n    zcml              Zope Configuration Markup Language (ZCML)\n    zope              Zope templates (defaults to TALES expressions)\n\nA section named `extractor:\u003cname\u003e` can be used to configure a specific\nextractor. 
For example, to tell the XML extractor that the default language\nused for expressions is TALES instead of Python::\n\n    [extractor:xml]\n    default-engine = tales\n\nEither place a global configuration file named ``.config/lingua`` in your\nhome folder or use the ``--config`` option to point lingua to your\nconfiguration file.\n\n::\n\n    $ pot-create -c lingua.cfg src\n\n\nDomain filtering\n----------------\n\nWhen working with large systems you may use multiple translation domains\nin a single source tree. Lingua can support that by filtering messages by\ndomain when scanning sources. To enable domain filtering use the ``-d`` option:\n\n::\n\n    $ pot-create -d mydomain src\n\nLingua will always include messages for which it cannot determine the domain.\nFor example, take this Python code:\n\n::\n\n     print(gettext(u'Hello, World'))\n     print(dgettext('mydomain', u'Bye bye'))\n\nThe first line, the hello message, does not specify its domain and will always be\nincluded. The second line uses `dgettext\n\u003chttp://docs.python.org/2/library/gettext#gettext.dgettext\u003e`_ to explicitly\nspecify the domain. Lingua will use this information when filtering domains.\n\n\nIncluding comments\n------------------\n\nYou can add comments to messages to help translators, for example to explain\nhow a text is used, or provide hints on how it should be translated. For\nChameleon templates this can be done using the ``i18n:comment`` attribute:\n\n::\n\n   \u003clabel i18n:comment=\"This is a form label\" i18n:translate=\"\"\u003ePassword\u003c/label\u003e\n\nComments are inherited, so you can put them on a parent element as well.\n\n::\n\n   \u003cform i18n:comment=\"This is used in the password reset form\"\u003e\n     \u003clabel i18n:translate=\"\"\u003ePassword\u003c/label\u003e\n     \u003cbutton i18n:translate=\"\"\u003eChange\u003c/button\u003e\n   \u003c/form\u003e\n\n\nFor Python code you can tell lingua to include comments by using the\n``--add-comments`` option. 
This will make lingua include all comments on the\nline(s) *immediately preceding* (there may be no empty line in between) a\ntranslation call.\n\n::\n\n    # This text should address the user directly.\n    return _('Thank you for using our service.')\n\nAlternatively you can also put a comment at the end of the line starting your\ntranslation function call.\n\n::\n\n    return _('Thank you for using our service.')  # Address the user directly\n\nIf you do not want all comments to be included but only specific ones you can\nadd a keyword to the ``--add-comments`` option, for example ``--add-comments=I18N``.\n\n::\n\n    # I18N This text should address the user directly, and use formal addressing.\n    return _('Thank you for using our service')\n\n\nSetting message flags in comments\n---------------------------------\n\nMessages can have *flags*. These indicate what format a message has, and\nare typically used by validation tools to check that a translation does not break\nvariable references or template syntax. Lingua does a reasonable job of detecting\nstrings using C and Python formatting, but sometimes you may need to set flags\nyourself. This can be done with a ``[flag, flag]`` marker in a comment.\n\n::\n\n    # I18N [markdown,c-format]\n    header =  _(u'# Hello *%s*')\n\n\n\nSpecifying keywords\n-------------------\n\nWhen looking for messages a lingua parser uses a default list of keywords\nto identify translation calls. You can add extra keywords via the ``--keyword``\noption. If you have your own ``mygettext`` function which takes a string\nto translate as its first parameter you can use this:\n\n::\n\n    $ pot-create --keyword=mygettext\n\nIf your function takes more parameters you will need to tell lingua about them.\nThis can be done in several ways:\n\n* If the translatable text is not the first parameter you can specify the\n  parameter number with ``\u003ckeyword\u003e:\u003cparameter number\u003e``. 
For example if\n  you use ``i18n_log(level, msg)`` the keyword specifier would be ``i18n_log:2``\n* If you support plurals you can specify the parameter used for the plural message\n  by specifying the parameter number for both the singular and plural text. For\n  example if your function signature is ``show_result(single, plural)`` the\n  keyword specifier is ``show_result:1,2``\n* If you use message contexts you can specify the parameter used for the context\n  by adding a ``c`` to the parameter number. For example the keyword specifier for\n  ``pgettext`` is ``pgettext:1c,2``.\n* If your function takes the domain as a parameter you can specify which parameter\n  is used for the domain by adding a ``d`` to the parameter number. For example\n  the keyword specifier for ``dgettext`` is ``dgettext:1d,2``. This is a\n  lingua-specific extension.\n* You can specify the exact number of parameters a function call must have\n  using the ``t`` postfix. For example if a function *must* have four parameters\n  to be a valid call, the specifier could be ``myfunc:1,4t``.\n\n\nExtractors\n==========\n\nLingua includes a number of extractors:\n\n* `python`: handles Python source code.\n* `chameleon`: handles `Chameleon \u003chttp://www.pagetemplates.org/\u003e`_ files,\n  using the `Zope i18n syntax\n  \u003chttps://chameleon.readthedocs.org/en/latest/reference.html#id51\u003e`_\n* `zcml`: handles Zope Configuration Markup Language (ZCML) files.\n* `zope`: a variant of the chameleon extractor, which assumes the default\n  expression language is `TALES\n  \u003chttps://chameleon.readthedocs.org/en/latest/reference.html#expressions-tales\u003e`_\n  instead of Python.\n* `xml`: old name for the `chameleon` extractor. This name should not be used\n  anymore and is only supported for backwards compatibility.\n\nBabel extractors\n----------------\n\nThere are several packages with plugins for `Babel\n\u003chttp://babel.edgewall.org/\u003e`_'s message extraction tool. 
Lingua can use those\nplugins as well. The plugin names will be prefixed with ``babel-`` to\ndistinguish them from lingua extractors.\n\nFor example, if you have the `PyBabel-json\n\u003chttps://pypi.python.org/pypi/PyBabel-json\u003e`_ package installed you can\ninstruct lingua to use it for .json files by adding this to your configuration\nfile::\n\n     [extensions]\n     .json = babel-json\n\nSome Babel plugins require you to specify comment tags. This can be set with\nthe ``comment-tags`` option.\n\n::\n\n    [extractor:babel-mako]\n    comment-tags = TRANSLATOR:\n\n\nComparison to other tools\n=========================\n\nDifferences compared to `GNU gettext \u003chttps://www.gnu.org/software/gettext/\u003e`_:\n\n* Support for file formats such as Zope Page Templates (popular in\n  `Pyramid \u003chttp://docs.pylonsproject.org/projects/pyramid/en/latest/\u003e`_,\n  `Chameleon`_,\n  `Plone \u003chttp://plone.org/\u003e`_ and `Zope \u003chttp://www.zope.org\u003e`_).\n* Better support for detecting format strings used in Python.\n* No direct support for C, C++, Perl, and many other languages. Lingua focuses\n  on languages commonly used in Python projects, although support for other\n  languages can be added via plugins.\n\n\nDifferences compared to `Babel`_:\n\n* More reliable detection of Python format strings.\n* Lingua includes plural support.\n* Support for only extracting texts for a given translation domain. This is\n  often useful for extensible software where you use multiple translation\n  domains in a single application.\n\n\nValidating translations\n=======================\n\nLingua includes a simple ``polint`` tool which performs a few basic checks on\nPO files. Currently implemented tests are:\n\n* duplicated message ids (can also be checked with GNU gettext's ``msgfmt``).\n  These should never happen and are usually a result of a bug in the message\n  extraction logic.\n\n* identical translations used for multiple canonical texts. 
This can happen\n  for valid reasons, for example when the original text is not spelled\n  consistently.\n\nTo check a po file simply run ``polint`` with the po file as argument::\n\n    $ polint nl.po\n\n    Translation:\n        ${val} ist keine Zeichenkette\n    Used for 2 canonical texts:\n    1       ${val} is not a string\n    2       \"${val}\" is not a string\n\n\nWriting custom extractors\n=========================\n\nFirst we need to create the custom extractor::\n\n    from lingua.extractors import Extractor\n    from lingua.extractors import Message\n\n    class MyExtractor(Extractor):\n        '''One-line description for --list-extractors'''\n        extensions = ['.txt']\n\n        def __call__(self, filename, options):\n            return [Message(None, 'msgid', None, [], u'', u'', (filename, 1))]\n\nHooking up extractors to lingua is done by ``lingua.extractors`` entry points\nin ``setup.py``::\n\n    setup(name='mypackage',\n          ...\n          install_requires=[\n              'lingua',\n          ],\n          ...\n          entry_points='''\n          [lingua.extractors]\n          my_extractor = mypackage.extractor:MyExtractor\n          '''\n          ...)\n\nNote - the registered extractor must be a class derived from the ``Extractor``\nbase class.\n\nAfter installing ``mypackage`` lingua will automatically detect the new custom\nextractor.\n\n\nHelper Script\n=============\n\nThere exists a helper shell script for managing translations of packages in\n``docs/examples`` named ``i18n.sh``. 
Copy it to the root of the package where you want to\nwork on translations, edit the configuration parameters inside the script and use::\n\n    ./i18n.sh lang\n\nfor initial catalog creation and::\n\n    ./i18n.sh\n\nfor updating translations and compiling the catalog.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwichert%2Flingua","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwichert%2Flingua","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwichert%2Flingua/lists"}