{"id":13482392,"url":"https://github.com/proycon/pynlpl","last_synced_at":"2025-05-16T09:06:19.041Z","repository":{"id":966862,"uuid":"759484","full_name":"proycon/pynlpl","owner":"proycon","description":"PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language model. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).","archived":false,"fork":false,"pushed_at":"2023-09-14T12:24:10.000Z","size":13435,"stargazers_count":477,"open_issues_count":3,"forks_count":67,"subscribers_count":30,"default_branch":"master","last_synced_at":"2025-05-10T01:08:57.362Z","etag":null,"topics":["computational-linguistics","evaluation-metrics","folia","language-modelling","library","linguistics","machine-learning","natural-language-processing","nlp","nlp-library","python","search-algorithms","text-processing"],"latest_commit_sha":null,"homepage":"https://pypi.python.org/pypi/PyNLPl","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/proycon.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS"}},"created_at":"2010-07-06T11:42:27.000Z","updated_at":"2025-03-24T15:38:15.000Z","dependencies
_parsed_at":"2022-08-16T11:40:15.766Z","dependency_job_id":"e7eb242b-043a-40a5-ae00-654e996f2ac1","html_url":"https://github.com/proycon/pynlpl","commit_stats":{"total_commits":2093,"total_committers":13,"mean_commits":161.0,"dds":"0.40181557572861926","last_synced_commit":"7707f69a91caaa6cde037f0d0379f1d42500a68b"},"previous_names":[],"tags_count":41,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proycon%2Fpynlpl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proycon%2Fpynlpl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proycon%2Fpynlpl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proycon%2Fpynlpl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/proycon","download_url":"https://codeload.github.com/proycon/pynlpl/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254501558,"owners_count":22081528,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computational-linguistics","evaluation-metrics","folia","language-modelling","library","linguistics","machine-learning","natural-language-processing","nlp","nlp-library","python","search-algorithms","text-processing"],"created_at":"2024-07-31T17:01:01.539Z","updated_at":"2025-05-16T09:06:14.032Z","avatar_url":"https://github.com/proycon.png","language":"Python","readme":"PyNLPl - Python Natural Language Processing 
Library\n=====================================================\n\n.. image:: https://travis-ci.org/proycon/pynlpl.svg?branch=master\n    :target: https://travis-ci.org/proycon/pynlpl\n\n.. image:: http://readthedocs.org/projects/pynlpl/badge/?version=latest\n\t:target: http://pynlpl.readthedocs.io/en/latest/?badge=latest\n\t:alt: Documentation Status\n\n.. image:: http://applejack.science.ru.nl/lamabadge.php/pynlpl\n   :target: http://applejack.science.ru.nl/languagemachines/\n\n.. image:: https://zenodo.org/badge/759484.svg\n   :target: https://zenodo.org/badge/latestdoi/759484\n\nPyNLPl, pronounced as 'pineapple', is a Python library for Natural Language\nProcessing. It contains various modules useful for common, and less common, NLP\ntasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and\nfrequency lists, and to build simple language models. There are also more\ncomplex data types and algorithms. Moreover, there are parsers for file formats\ncommon in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to\ninterface with various NLP-specific servers. PyNLPl most notably features a\nvery extensive library for working with FoLiA XML (Format for Linguistic\nAnnotation).\n\nThe library is divided into several packages and modules. 
It works on Python\n2.7, as well as Python 3.\n\nThe following modules are available:\n\n- ``pynlpl.datatypes`` - Extra datatypes (priority queues, patterns, tries)\n- ``pynlpl.evaluation`` - Evaluation \u0026 experiment classes (parameter search, wrapped\n  progressive sampling, class evaluation (precision/recall/f-score/auc), sampler, confusion matrix, multithreaded experiment pool)\n- ``pynlpl.formats.cgn`` - Module for parsing CGN (Corpus Gesproken Nederlands) part-of-speech tags\n- ``pynlpl.formats.folia`` - Extensive library for reading and manipulating the\n  documents in `FoLiA \u003chttp://proycon.github.io/folia\u003e`_ format (Format for Linguistic Annotation).\n- ``pynlpl.formats.fql`` - Extensive library for the FoLiA Query Language (FQL),\n  built on top of ``pynlpl.formats.folia``. FQL is currently documented `here\n  \u003chttps://github.com/proycon/foliadocserve\u003e`__.\n- ``pynlpl.formats.cql`` - Parser for the Corpus Query Language (CQL), as also used by\n  Corpus Workbench and Sketch Engine. 
Contains a converter to FQL.\n- ``pynlpl.formats.giza`` - Module for reading GIZA++ word alignment data\n- ``pynlpl.formats.moses`` - Module for reading Moses phrase-translation tables.\n- ``pynlpl.formats.sonar`` - Largely obsolete module for pre-releases of the\n  SoNaR corpus; use ``pynlpl.formats.folia`` instead.\n- ``pynlpl.formats.timbl`` - Module for reading Timbl output (consider using\n  `python-timbl \u003chttps://github.com/proycon/python-timbl\u003e`_ instead though)\n- ``pynlpl.lm.lm`` - Module for simple language models, plus a reader for ARPA\n  language model data (as used by SRILM).\n- ``pynlpl.search`` - Various search algorithms (breadth-first, depth-first,\n  beam-search, hill climbing, A*, and variants of each)\n- ``pynlpl.statistics`` - Frequency lists, Levenshtein, common statistics and\n  information theory functions\n- ``pynlpl.textprocessors`` - Simple tokeniser, n-gram extraction\n\nInstallation\n--------------------\n\nDownload and install the latest stable version directly from the Python Package\nIndex with ``pip install pynlpl`` (or ``pip3`` for Python 3 on most\nsystems). For global installations prepend ``sudo``.\n\nAlternatively, clone this repository and run ``python setup.py install`` (or\n``python3 setup.py install`` for Python 3 on most systems). 
Prepend ``sudo`` for\nglobal installations.\n\nThis software may also be found in certain Linux distributions, such as\nthe latest versions of Debian/Ubuntu, as ``python-pynlpl`` and ``python3-pynlpl``.\nPyNLPl is also included in our `LaMachine \u003chttp://proycon.github.io/LaMachine\u003e`_ distribution.\n\nDocumentation\n--------------------\n\nAPI documentation can be found `here \u003chttp://pynlpl.readthedocs.io/en/latest/\u003e`__.\n\n\n","funding_links":[],"categories":["Resources and Frameworks","Libraries","Python","函式庫","Packages"],"sub_categories":["Videos and Online Courses","General-Purpose Machine Learning","書籍","Libraries"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fproycon%2Fpynlpl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fproycon%2Fpynlpl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fproycon%2Fpynlpl/lists"}