{"id":15520720,"url":"https://github.com/proycon/python-frog","last_synced_at":"2025-04-05T06:10:42.363Z","repository":{"id":20492018,"uuid":"23770267","full_name":"proycon/python-frog","owner":"proycon","description":"Python bindings to the dutch NLP tool Frog (pos tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser)","archived":false,"fork":false,"pushed_at":"2025-03-20T16:22:42.000Z","size":125,"stargazers_count":49,"open_issues_count":6,"forks_count":10,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-04T11:07:16.697Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Cython","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/proycon.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-09-07T20:32:31.000Z","updated_at":"2025-03-23T17:11:28.000Z","dependencies_parsed_at":"2023-02-13T00:01:36.963Z","dependency_job_id":"6b51d1e8-55d5-4baa-879b-7937deac7e00","html_url":"https://github.com/proycon/python-frog","commit_stats":{"total_commits":118,"total_committers":2,"mean_commits":59.0,"dds":0.008474576271186418,"last_synced_commit":"c94bd8155520a7c2ef768397c68a912bad67483f"},"previous_names":[],"tags_count":24,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proycon%2Fpython-frog","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proycon%2Fpython-frog/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proycon%2Fpython-fro
g/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proycon%2Fpython-frog/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/proycon","download_url":"https://codeload.github.com/proycon/python-frog/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247294541,"owners_count":20915340,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-02T10:29:03.345Z","updated_at":"2025-04-05T06:10:42.344Z","avatar_url":"https://github.com/proycon.png","language":"Cython","readme":".. image:: http://applejack.science.ru.nl/lamabadge.php/python-frog\n   :target: http://applejack.science.ru.nl/languagemachines/\n\n.. image:: https://zenodo.org/badge/23770267.svg\n   :target: https://zenodo.org/badge/latestdoi/23770267\n\n.. image:: https://www.repostatus.org/badges/latest/active.svg\n   :alt: Project Status: Active – The project has reached a stable, usable state and is being actively developed.\n   :target: https://www.repostatus.org/#active\n\nFrog for Python\n===================\n\nThis is a Python binding to the Natural Language Processing suite Frog. Frog is\nintended for Dutch and performs part-of-speech tagging, lemmatisation,\nmorphological analysis, named entity recognition, shallow parsing, and\ndependency parsing. The tool itself is implemented in C++\n(https://languagemachines.github.io/frog). The binding requires Python 3.6 or higher.\n\nDemo\n------------------\n\n.. 
image:: https://raw.githubusercontent.com/CLARIAH/wp3-demos/master/python-frog.gif \n\nInstallation\n----------------\n\nWe recommend you use a Python virtual environment and install using ``pip``::\n\n    pip install python-frog\n\nWhen possible on your system, this will install the binary\nPython wheels *that include Frog and all necessary dependencies* **except for**\nfrogdata. To download and install the data (in ``~/.config/frog``) you then only need to\nrun the following once::\n\n    python -c \"import frog; frog.installdata()\"\n\nIf you want language detection support, ensure you have the `libexttextcat`\npackage (if provided by your distribution) installed prior to executing the\nabove command.\n\nIf the binary wheels are not available for your system, you will need to first\ninstall `Frog \u003chttps://github.com/LanguageMachines/frog\u003e`_ yourself and then\nrun ``pip install python-frog`` to install this Python binding, which will then be\ncompiled from source. The following instructions apply in that case:\n\nOn Arch Linux, you can alternatively use the `AUR package \u003chttps://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=python-frog-git\u003e`_ .\n\nOn macOS, first use `homebrew \u003chttps://brew.sh/\u003e`_ to install `Frog \u003chttps://languagemachines.github.io/frog\u003e`_::\n\n    brew tap fbkarsdorp/homebrew-lamachine\n    brew install frog\n\nOn Alpine Linux, run: ``apk add cython frog frog-dev``\n\nWindows is not supported natively, but you should be able to use the Frog Python binding through WSL or Docker containers (see below).\n\nDocker/OCI Containers\n~~~~~~~~~~~~~~~~~~~~~~~\n\nA Docker/OCI container image is available containing Python, Frog, and python-frog::\n\n    docker pull proycon/python-frog\n    docker run -t -i proycon/python-frog\n\nYou can also build the container from scratch from this repository with the included `Dockerfile`.\n\nUsage\n------------------\n\nExample:\n\n.. 
code:: python\n\n    from frog import Frog, FrogOptions\n\n    frog = Frog(FrogOptions(parser=False))\n    output = frog.process_raw(\"Dit is een test\")\n    print(\"RAW OUTPUT=\",output)\n    output = frog.process(\"Dit is nog een test.\")\n    print(\"PARSED OUTPUT=\",output)\n\n\nOutput::\n\n    RAW OUTPUT= 1   Dit     dit     [dit]   VNW(aanw,pron,stan,vol,3o,ev)\n    0.777085        O       B-NP\n    2       is      zijn    [zijn]  WW(pv,tgw,ev)   0.999891        O\n    B-VP\n    3       een     een     [een]   LID(onbep,stan,agr)     0.999113        O\n    B-NP\n    4       test    test    [test]  N(soort,ev,basis,zijd,stan)     0.789112\n    O       I-NP\n\n\n    PARSED OUTPUT= [{'chunker': 'B-NP', 'index': '1', 'lemma': 'dit', 'ner':\n    'O', 'pos': 'VNW(aanw,pron,stan,vol,3o,ev)', 'posprob': 0.777085, 'text':\n    'Dit', 'morph': '[dit]'}, {'chunker': 'B-VP', 'index': '2', 'lemma':\n    'zijn', 'ner': 'O', 'pos': 'WW(pv,tgw,ev)', 'posprob': 0.999966, 'text':\n    'is', 'morph': '[zijn]'}, {'chunker': 'B-NP', 'index': '3', 'lemma': 'nog',\n    'ner': 'O', 'pos': 'BW()', 'posprob': 0.99982, 'text': 'nog', 'morph':\n    '[nog]'}, {'chunker': 'I-NP', 'index': '4', 'lemma': 'een', 'ner': 'O',\n    'pos': 'LID(onbep,stan,agr)', 'posprob': 0.995781, 'text': 'een', 'morph':\n    '[een]'}, {'chunker': 'I-NP', 'index': '5', 'lemma': 'test', 'ner': 'O',\n    'pos': 'N(soort,ev,basis,zijd,stan)', 'posprob': 0.903055, 'text': 'test',\n    'morph': '[test]'}, {'chunker': 'O', 'index': '6', 'eos': True, 'lemma':\n    '.', 'ner': 'O', 'pos': 'LET()', 'posprob': 1.0, 'text': '.', 'morph':\n    '[.]'}]\n\n\nAvailable keyword arguments for FrogOptions:\n\n* tok - True/False - Do tokenisation? (default: True)\n* lemma - True/False - Do lemmatisation? (default: True)\n* morph - True/False - Do morphological analysis? (default: True)\n* daringmorph - True/False - Do morphological analysis in new experimental style? 
(default: False)\n* mwu - True/False - Do Multi Word Unit detection? (default: True)\n* chunking - True/False - Do Chunking/Shallow parsing? (default: True)\n* ner - True/False - Do Named Entity Recognition? (default: True)\n* parser - True/False - Do Dependency Parsing? (default: False)\n* xmlin - True/False - Input is FoLiA XML (default: False)\n* xmlout - True/False - Output is FoLiA XML (default: False)\n* docid - str - Document ID (for FoLiA)\n* numThreads - int - Number of threads to use (default: unset, unlimited)\n\nYou can specify a Frog configuration file explicitly as the second argument upon instantiation; otherwise the default one is\nused:\n\n.. code:: python\n\n    frog = Frog(FrogOptions(parser=False), \"/path/to/your/frog.cfg\")\n\nA third parameter, a dictionary, can be used to override specific configuration values (same syntax as Frog's\n``--override`` option). Leave the second parameter empty if you want to load the default configuration:\n\n.. code:: python\n\n    frog = Frog(FrogOptions(parser=False), \"\", { \"tokenizer.rulesFile\": \"tokconfig-nld-twitter\" })\n\nFoLiA support\n------------------\n\nFrog supports output in the `FoLiA XML format \u003chttps://proycon.github.io/folia\u003e`_ (set ``FrogOptions(xmlout=True)``), as\nwell as FoLiA input (set ``FrogOptions(xmlin=True)``). The FoLiA format exposes more details about the linguistic\nannotation in a more structured and more formal way.\n\nWhenever FoLiA output is requested, the ``process()`` method will return an instance of ``folia.Document``, which is\nprovided by the `FoLiApy library \u003chttps://github.com/proycon/foliapy\u003e`_. This loads the entire FoLiA document into memory and\nallows you to inspect it in any way you see fit. Extensive documentation for this library can be found here:\nhttp://folia.readthedocs.io/\n\nAn example can be found below:\n\n.. 
code:: python\n\n    from frog import Frog, FrogOptions\n\n    frog = Frog(FrogOptions(parser=True,xmlout=True))\n    output = frog.process(\"Dit is een FoLiA test.\")\n    # output is no longer a list of dicts but an instance of folia.Document, provided by the FoLiApy library\n    print(\"FOLIA OUTPUT AS RAW XML=\")\n    print(output.xmlstring())\n\n    print(\"Inspecting FoLiA output (just a small example):\")\n    for word in output.words():\n        print(word.text() + \" \" + word.pos() + \" \" + word.lemma())\n\n","funding_links":[],"categories":["其他語言","Python","Packages"],"sub_categories":["函式庫與嵌入","General-Purpose Machine Learning","Libraries"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fproycon%2Fpython-frog","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fproycon%2Fpython-frog","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fproycon%2Fpython-frog/lists"}