{"id":13765933,"url":"https://github.com/ppke-nlpg/emmorphpy","last_synced_at":"2026-01-26T17:53:50.898Z","repository":{"id":72545887,"uuid":"136040744","full_name":"ppke-nlpg/emmorphpy","owner":"ppke-nlpg","description":"A wrapper, a lemmatizer and REST API implemented in Python for emMorph (Humor) Hungarian morphological analyzer","archived":false,"fork":false,"pushed_at":"2019-11-06T20:30:23.000Z","size":26821,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-11-17T01:33:16.168Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ppke-nlpg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-06-04T14:52:41.000Z","updated_at":"2023-11-01T20:01:46.000Z","dependencies_parsed_at":null,"dependency_job_id":"572d85fc-e301-4775-a5c8-855ab6387b06","html_url":"https://github.com/ppke-nlpg/emmorphpy","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ppke-nlpg%2Femmorphpy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ppke-nlpg%2Femmorphpy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ppke-nlpg%2Femmorphpy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ppke-nlpg%2Femmorphpy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ppke-n
lpg","download_url":"https://codeload.github.com/ppke-nlpg/emmorphpy/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253485746,"owners_count":21916072,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T16:00:49.219Z","updated_at":"2026-01-26T17:53:50.870Z","avatar_url":"https://github.com/ppke-nlpg.png","language":"Python","funding_links":[],"categories":["Tools"],"sub_categories":["Morphology"],"readme":"# __Warning: This repository might not contain the newest version of the source code. The development is continued at https://github.com/dlt-rilmta/emmorphpy__ \n\n---\n\n# emMorphPy\nA wrapper and lemmatizer implemented in Python for ___emMorph__ (Humor) Hungarian morphological analyzer_ \n\n## Requirements\n\n  - (Included in this repository) The compiled FST (hu.hfstol): go to https://github.com/dlt-rilmta/emMorph for compilation details\n  - (Included in this repository) The lemmatizer config file: available at https://github.com/dlt-rilmta/hunlp-GATE/blob/master/Lang_Hungarian/resources/hfst/hfst-wrapper.props\n  - _hfst-lookup 0.6 (hfst 3.13.0)_ or higher: On Ubuntu 18.04 LTS or higher just `sudo apt install hfst`\n  - Python 3 (\u003e=3.5, tested with 3.6)\n  - Pip to install the additional requirements in requirements.txt\n  - (Optional) a cloud service like [Heroku](https://heroku.com) for hosting the API\n\n## Features\n - Stemming and returning the detailed morphological analyses with the proper transducer and config file\n - Handling extra and exceptional lexicons 
statically and dynamically (see [emmorphpy/emmorphpy.py](https://github.com/ppke-nlpg/emmorphpy/blob/master/emmorphpy/emmorphpy.py) for details)\n - Can be used through a REST API (using [xtsv](https://github.com/dlt-rilmta/xtsv)), or from Python as a library (see usage examples below)\n\n## Install on local machine\n\n  - Clone the repository\n  - Run: `sudo pip3 install -r requirements.txt`\n  - Use from Python\n\n## Install to Heroku\n\n  - Register\n  - Download Heroku CLI\n  - Log in to Heroku from the CLI\n  - Create an app\n  - Clone the repository\n  - Add Heroku as remote origin\n  - Add APT buildpack: `heroku buildpacks:add --index 1 https://github.com/heroku/heroku-buildpack-apt`\n  - Add Python buildpack: `heroku buildpacks:add --index 2 heroku/python`\n  - Push the repository to Heroku\n  - Enjoy!\n\n## Usage\n\n  - From a browser or any other HTTP client through the REST API:\n     - Lemmatization: https://emmorph.herokuapp.com/stem/működik\n     - Detailed analysis: https://emmorph.herokuapp.com/analyze/működik\n     - Lemmatization with the corresponding detailed analysis: https://emmorph.herokuapp.com/dstem/működik\n     - The library also supports HTTP POST requests to handle multiple words at once. 
(See examples for details.)\n\n\t```python\n\t\u003e\u003e\u003e import requests\n\t\u003e\u003e\u003e import json\n\t\u003e\u003e\u003e word = 'működik'\n\t\u003e\u003e\u003e json.loads(requests.get('https://emmorph.herokuapp.com/stem/' + word).text)[word]\n\t[{'lemma': 'működik', 'tag': '[/V][Prs.Def.3Pl]'}, {'lemma': 'működik', 'tag': '[/V][Prs.NDef.3Sg]'}]\n\t\u003e\u003e\u003e json.loads(requests.get('https://emmorph.herokuapp.com/analyze/' + word).text)[word]\n\t[{'morphana': 'működik[/V]=működ+ik[Prs.Def.3Pl]=ik'}, {'morphana': 'működik[/V]=működ+ik[Prs.NDef.3Sg]=ik'}]\n\t\u003e\u003e\u003e json.loads(requests.get('https://emmorph.herokuapp.com/dstem/' + word).text)[word]\n\t[{'lemma': 'működik', 'tag': '[/V][Prs.Def.3Pl]', 'morphana': 'működik[/V]=működ+ik[Prs.Def.3Pl]=ik', 'readable': 'működik[/V]=működ + ik[Prs.Def.3Pl]', 'twolevel': 'm:m ű:ű k:k ö:ö d:d :i :k :[/V] i:i k:k :[Prs.Def.3Pl]'}, {'lemma': 'működik', 'tag': '[/V][Prs.NDef.3Sg]', 'morphana': 'működik[/V]=működ+ik[Prs.NDef.3Sg]=ik', 'readable': 'működik[/V]=működ + ik[Prs.NDef.3Sg]', 'twolevel': 'm:m ű:ű k:k ö:ö d:d :i :k :[/V] i:i k:k :[Prs.NDef.3Sg]'}]\n\t\u003e\u003e\u003e words = '\\n'.join(('form', word, 'word2', ''))  # One word per line (first line is header, trailing newline is needed!)\n\t\u003e\u003e\u003e words_out = requests.post('https://emmorph.herokuapp.com/stem', files={'file': words}).text.split('\\n')\n\t\u003e\u003e\u003e print(words_out[1].split('\\t'))\n\t['működik', '[{\"lemma\": \"működik\", \"tag\": \"[/V][Prs.Def.3Pl]\"}, {\"lemma\": \"működik\", \"tag\": \"[/V][Prs.NDef.3Sg]\"}]']\n\t\u003e\u003e\u003e words_out = requests.post('https://emmorph.herokuapp.com/analyze', files={'file': words}).text.split('\\n')\n\t\u003e\u003e\u003e print(words_out[1].split('\\t'))\n\t['működik', '[{\"morphana\": \"működik[/V]=működ+ik[Prs.Def.3Pl]=ik\"}, {\"morphana\": \"működik[/V]=működ+ik[Prs.NDef.3Sg]=ik\"}]']\n\t\u003e\u003e\u003e words_out = 
requests.post('https://emmorph.herokuapp.com/dstem', files={'file': words}).text.split('\\n')\n\t\u003e\u003e\u003e print(words_out[1].split('\\t'))\n\t['működik', '[{\"lemma\": \"működik\", \"tag\": \"[/V][Prs.Def.3Pl]\", \"morphana\": \"működik[/V]=működ+ik[Prs.Def.3Pl]=ik\", \"readable\": \"működik[/V]=működ + ik[Prs.Def.3Pl]\", \"twolevel\": \"m:m ű:ű k:k ö:ö d:d :i :k :[/V] i:i k:k :[Prs.Def.3Pl]\"}, {\"lemma\": \"működik\", \"tag\": \"[/V][Prs.NDef.3Sg]\", \"morphana\": \"működik[/V]=működ+ik[Prs.NDef.3Sg]=ik\", \"readable\": \"működik[/V]=működ + ik[Prs.NDef.3Sg]\", \"twolevel\": \"m:m ű:ű k:k ö:ö d:d :i :k :[/V] i:i k:k :[Prs.NDef.3Sg]\"}]']\n\t```\n\n  - From Python:\n\n\t```python\n\t\u003e\u003e\u003e import emmorphpy.emmorphpy as emmorph\n\t\u003e\u003e\u003e m = emmorph.EmMorphPy()\n\t\u003e\u003e\u003e m.stem('működik')     # Returns a list of lemmatizations (stem and tag pairs)\n\t[('működik', '[/V][Prs.Def.3Pl]'), ('működik', '[/V][Prs.NDef.3Sg]')]\n\t\u003e\u003e\u003e m.analyze('működik')  # Returns a list of detailed analyses (word by morphemes)\n\t['működik[/V]=működ+ik[Prs.Def.3Pl]=ik', 'működik[/V]=működ+ik[Prs.NDef.3Sg]=ik']\n\t\u003e\u003e\u003e m.dstem('működik')    # Returns a list of lemmatizations with the corresponding detailed analyses (stem, tag, detailed analysis, readable analysis and two-level output tuples)\n\t[('működik', '[/V][Prs.Def.3Pl]', 'működik[/V]=működ+ik[Prs.Def.3Pl]=ik', 'működik[/V]=működ + ik[Prs.Def.3Pl]', 'm:m ű:ű k:k ö:ö d:d :i :k :[/V] i:i k:k :[Prs.Def.3Pl]'), ('működik', '[/V][Prs.NDef.3Sg]', 'működik[/V]=működ+ik[Prs.NDef.3Sg]=ik', 'működik[/V]=működ + ik[Prs.NDef.3Sg]', 'm:m ű:ű k:k ö:ö d:d :i :k :[/V] i:i k:k :[Prs.NDef.3Sg]')]\n\t\u003e\u003e\u003e # Add new analyses to the lexicon (Not a paradigm, but a single analysis!) 
Format: [('STEM', 'TAG', 'DETAILED_ANALYSIS', 'HFST-OUTPUT')]\n\t\u003e\u003e\u003e m.lexicon['Obamával'] = [('Obama', '[/N][Nom]', '', ''), ('Obam', '[/N][Nom]', '', ''), ('Obamá', '[/N][Nom]', '', '')]\n\t\u003e\u003e\u003e # Add new exceptions to the lexicon (Exact matches will be filtered out ASAP!) Format: ('HFST-OUTPUT')\n\t\u003e\u003e\u003e m.exceptions['almával'] = {'a:a l:l :o m:m :[/N] á:a :[Poss.3Sg] v:v a:a l:l :[Ins]'}\n\t```\n\n## License\n\nThis Python wrapper and the lemmatizer implementation are licensed under the LGPL 3.0 license.\nxtsv, HFST, the database and the lemmatizer configuration have their own licenses.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fppke-nlpg%2Femmorphpy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fppke-nlpg%2Femmorphpy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fppke-nlpg%2Femmorphpy/lists"}