{"id":37625650,"url":"https://github.com/timarkh/uniparser-morph","last_synced_at":"2026-01-16T10:47:55.388Z","repository":{"id":38271947,"uuid":"342536009","full_name":"timarkh/uniparser-morph","owner":"timarkh","description":"Rule-based, linguist-friendly (and rather slow) morphological analysis","archived":false,"fork":false,"pushed_at":"2025-05-09T15:32:18.000Z","size":1420,"stargazers_count":6,"open_issues_count":2,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-08-23T05:39:09.519Z","etag":null,"topics":["linguistics","morphological-analysis","nlp","pos-tagging","rule-based"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/timarkh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-02-26T10:07:43.000Z","updated_at":"2025-05-09T15:32:22.000Z","dependencies_parsed_at":"2025-05-09T12:24:38.192Z","dependency_job_id":"77178417-8b80-4e35-ac36-507143039cec","html_url":"https://github.com/timarkh/uniparser-morph","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/timarkh/uniparser-morph","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timarkh%2Funiparser-morph","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timarkh%2Funiparser-morph/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timarkh%2Funiparser-morph/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/timarkh%2Funiparser-morph/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/timarkh","download_url":"https://codeload.github.com/timarkh/uniparser-morph/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timarkh%2Funiparser-morph/sbom","scorecard":{"id":885550,"data":{"date":"2025-08-11","repo":{"name":"github.com/timarkh/uniparser-morph","commit":"54e4dca92f1e326770afc8f29f8e823fdf721e73"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Code-Review","score":0,"reason":"Found 2/27 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least 
privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file 
detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 5 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-24T09:53:30.643Z","repository_id":38271947,"created_at":"2025-08-24T09:53:30.643Z","updated_at":"2025-08-24T09:53:30.643Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28478060,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T06:30:42.265Z","status":"ssl_error","status_checked_at":"2026-01-16T06:30:16.248Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["linguistics","morphological-analysis","nlp","pos-tagging","rule-based"],"created_at":"2026-01-16T10:47:54.748Z","updated_at":"2026-01-16T10:47:55.379Z","avatar_url":"https://github.com/timarkh.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# uniparser-morph\n\nThis is yet another rule-based morphological analysis tool. No built-in rules are provided; you will have to write some if you want to parse texts in your language. Uniparser-morph was developed primarily for under-resourced languages, which don't have enough data for training statistical parsers. Here's how it's different from other similar tools:\n\n* It is designed to be usable by theoretical linguists with no prior knowledge of NLP (and has been successfully used by them with minimal guidance). 
So it's not just another way of defining an FST; the way you describe lexemes and morphology resembles what you do in a traditional theoretical description, at least in part.\n* It was developed with a large variety of linguistic phenomena in mind and is easily applicable to most languages -- not just the Standard Average European.\n* Apart from POS-tagging and full morphological tagging, there is a glossing option (words can be split into morphemes).\n* Lexemes can carry any number of attributes that have to end up in the annotation, e.g. translations into the metalanguage.\n* Ambiguity is allowed: all words you analyze will receive all theoretically possible analyses regardless of the context. (You can then use e.g. [CG](https://visl.sdu.dk/constraint_grammar.html) for rule-based disambiguation.)\n* While, in computational terms, the language described by ``uniparser-morph`` rules is certainly regular, the description is actually NOT entirely converted into an FST. Therefore, it's not nearly as fast as FST-based analyzers. The speed varies depending on the language structure and hardware characteristics, but you can hardly expect to parse more than 20,000 words per second. For heavily polysynthetic languages that figure can go as low as 200 words per second. So it's not really designed for industrial use.\n\nThe primary usage scenario I was thinking about is the following:\n\n* You have a corpus of texts where you want to add morphological annotation (this includes POS-tagging).\n* You manually prepare a grammar for the language in ``uniparser-morph`` format (probably making use of existing digital dictionaries of the language).\n* You compile a list of unique words in your corpus and parse it.\n* Then you annotate your texts based on this wordlist with any software you want.\n\nOf course, you can do other things with ``uniparser-morph``, e.g. 
make it a part of a more complex NLP pipeline; just make sure low speed is not an issue in your case.\n\n``uniparser-morph`` is distributed under the MIT license (see LICENSE).\n\n## Usage\nImport the ``Analyzer`` class from the package. Here is a basic usage example:\n\n```python\nfrom uniparser_morph import Analyzer\na = Analyzer()\n\n# Put your grammar files in the current folder or set paths as properties of the Analyzer class (see below)\na.load_grammar()\n\nanalyses = a.analyze_words('Морфологиез')\n# The parser is initialized before first use, so expect some delay here (usually several seconds)\n# You will get a list of Wordform objects\n\n# You can also pass lists (even nested lists) and specify output format ('xml' or 'json'):\nanalyses = a.analyze_words([['А'], ['Мон', 'тонэ', 'яратӥсько', '.']], format='xml')\nanalyses = a.analyze_words(['Морфологиез', [['А'], ['Мон', 'тонэ', 'яратӥсько', '.']]], format='json')\n```\n\nIf you need to parse a frequency list, use ``analyze_wordlist()`` instead.\n\nSee [the documentation](https://uniparser-morph.readthedocs.io/en/latest/) for the full list of options.\n\n## Format\nIf you want to create a ``uniparser-morph`` analyzer for your language, you will have to write a set of rules that describe the vocabulary and the morphology of your language in ``uniparser-morph`` format. For a description of the format, refer to [the documentation](https://uniparser-morph.readthedocs.io/en/latest/).\n\n## Disambiguation with CG\nIf you have disambiguation rules in the [Constraint Grammar](https://visl.sdu.dk/constraint_grammar.html) format, you can use them in the following way when calling ``analyze_words()``:\n\n```python\nimport os\n\nanalyses = a.analyze_words(['Мон', 'морфологиез', 'яратӥсько', '.'],\n                           cgFile=os.path.abspath('disambiguation.cg3'),\n                           disambiguate=True)\n```\n\nIn order for this to work, you have to install the ``cg3`` executable separately. 
On Ubuntu/Debian, you can use ``apt-get``:\n\n```\nsudo apt-get install cg3\n```\n\nOn Windows, download the binary and add the path to the ``PATH`` environment variable. See [the documentation](https://visl.sdu.dk/cg3/single/#installation) for other options.\n\nNote that each time you call ``analyze_words()`` with ``disambiguate=True``, the CG grammar is loaded and compiled from scratch, which makes the analysis even slower. If you are analyzing a large text, it would make sense to pass the entire text contents in a single function call rather than do it sentence-by-sentence, for optimal performance.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftimarkh%2Funiparser-morph","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftimarkh%2Funiparser-morph","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftimarkh%2Funiparser-morph/lists"}