{"id":18541868,"url":"https://github.com/cltk/latin_pos_lemmata_cltk","last_synced_at":"2025-04-09T18:31:19.650Z","repository":{"id":15963867,"uuid":"18706519","full_name":"cltk/latin_pos_lemmata_cltk","owner":"cltk","description":null,"archived":false,"fork":false,"pushed_at":"2016-06-08T18:57:14.000Z","size":80776,"stargazers_count":11,"open_issues_count":0,"forks_count":2,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-24T10:12:41.567Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cltk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-04-12T14:56:41.000Z","updated_at":"2021-02-28T21:02:44.000Z","dependencies_parsed_at":"2022-09-24T06:10:25.559Z","dependency_job_id":null,"html_url":"https://github.com/cltk/latin_pos_lemmata_cltk","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cltk%2Flatin_pos_lemmata_cltk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cltk%2Flatin_pos_lemmata_cltk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cltk%2Flatin_pos_lemmata_cltk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cltk%2Flatin_pos_lemmata_cltk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cltk","download_url":"https://codeload.github.com/cltk/latin_pos_lemmata_cltk/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","reposito
ries_count":248087703,"owners_count":21045571,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T20:06:33.923Z","updated_at":"2025-04-09T18:31:14.641Z","avatar_url":"https://github.com/cltk.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# About\n\nThis repository contains part-of-speech (POS) data for the Latin language, made specifically for use with the [CLTK](https://github.com/kylepjohnson/cltk). Only CLTK developers need this repository. To tag parts of speech, [see here](http://docs.cltk.org/en/latest/classical_latin.html#pos-tagging).\n\nThe directory `perseus_data` contains part-of-speech data taken from Perseus. `pos_latin.tar.gz` contains `cltk_latin_pos_dict.txt` (252MB, too large to upload to GitHub) and is what is picked up by the corpus importer.\n\n`parse_latin_analyses.py` was used to convert `latin-analyses.txt` into a Python dictionary. 
\n\n# Use\n``` python\n\nIn [1]: import re\n\nIn [2]: import ast\n\nIn [3]: p = re.compile('\\w+', re.IGNORECASE)\n\nIn [4]: with open('/Users/kyle/cltk_data/compiled/pos_latin/cltk_latin_pos_dict.txt') as f:\n    r = f.read()\n\nIn [5]: d = ast.literal_eval(r)\n\nIn [6]: s = 'Multum tibi esse animi scio; nam etiam antequam instrueres te praeceptis salutaribus et dura vincentibus, satis adversus fortunam placebas tibi, et multo magis postquam cum illa manum conseruisti viresque expertus es tuas, quae numquam certam dare fiduciam sui possunt nisi cum multae difficultates hinc et illinc apparuerunt, aliquando vero et propius accesserunt.'\n\nIn [7]: for match in p.finditer(s):\n    w = match.group().lower()\n    try:\n        tag = d[w]['perseus_pos']\n        print(w, tag)\n    except KeyError:\n        pass\nmultum [{'pos0': {'gloss': 'a penalty', 'number': 'pl', 'gender': 'neut', 'type': 'substantive', 'case': 'gen'}}, {'pos1': {'gloss': '', 'number': 'pl', 'gender': 'masc', 'type': 'substantive', 'case': 'gen'}}, {'pos2': {'gloss': '', 'number': 'pl', 'gender': 'neut', 'type': 'substantive', 'case': 'gen'}}, {'pos3': {'gloss': '', 'number': 'pl', 'gender': 'masc', 'type': 'substantive', 'case': 'gen'}}, {'pos4': {'gloss': '', 'number': 'sg', 'gender': 'masc', 'type': 'substantive', 'case': 'acc'}}, {'pos5': {'gloss': '', 'number': 'sg', 'gender': 'neut', 'type': 'substantive', 'case': 'nom'}}, {'pos6': {'gloss': '', 'number': 'sg', 'gender': 'neut', 'type': 'substantive', 'case': 'acc'}}, {'pos6': {'gloss': '', 'number': 'sg', 'gender': 'neut', 'type': 'substantive', 'case': 'acc'}}, {'pos6': {'gloss': '', 'number': 'sg', 'gender': 'neut', 'type': 'substantive', 'case': 'acc'}}]\ntibi [{'pos0': {'gloss': '', 'number': 'sg', 'gender': 'masc/fem', 'type': 'substantive', 'case': 'dat'}}, {'pos0': {'gloss': '', 'number': 'sg', 'gender': 'masc/fem', 'type': 'substantive', 'case': 'dat'}}]\nesse [{'pos0': {'gloss': '', 'tense': 'pres', 'type': 'verb', 'voice': 'act'}}, {'pos1': 
{'gloss': '', 'tense': 'pres', 'type': 'verb', 'voice': 'act'}}]\nanimi [{'pos0': {'gloss': 'the rational soul in man', 'number': 'pl', 'gender': 'masc', 'type': 'substantive', 'case': 'voc'}}, {'pos0': {'gloss': 'the rational soul in man', 'number': 'pl', 'gender': 'masc', 'type': 'substantive', 'case': 'voc'}}, {'pos1': {'gloss': 'the rational soul in man', 'number': 'sg', 'gender': 'masc', 'type': 'substantive', 'case': 'gen'}}]\nscio [{'pos0': {'gloss': '', 'person': '1st', 'type': 'verb', 'number': 'sg', 'tense': 'pres', 'voice': 'act', 'mood': 'ind'}}, {'pos1': {'gloss': 'knowing', 'number': 'sg', 'gender': 'neut', 'type': 'substantive', 'case': 'abl'}}, {'pos2': {'gloss': 'knowing', 'number': 'sg', 'gender': 'masc', 'type': 'substantive', 'case': 'abl'}}, {'pos3': {'gloss': 'knowing', 'number': 'sg', 'gender': 'neut', 'type': 'substantive', 'case': 'dat'}}, {'pos4': {'gloss': 'knowing', 'number': 'sg', 'gender': 'masc', 'type': 'substantive', 'case': 'dat'}}]\nnam [{'pos0': {'gloss': '', 'type': 'conj', 'case': 'indeclform'}}]\netiam [{'pos0': {'gloss': '', 'type': 'conj', 'case': 'indeclform'}}]\nantequam [{'pos0': {'gloss': '', 'type': 'conj', 'case': 'indeclform'}}]\ninstrueres [{'pos0': {'gloss': '', 'person': '2nd', 'type': 'verb', 'number': 'sg', 'tense': 'imperf', 'voice': 'act', 'mood': 'subj'}}]\nte [{'pos0': {'gloss': '', 'number': 'sg', 'gender': 'masc/fem', 'type': 'substantive', 'case': 'abl'}}, {'pos0': {'gloss': '', 'number': 'sg', 'gender': 'masc/fem', 'type': 'substantive', 'case': 'abl'}}, {'pos1': {'gloss': '', 'number': 'sg', 'gender': 'masc/fem', 'type': 'substantive', 'case': 'acc'}}, {'pos1': {'gloss': '', 'number': 'sg', 'gender': 'masc/fem', 'type': 'substantive', 'case': 'acc'}}]\n...\n```\n\n# License\n\nThis software is, like the rest of the CLTK, licensed under the MIT license (see LICENSE). 
Perseus data comes from [the Perseus Hopper](https://sourceforge.net/projects/perseus-hopper) and is licensed under the [Mozilla Public License 1.1 (MPL 1.1)](http://www.mozilla.org/MPL/1.1/).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcltk%2Flatin_pos_lemmata_cltk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcltk%2Flatin_pos_lemmata_cltk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcltk%2Flatin_pos_lemmata_cltk/lists"}