{"id":20885850,"url":"https://github.com/xennis/epidoc-parser","last_synced_at":"2025-05-12T19:31:36.941Z","repository":{"id":43393488,"uuid":"253337152","full_name":"Xennis/epidoc-parser","owner":"Xennis","description":"Parser for EpiDoc (Epigraphic Documents in TEI XML)","archived":false,"fork":false,"pushed_at":"2024-07-15T23:00:58.000Z","size":79,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-07-16T03:02:33.573Z","etag":null,"topics":["epidoc","epigraphy","papyri","parser","tei-xml"],"latest_commit_sha":null,"homepage":"https://xennis.github.io/epidoc-parser/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Xennis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-04-05T21:38:05.000Z","updated_at":"2024-07-15T23:01:00.000Z","dependencies_parsed_at":"2024-04-02T21:28:00.534Z","dependency_job_id":"1566a9a6-6968-49e0-bac1-df373bc7e0ff","html_url":"https://github.com/Xennis/epidoc-parser","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xennis%2Fepidoc-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xennis%2Fepidoc-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xennis%2Fepidoc-parser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xennis%2Fepidoc-parser/manifests","owner_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/owners/Xennis","download_url":"https://codeload.github.com/Xennis/epidoc-parser/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225148802,"owners_count":17428430,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["epidoc","epigraphy","papyri","parser","tei-xml"],"created_at":"2024-11-18T08:14:50.565Z","updated_at":"2025-05-12T19:31:36.932Z","avatar_url":"https://github.com/Xennis.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# EpiDoc Parser\n\n[![Python](https://github.com/Xennis/epidoc-parser/actions/workflows/python.yml/badge.svg?branch=main\u0026event=push)](https://github.com/Xennis/epidoc-parser/actions/workflows/python.yml?query=event%3Apush+branch%3Amain)\n\nPython parser for EpiDoc (epigraphic documents in TEI XML).\n\nFor example [idp.data-sheet](https://github.com/Xennis/idp.data-sheet) uses the parser to generate a single CSV sheet of the [Papyri.info Integrating Digital Papyrology data](https://github.com/papyri/idp.data).\n\n## Usage\n\n### Installation \n\nInstall the package\n```shell\npip install git+https://github.com/Xennis/epidoc-parser\n```\n\n### Load a document\n\nLoad a document from a file\n```python\nimport epidoc\n\nwith open(\"my-epidoc.xml\") as f:\n    doc = epidoc.load(f)\n```\n\nLoad a document from a string\n```python\nimport epidoc\n\nmy_epidoc = \"\"\"\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003c?xml-model href=\"http://www.stoa.org/epidoc/schema/8.13/tei-epidoc.rng\" type=\"application/xml\" 
schematypens=\"http://relaxng.org/ns/structure/1.0\"?\u003e\n\u003cTEI xmlns=\"http://www.tei-c.org/ns/1.0\" xml:id=\"hgv74005\"\u003e\n   [...]\n\u003c/TEI\u003e\n\"\"\"\n\ndoc = epidoc.loads(my_epidoc)\n```\n\n### Get data from a document\n\nCall the attributes, for example\n```python\n\u003e\u003e\u003e doc.title\n\"Ordre de paiement\"\n\u003e\u003e\u003e doc.material\n\"ostrakon\"\n\u003e\u003e\u003e doc.languages\n{\"en\": \"Englisch\", \"la\": \"Latein\", \"el\": \"Griechisch\"}\n\u003e\u003e\u003e [t.get(\"text\") for t in doc.terms]\n[\"Anweisung\", \"Zahlung\", \"Getreide\"]\n\u003e\u003e\u003e doc.origin_place.get(\"text\")\n\"Kysis (Oasis Magna)\"\n\u003e\u003e\u003e doc.origin_dates[0]\n{\"notbefore\": \"0301\", \"notafter\": \"0425\", \"precision\": \"low\", \"text\": \"IV - Anfang V\"}\n```\n\n## Documentation\n\n| Field                     | EpiDoc source element (XPath)                                                  |\n|---------------------------|--------------------------------------------------------------------------------|\n| commentary                | `//body/div[@type='commentary' and @subtype='general']`                        |\n| edition_foreign_languages | `//body/div[@type='edition']//foreign/@xml:lang`                               |\n| edition_language          | `//body/div[@type='edition']/@xml:lang`                                        |\n| idno                      | `//teiHeader/fileDesc/publicationStmt/idno`                                    |\n| authority                 | `//teiHeader/fileDesc/publicationStmt/authority`                               |\n| availability              | `//teiHeader/fileDesc/publicationStmt/availability`                            |\n| languages                 | `//teiHeader/profileDesc/langUsage/language`                                   |\n| material                  | `//teiHeader/fileDesc/sourceDesc/msDesc/physDesc/objectDesc//support/material` |\n| origin_dates              | 
`//teiHeader/fileDesc/sourceDesc/msDesc/history/origin/origDate`               |\n| origin_place              | `//teiHeader/fileDesc/sourceDesc/msDesc/history/origin/origPlace`              |\n| provenances               | `//teiHeader/fileDesc/sourceDesc/msDesc/history/provenance`                    |\n| reprint_from              | `//body/ref[@type='reprint-from']`                                             |\n| reprint_in                | `//body/ref[@type='reprint-in']`                                               |\n| terms                     | `//teiHeader/profileDesc/textClass//term`                                      |\n| title                     | `//teiHeader/fileDesc/titleStmt/title`                                         |\n\n## Development\n\nCreate a virtual environment, activate it, and install the dependencies\n```shell\npython3 -m venv venv\n. venv/bin/activate\npip install --requirement requirements.txt\n```\n\nRun the unit tests\n```shell\nmake unittest\n```\n\n## LICENSE\n\n### Code\n\nSee [LICENSE](LICENSE)\n\n### Test data\n\nThe test data in this project comes from [idp.data](https://github.com/papyri/idp.data) by [Papyri.info](http://papyri.info). This data is made available under a [Creative Commons Attribution 3.0 License](http://creativecommons.org/licenses/by/3.0/), with copyright and attribution to the respective projects.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxennis%2Fepidoc-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxennis%2Fepidoc-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxennis%2Fepidoc-parser/lists"}