{"id":41315015,"url":"https://github.com/dracor-org/gerdracor","last_synced_at":"2026-01-23T05:29:46.014Z","repository":{"id":39620169,"uuid":"75316529","full_name":"dracor-org/gerdracor","owner":"dracor-org","description":"German Drama Corpus","archived":false,"fork":false,"pushed_at":"2026-01-13T16:45:57.000Z","size":144862,"stargazers_count":10,"open_issues_count":3,"forks_count":13,"subscribers_count":9,"default_branch":"main","last_synced_at":"2026-01-13T18:43:29.191Z","etag":null,"topics":["corpus","digital-humanities","drama","dramatic-texts","tei","xml"],"latest_commit_sha":null,"homepage":null,"language":"CSS","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dracor-org.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2016-12-01T17:34:38.000Z","updated_at":"2026-01-13T16:46:01.000Z","dependencies_parsed_at":"2025-10-22T09:29:51.372Z","dependency_job_id":null,"html_url":"https://github.com/dracor-org/gerdracor","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dracor-org/gerdracor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dracor-org%2Fgerdracor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dracor-org%2Fgerdracor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dracor-org%2Fgerdracor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dracor-or
g%2Fgerdracor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dracor-org","download_url":"https://codeload.github.com/dracor-org/gerdracor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dracor-org%2Fgerdracor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28680694,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-23T04:33:33.518Z","status":"ssl_error","status_checked_at":"2026-01-23T04:33:30.433Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["corpus","digital-humanities","drama","dramatic-texts","tei","xml"],"created_at":"2026-01-23T05:29:44.789Z","updated_at":"2026-01-23T05:29:46.004Z","avatar_url":"https://github.com/dracor-org.png","language":"CSS","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GerDraCor\n## Corpus Description\nThis is the German Drama Corpus (GerDraCor), a collection of [TEI P5](https://tei-c.org/guidelines/p5/)-encoded German-language plays from the 1500s to the 1940s. The corpus is released under the Creative Commons Zero copyright waiver ([CC0](https://creativecommons.org/share-your-work/public-domain/cc0/)).\n\nIf you want to cite the corpus, please use this publication:\n\n- **Fischer, Frank, et al. (2019)**. 
Programmable Corpora: Introducing DraCor, an Infrastructure for the Research on European Drama. In *Proceedings of DH2019: \"Complexities\"*, Utrecht University, [doi:10.5281/zenodo.4284002](https://doi.org/10.5281/zenodo.4284002).\n\nWe started to build the corpus by extracting all plays from TextGrid Repository (TGRep). The source for the versions in TGRep was [zeno.org's](http://www.zeno.org/) text collection. However, TGRep's conversion from zeno.org's proprietary XML to TEI introduced some bugs and inconsistencies, which we fixed for GerDraCor in an extended process between 2017 and 2019. [All our fixes, including enhancements, are documented on GerDraCor's Wiki.](https://github.com/dracor-org/gerdracor/wiki/Documentation-for-Correcting-Plays-from-TextGrid-Repository) After this clean-up process, GerDraCor is now in a position to grow by taking on new plays from sources such as Deutsches Textarchiv, Project Gutenberg, Projekt Gutenberg-DE, Wikisource, or Google Books.\n\nGerDraCor is an autonomous corpus and will be maintained independently. 
Yet it is also integrated into the [dracor.org website](https://dracor.org/), the showcase for our newly introduced **\"Programmable Corpora\"** concept.\n\nIf you just want to download the corpus in its current state as TEI XML, clone the repository (the TEI files are in the `tei` folder):\n\n`git clone --depth 1 https://github.com/dracor-org/gerdracor.git`\n\n### Credits\n\n* Editors: [Frank Fischer](https://lehkost.github.io/), [Peer Trilcke](https://www.uni-potsdam.de/de/lit-19-jhd/peertrilcke/)\n* Support during the initial compilation of the corpus from TextGrid Repository: Mathias Göbel, [Dario Kampkaspar](https://www.ulb.tu-darmstadt.de/die_bibliothek/ueberuns/organisation/kontakt_details_17792.en.jsp) (Technical University of Darmstadt/ACDH-CH, Vienna)\n* Additional encoders: [Erik Renz](https://www.germanistik.uni-rostock.de/personen/wiss-mitarbeitende/erik-renz/) (University of Rostock)\n* Bibliographic research: [Lilly Welz](https://www.temporal-communities.de/people/welz/index.html) (Freie Universität Berlin)\n* Character annotations: [Nathalie Wiedmer](https://uni-tuebingen.de/forschung/forschungsschwerpunkte/sonderforschungsbereiche/sfb-andere-aesthetik/organisation/mitglieder-alphabetisch/nathalie-wiedmer/) (University of Tübingen), [Janis Pagel](https://janispagel.de/), [Nils Reiter](https://nilsreiter.de/) (both University of Cologne)\n\n### Character Relations\n\nCharacter relations encode the information provided in the *dramatis personae* and make it machine-readable. 
They mainly cover family and power relations.\n\nThe following relations have been annotated (by [Nathalie Wiedmer et al.](https://quadrama.github.io/publications/Wiedmer2020aa)):\n\n| Relation label | Directed/Undirected | Description |\n| ----- | ----- | ------ |\n| `parent_of` | directed | One character is a parent of the other |\n| `lover_of` | directed | For lovers |\n| `related_with` | directed | Other **family** relations (e.g., uncles) |\n| `associated_with` | directed | For clearly associated characters (e.g., butlers) |\n| `siblings` | undirected | Characters that have at least one parent in common |\n| `spouses` | undirected | Characters who are married (or engaged) |\n| `friends` | undirected | Characters marked as being friends |\n\nAll relations are encoded in XML in a `\u003clistRelation\u003e` element inside `\u003clistPerson\u003e`. Directed relations are encoded with an `active` and a `passive` attribute, where the active part is the one that precedes the relation label when the relation is expressed as a sentence. E.g., *Odoardo is parent of Emilia* translates to this:\n\n    \u003crelation name=\"parent_of\" active=\"#odoardo_galotti\" passive=\"#emilia\" /\u003e\n\nUndirected relations use the `mutual` attribute to collect all IDs that are part of a relationship:\n\n    \u003crelation name=\"spouses\" mutual=\"#baerbel #adam\"/\u003e\n\nThe label from the table above is contained in the `name` attribute.\n\n## API\nAn easy way to download the network data (instead of the actual TEI files) is to use our API ([documentation here](https://dracor.org/doc/api)). If you have [jq](https://blog.appoptics.com/jq-json/) installed, it works like this:\n\n```\nfor play in `curl 'https://dracor.org/api/corpora/ger' | jq -r \".dramas[] .name\"`; do\n    wget -O \"$play\".csv https://dracor.org/api/corpora/ger/play/\"$play\"/networkdata/csv\ndone\n```\n\nThe API info page is at `https://dracor.org/api/info`. 
It also tells you which version of eXist-db we're running on dracor.org.\n\n## Simple Visualisation with R\nTo take a first look at the distribution of the number of speakers per play over time, you could feed the metadata table into R:\n\n```\nlibrary(data.table)\nlibrary(ggplot2)\ngerdracor \u003c- fread(\"https://dracor.org/api/corpora/ger/metadata/csv\")\nggplot(gerdracor[], aes(x = yearNormalized, y = numOfSpeakers)) + geom_point()\n```\n\nResult:\n\n![number of speakers per play over time](numOfSpeakers.png)\n\nHere is a barplot showing the number of plays per decade (outdated, not containing most recent changes):\n\n![number of plays per decade](playsPerDecade.png)\n\n## A Bit of History\nUntil we rebuilt our working corpus under its new name GerDraCor, we had been working with an [intermediary format](https://github.com/dlina/project/tree/master/data/zwischenformat) to conduct [our research](https://dlina.github.io/talks/). This format only held structural information, not the texts themselves. Back then, our research group called itself DLINA (digitally-enabled literary network analysis). Since our focus has broadened, we have stopped using this name. Our future endeavours will sail under the **Programmable Corpora** flag.\n\n(README last updated on January 21, 2026.)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdracor-org%2Fgerdracor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdracor-org%2Fgerdracor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdracor-org%2Fgerdracor/lists"}