https://github.com/isawnyu/oracc2csv

The Open Richly Annotated Cuneiform Corpus (ORACC) publishes JSON data for each of its projects. Sometimes you want the catalog data listing each text to be in CSV format. This package does that.

csv cuneiform json oracc

Host: GitHub
URL: https://github.com/isawnyu/oracc2csv
Owner: isawnyu
License: agpl-3.0
Created: 2022-06-26T09:46:41.000Z (about 4 years ago)
Default Branch: main
Last Pushed: 2022-06-26T09:47:06.000Z (about 4 years ago)
Last Synced: 2026-04-30T13:34:55.346Z (2 months ago)
Topics: csv, cuneiform, json, oracc
Language: Python
Homepage:
Size: 8.68 MB
Stars: 1
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt

README

# oracc2csv

The [Open Richly Annotated Cuneiform Corpus (ORACC)](http://oracc.museum.upenn.edu/) publishes JSON data for each of its projects. Sometimes you want the catalog data listing each text to be in CSV format. This package does that.

This program was written by [Tom Elliott](https://orcid.org/0000-0002-4114-6677) for the [Institute for the Study of the Ancient World (NYU)](https://isaw.nyu.edu) and is Copyright 2022 by New York University. It is licensed under the GNU Affero General Public License (see LICENSE.txt).

## Install

Create a Python virtual environment (version 3.10.4 or later). Download or clone this package from GitHub. Run:

```
pip install -U -r requirements_dev.txt
```

## Use

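Download the zip file of the ORACC project you're interested in (e.g., http://oracc.org/json/hbtin.zip) and unzip it. Then run the oracc2csv `dump` script, passing the project directory and an output directory. The example below uses `~/oracc/hbtin` as the project directory and writes to `~/scratch`: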

```
> python scripts/dump.py -v ~/oracc/hbtin ~/scratch
INFO:root:logging level changed to INFO via command line option; was WARNING
INFO:oracc2csv:Loaded corpus from /Users/banana/oracc/hbtin:
HBTIN: Hellenistic Babylonia: Texts, Iconography, Names
Cuneiform texts, iconography and onomastic data from Hellenistic Babylonia, primarily from Uruk. HBTIN texts form the demonstrator corpus of the Berkeley Prosopography Service (BPS).  Directed by Laurie Pearce at UC Berkeley.
572 entries
INFO:oracc2csv:Wrote corpus to /Users/banana/scratch
```
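
To give a rough idea of the kind of conversion involved, here is a minimal sketch of flattening a catalog JSON file into CSV using only the Python standard library. It is not the code used by `scripts/dump.py`. It assumes the catalog file holds a top-level `members` object that maps text IDs to flat records; the function name `catalog_to_csv` and the `id` column are hypothetical, chosen for illustration only.

```python
import csv
import json


def catalog_to_csv(catalog_path, csv_path):
    """Write one CSV row per catalog entry and return the number of entries.

    Assumption (not confirmed by this README): the JSON file has a top-level
    "members" object mapping text IDs to dictionaries of field values.
    """
    with open(catalog_path, encoding="utf-8") as f:
        catalog = json.load(f)
    members = catalog.get("members", {})

    # Entries need not share the same fields, so build the header from the
    # union of all keys, preserving first-seen order.
    fieldnames = ["id"]
    for record in members.values():
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        # restval fills in fields that a given entry lacks.
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        for text_id, record in members.items():
            writer.writerow({"id": text_id, **record})

    return len(members)
```

A call such as `catalog_to_csv("catalogue.json", "catalogue.csv")` would then produce one row per text, with empty cells where an entry lacks a field. Both file names here are placeholders.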
