https://github.com/mnyrop/bunraku-ipy
ipython notebooks for processing bunraku collection data @cul :jp: :dolls:
digital-humanities ipython json-schema jupyter-notebook pandas-dataframe relational-model static-site
- Host: GitHub
- URL: https://github.com/mnyrop/bunraku-ipy
- Owner: mnyrop
- Created: 2017-05-26T20:28:34.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-10-14T02:35:08.000Z (over 7 years ago)
- Last Synced: 2024-11-15T00:38:19.875Z (3 months ago)
- Topics: digital-humanities, ipython, json-schema, jupyter-notebook, pandas-dataframe, relational-model, static-site
- Language: Jupyter Notebook
- Homepage:
- Size: 13.1 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# bunraku-ipy
Jupyter notebooks, etc., for processing data from the __[Barbara Curtis Adachi Bunraku](https://cul.github.io/bunraku-demo/)__ (Japanese Puppet Theater) Collection.
## pipeline(s):
#### online collection data / [bunraku-online.ipynb](https://github.com/mnyrop/bunraku-ipy/blob/master/bunraku-online.ipynb)

| | |
|-------------:|:-------------|
| ![#f03c15](https://placehold.it/15/f03c15/000000?text=+) | __CakePHP site powered by a relational MySQL database__ |
| 1 | MySQL dump to CSVs |
| 2 | Import CSVs into __[IPython](https://ipython.org/)__ as __[Pandas](http://pandas.pydata.org/)__ DataFrames |
| 3 | Merge relational data (from CSV join tables) onto DataFrames by type |
| 4 | Export DataFrames as JSON records (and CSVs, for archival purposes only) |
| 5 | Drop null key:value pairs from JSON (bash __[jq](https://stedolan.github.io/jq/)__) |
| 6 | Convert the null-free JSON to YAML (bash __[PyYAML](http://pyyaml.org/)__) |
| 7 | Generate __[Jekyll collections](https://jekyllrb.com/docs/collections/)__ (and pages) from YAML using __[Yaml-Splitter plugin](https://github.com/mnyrop/yaml-splitter)__ |
| ![#c5f015](https://placehold.it/15/c5f015/000000?text=+) | __Static Jekyll site powered by YAML data, with JSON index for static search__ |

#### total collection data / [bunraku-full.ipynb](https://github.com/mnyrop/bunraku-ipy/blob/master/bunraku-full.ipynb)
The data accessible on the original PHP site (and on the new Jekyll site) represents only about 60% of the information stored in the MySQL database. To preserve the full dataset for future use, I used a separate IPython notebook/pipeline that outputs CSVs and JSON without dropping images/media marked 'offline'.
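A minimal sketch of the merge, export, and null-dropping steps, and of how the two pipelines differ, might look like the following. It is illustrative only: the table names, column names, the `status` column used to mark offline media, and the `build_records` helper are all assumptions, since the real schema is not shown in this README. Null removal is done here in Python rather than with jq, to keep the example in one language.

```python
import io
import json

import pandas as pd

# Hypothetical stand-ins for CSVs dumped from MySQL. The real table and
# column names (including how 'offline' is flagged) are assumptions.
PLAYS_CSV = """id,title,author
1,Play A,Author X
2,Play B,
"""
IMAGES_CSV = """id,filename,status
10,a.jpg,online
11,b.jpg,offline
12,c.jpg,online
"""
PLAYS_IMAGES_CSV = """play_id,image_id
1,10
1,11
2,12
"""


def build_records(keep_offline=False):
    """Merge a join table onto a DataFrame and return null-free JSON records."""
    plays = pd.read_csv(io.StringIO(PLAYS_CSV))
    images = pd.read_csv(io.StringIO(IMAGES_CSV))
    joins = pd.read_csv(io.StringIO(PLAYS_IMAGES_CSV))

    # "Online" variant drops offline media; "full" variant keeps everything.
    if not keep_offline:
        images = images[images["status"] != "offline"]

    # Resolve the join table, then collapse to one list of filenames per play.
    linked = joins.merge(images, left_on="image_id", right_on="id")
    per_play = (
        linked.groupby("play_id")["filename"]
        .apply(list)
        .rename("images")
        .reset_index()
    )
    merged = plays.merge(per_play, how="left", left_on="id", right_on="play_id")
    merged = merged.drop(columns="play_id")

    # Export as JSON records (missing values become null), then drop null pairs.
    records = json.loads(merged.to_json(orient="records"))
    return [{k: v for k, v in r.items() if v is not None} for r in records]


online_records = build_records()
full_records = build_records(keep_offline=True)
print(json.dumps(online_records, indent=2))
```

With this toy data, the second play has no `author` key in the output because its value was null, and only `full_records` lists the image flagged offline.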
## stats:
There is also a Jupyter notebook for generating matplotlib graphs and D3-specific/refactored JSON for data visualization: __[bunraku-stats.ipynb](https://github.com/mnyrop/bunraku-ipy/blob/master/stats/bunraku-stats.ipynb)__.
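As a rough idea of what refactoring records into D3-friendly JSON can involve, the sketch below counts records per field value and emits the nested `name`/`children`/`value` shape that D3 hierarchy layouts commonly consume. The sample records, the `to_d3_hierarchy` helper, and the choice of this particular shape are assumptions; the notebook's actual output format is not described here.

```python
import json
from collections import Counter

# Hypothetical records; real field names in the collection data may differ.
records = [
    {"title": "Play A", "author": "Author X"},
    {"title": "Play B", "author": "Author X"},
    {"title": "Play C", "author": "Author Y"},
    {"title": "Play D"},  # records missing the key are skipped
]


def to_d3_hierarchy(records, key, root_name="collection"):
    """Count records per value of `key` and nest the counts under one root."""
    counts = Counter(r[key] for r in records if key in r)
    children = [{"name": name, "value": n} for name, n in counts.most_common()]
    return {"name": root_name, "children": children}


tree = to_d3_hierarchy(records, "author")
print(json.dumps(tree, indent=2))
```

The same counts could be passed to matplotlib for a bar chart; keeping the counting step separate from the output format makes it easy to serve both.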