https://github.com/altomator/alto-iiif
Extracting illustrations from ALTO documents with IIIF
Extracting illustrations from ALTO documents with IIIF
- Host: GitHub
- URL: https://github.com/altomator/alto-iiif
- Owner: altomator
- Created: 2015-12-31T17:46:31.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2016-01-18T16:03:17.000Z (almost 9 years ago)
- Last Synced: 2024-07-30T19:09:34.824Z (5 months ago)
- Topics: alto-xml, iiif, perl
- Language: Perl
- Size: 2.92 MB
- Stars: 5
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
## Extracting illustrations from ALTO files with IIIF
### Synopsis
Extract the illustrations described in OCRed documents (ALTO format) using the IIIF API.

[Full presentation in French](https://altomator.wordpress.com/2015/11/15/extraire-les-illustrations-dune-collection-de-documents-alto-avec-iiif/)
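
To make the idea concrete, here is a minimal sketch of how the position of an ALTO illustration block (its `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes) could be turned into an IIIF Image API request for that region. It is not taken from the scripts in this repository. The function name, the base URL and the identifier are hypothetical. The sketch also assumes that the ALTO coordinates are already expressed in pixels and that the server follows Image API 2.x conventions (`full` size, `default` quality).

```shell
#!/bin/sh
# Build an IIIF Image API URL for one illustration region.
# Assumptions (not from the original scripts): the function name, the
# example base URL and identifier, and ALTO coordinates already in pixels.
iiif_region_url() {
    base=$1      # IIIF image service base URL (hypothetical)
    id=$2        # image identifier on that service (hypothetical)
    hpos=$3      # ALTO HPOS   -> x of the region
    vpos=$4      # ALTO VPOS   -> y of the region
    width=$5     # ALTO WIDTH  -> w of the region
    height=$6    # ALTO HEIGHT -> h of the region
    # IIIF pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    printf '%s/%s/%s,%s,%s,%s/full/0/default.jpg\n' \
        "$base" "$id" "$hpos" "$vpos" "$width" "$height"
}

# Example call with made-up values.
iiif_region_url "https://iiif.example.org/image" "page-0001" 120 340 800 600
```

If the ALTO file declares a measurement unit other than pixels, the four values would first need to be converted using the scan resolution.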
### Installation
You will need 4 scripts:

1. filterIMG.sh (shell)
2. processURLs.pl (Perl)
3. extractIMG.pl (Perl)
4. extractMD.pl (Perl)

A batch.sh script chains the commands.

The documents must be stored in a "DOCS" folder.
The images will be generated in an "IMG" folder.
The metadata will be generated in an "MD" folder.

### Tests
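
Before running the steps, the folder layout can be prepared with a small shell sketch. It assumes that the three folders live in the current working directory. The README does not say whether the scripts create them on their own, so this is only a precaution. The function names are hypothetical.

```shell
#!/bin/sh
# Prepare the working folders named in this README.
# Assumption: all three folders sit in the current working directory.
prepare_folders() {
    mkdir -p DOCS IMG MD
}

# Count the regular files under DOCS so that an empty input folder is
# noticed before any extraction is attempted.
count_docs() {
    find DOCS -type f | wc -l | tr -d ' '
}

prepare_folders
echo "DOCS contains $(count_docs) file(s)"
```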
1. Open a command-line terminal.
2. Run `filterIMG.sh`
3. Run `perl processURLs.pl illustrations.txt`
4. Run `perl extractIMG.pl illustrations.txt_URL 200` (the last argument is the minimum size, in KB, of the extracted images)
5. Run `perl extractMD.pl illustrations.txt_URL`

## License
CC0