Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nakamura196/iiif_tei_py
Creating TEI/XML with results from OCR using Google Cloud Vision API
- Host: GitHub
- URL: https://github.com/nakamura196/iiif_tei_py
- Owner: nakamura196
- License: apache-2.0
- Created: 2024-08-07T12:59:19.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-08-08T09:34:31.000Z (5 months ago)
- Last Synced: 2024-11-05T16:38:11.139Z (2 months ago)
- Topics: iiif, ocr, python, tei
- Language: Jupyter Notebook
- Homepage: https://nakamura196.github.io/iiif_tei_py/
- Size: 378 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
iiif_tei_py
================
Creating TEI/XML with results from OCR using Google Cloud Vision API
## Install
``` sh
pip install git+https://github.com/nakamura196/iiif_tei_py
```
## Basic usage
### Prepare `.env` file
``` txt:.env
GOOGLE_APPLICATION_CREDENTIALS=your-google-credentials.json
```
### Prepare Google Cloud Vision API credentials
``` python
from iiif_tei_py.core import CoreClient
cred_path = CoreClient.load_env()
```
### Main
``` python
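# Image to OCR (an IIIF example fixture)
url = "https://iiif.io/api/presentation/2.1/example/fixtures/resources/page1-full.png"
# Where the generated TEI/XML is written
output_tei_xml_file_path = "./tmp/01/output.xml"
# cred_path is the value returned by CoreClient.load_env() above
CoreClient.create_tei_xml_with_gocr(url, output_tei_xml_file_path, cred_path, title="Sample")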
```
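Before running the steps above, it can help to confirm that `.env` really points at an existing credentials file. The following is a minimal standard-library sketch and is not part of `iiif_tei_py`: the helper name `parse_credentials_path` is hypothetical, and it assumes the file contains plain `KEY=value` lines like the one shown in "Prepare `.env` file".

```python
from pathlib import Path
from typing import Optional


def parse_credentials_path(env_text: str) -> Optional[str]:
    """Return the GOOGLE_APPLICATION_CREDENTIALS value found in .env text, or None."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "GOOGLE_APPLICATION_CREDENTIALS":
            # Drop optional surrounding quotes.
            return value.strip().strip("\"'")
    return None


# Assumption: .env lives in the current working directory.
env_file = Path(".env")
cred = (
    parse_credentials_path(env_file.read_text(encoding="utf-8"))
    if env_file.is_file()
    else None
)
if cred is None:
    print("GOOGLE_APPLICATION_CREDENTIALS not found in .env")
elif not Path(cred).is_file():
    print(f"Credentials file does not exist: {cred}")
else:
    print(f"Credentials file found: {cred}")
```

This only checks that the path resolves to a file; it does not validate the credentials themselves.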
## Advanced usage
### OCR with Google Cloud Vision API
#### URL
``` python
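from iiif_tei_py.core import IIIFClient
# cred_path and url are the values defined in "Basic usage"
iiif = IIIFClient(cred_path)
# Where the generated IIIF manifest is written
iiif_manifest_file_path = "./tmp/02/output.json"
iiif.create_manifest3_by_gocr(url, iiif_manifest_file_path, tmp_dir="./tmp", title="Sample")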
```
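To check what the OCR step produced, the manifest can be walked for text annotations. The sketch below is not part of `iiif_tei_py`. It assumes the output follows the usual IIIF Presentation API 3.0 layout (canvases in `items`, annotation pages in `annotations`, a `TextualBody` body, and a string target ending in `#xywh=x,y,w,h`); the actual structure written by `create_manifest3_by_gocr` may differ. The helper names and the inline `sample_manifest` are hypothetical.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


def parse_xywh(target: str) -> Box:
    """Extract the xywh media fragment from an annotation target string."""
    fragment = target.split("#xywh=", 1)[1]
    x, y, w, h = (int(float(v)) for v in fragment.split(","))
    return x, y, w, h


def iter_text_annotations(manifest: dict) -> Iterator[Tuple[str, Box]]:
    """Yield (text, box) for every annotation with a string xywh target."""
    for canvas in manifest.get("items", []):
        for page in canvas.get("annotations", []):
            for anno in page.get("items", []):
                body = anno.get("body", {})
                target = anno.get("target", "")
                if (
                    isinstance(body, dict)
                    and isinstance(target, str)
                    and "#xywh=" in target
                ):
                    yield body.get("value", ""), parse_xywh(target)


# Hypothetical manifest fragment used only for illustration.
sample_manifest = {
    "type": "Manifest",
    "items": [{
        "id": "https://example.org/canvas/1",
        "type": "Canvas",
        "annotations": [{
            "type": "AnnotationPage",
            "items": [{
                "type": "Annotation",
                "body": {"type": "TextualBody", "value": "Sample"},
                "target": "https://example.org/canvas/1#xywh=10,20,300,40",
            }],
        }],
    }],
}

for text, box in iter_text_annotations(sample_manifest):
    print(text, box)
```

To inspect a real result, load `./tmp/02/output.json` with `json.load` and pass the resulting dict to `iter_text_annotations` instead of `sample_manifest`.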
#### Local file
``` python
# Where the manifest for the local image is written
iiif_manifest_file_path_local = "./tmp/03/output.json"
# Local image to OCR
input_image_file_path = "./tmp/02/images/demo.jpg"
iiif.create_manifest3_local_by_gocr(input_image_file_path, iiif_manifest_file_path_local, title="Sample")
```
### Convert IIIF manifest to TEI/XML
``` python
from iiif_tei_py.core import TEIClient
tei_client = TEIClient()
output_tei_xml_file_path = "./tmp/04/output.xml"
# iiif_manifest_file_path is the manifest created in the "URL" example above
tei_client.convert_manifest3_annotations_to_zones(iiif_manifest_file_path, output_tei_xml_file_path)
```
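The resulting TEI/XML can be inspected with the standard library. This is a sketch under stated assumptions, not the library's own API: it assumes the output uses the TEI namespace and stores each region as a `<zone>` element with the standard `ulx`, `uly`, `lrx`, `lry` coordinate attributes. The helper `list_zones` and the inline `sample_tei` document are hypothetical, and the file actually written by `convert_manifest3_annotations_to_zones` may be structured differently.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical TEI fragment used only for illustration.
sample_tei = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <facsimile>
    <surface>
      <zone xml:id="zone_1" ulx="10" uly="20" lrx="310" lry="60"/>
    </surface>
  </facsimile>
</TEI>"""


def list_zones(tei_xml: str) -> List[Tuple[Optional[str], Tuple[int, int, int, int]]]:
    """Return (xml:id, (ulx, uly, lrx, lry)) for every zone in the document."""
    root = ET.fromstring(tei_xml)
    zones = []
    for zone in root.iterfind(".//tei:zone", TEI_NS):
        # Missing coordinate attributes default to 0 in this sketch.
        coords = tuple(int(zone.get(name, "0")) for name in ("ulx", "uly", "lrx", "lry"))
        zones.append((zone.get(XML_ID), coords))
    return zones


for zone_id, coords in list_zones(sample_tei):
    print(zone_id, coords)
```

For a real file, read `./tmp/04/output.xml` as text and pass its contents to `list_zones`.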