https://github.com/nakamura196/iiif_tei_py

Creating TEI/XML with results from OCR using Google Cloud Vision API

iiif ocr python tei

iiif_tei_py
================

Create TEI/XML from OCR results produced by the Google Cloud Vision API.

## Install

``` sh
pip install git+https://github.com/nakamura196/iiif_tei_py
```

## Basic usage

### Prepare `.env` file

``` txt:.env
GOOGLE_APPLICATION_CREDENTIALS=your-google-credentials.json
```

### Prepare Google Cloud Vision API credentials

``` python
from iiif_tei_py.core import CoreClient

# Load settings from .env; the return value is used below as the credentials path
cred_path = CoreClient.load_env()
```
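
If the credentials are not picked up, it can help to check the `.env` file by hand. The sketch below is not part of `iiif_tei_py` and makes no claim about how `load_env()` works internally. `read_env_file` and `check_credentials` are hypothetical helper names, and the parser only handles plain `KEY=VALUE` lines.

``` python
from pathlib import Path


def read_env_file(env_path):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    for raw_line in Path(env_path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without an "="
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_credentials(env_path=".env"):
    """Return the credentials path named in .env, or raise if it is unusable."""
    values = read_env_file(env_path)
    cred = values.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not cred:
        raise KeyError(f"GOOGLE_APPLICATION_CREDENTIALS is not set in {env_path}")
    # A relative path is resolved against the current working directory
    if not Path(cred).is_file():
        raise FileNotFoundError(f"credentials file not found: {cred}")
    return cred
```

Calling `check_credentials()` from the directory that holds `.env` either returns the path to the JSON key file or raises an error that says what is missing.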

### Main

``` python
# Image to run OCR on
url = "https://iiif.io/api/presentation/2.1/example/fixtures/resources/page1-full.png"
# Where the generated TEI/XML is written
output_tei_xml_file_path = "./tmp/01/output.xml"
CoreClient.create_tei_xml_with_gocr(url, output_tei_xml_file_path, cred_path, title="Sample")
```

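On platforms other than Linux and Windows, running this may print an informational log line such as `I0000 00:00:1723108517.494375 11044252 check_gcp_environment_no_op.cc:29] ALTS: Platforms other than Linux and Windows are not supported`.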

## Advanced usage

### OCR with Google Cloud Vision API

#### URL

``` python
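from iiif_tei_py.core import IIIFClient

# Reuses `url` and `cred_path` from the basic usage above
iiif = IIIFClient(cred_path)
# Where the generated IIIF manifest (JSON) is written
iiif_manifest_file_path = "./tmp/02/output.json"
iiif.create_manifest3_by_gocr(url, iiif_manifest_file_path, tmp_dir="./tmp", title="Sample")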
```

#### Local file

``` python
# Same as the URL example, but the input is an image file on disk
iiif_manifest_file_path_local = "./tmp/03/output.json"
input_image_file_path = "./tmp/02/images/demo.jpg"
iiif.create_manifest3_local_by_gocr(input_image_file_path, iiif_manifest_file_path_local, title="Sample")
```
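
To check what the OCR step produced, the manifest can be read back as plain JSON. The sketch below assumes the annotations are embedded in each canvas under `annotations` and then `items`, as the IIIF Presentation 3 data model allows. The files written by `iiif_tei_py` may be organized differently (for example with annotation pages stored separately), and `iter_ocr_annotations` is a hypothetical helper, not a library function.

``` python
def iter_ocr_annotations(manifest):
    """Yield (text, target) pairs from annotation pages embedded in each canvas."""
    for canvas in manifest.get("items", []):
        for page in canvas.get("annotations", []):
            for anno in page.get("items", []):
                body = anno.get("body", {})
                # A body may be a single object or a list of objects
                bodies = body if isinstance(body, list) else [body]
                text = " ".join(
                    b.get("value", "") for b in bodies if isinstance(b, dict)
                )
                yield text, anno.get("target")


# Tiny hand-written manifest, used only to illustrate the assumed layout
sample_manifest = {
    "type": "Manifest",
    "items": [
        {
            "id": "https://example.org/canvas/1",
            "type": "Canvas",
            "annotations": [
                {
                    "type": "AnnotationPage",
                    "items": [
                        {
                            "type": "Annotation",
                            "body": {"type": "TextualBody", "value": "Hello"},
                            "target": "https://example.org/canvas/1#xywh=10,20,300,40",
                        }
                    ],
                }
            ],
        }
    ],
}

for text, target in iter_ocr_annotations(sample_manifest):
    print(text, target)
```

For a manifest on disk, load it with `json.load` first and pass the resulting dict to the same function.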

### Convert IIIF manifest to TEI/XML

``` python
from iiif_tei_py.core import TEIClient

tei_client = TEIClient()
output_tei_xml_file_path = "./tmp/04/output.xml"
# Convert the annotations in the manifest from the URL example into TEI zones
tei_client.convert_manifest3_annotations_to_zones(iiif_manifest_file_path, output_tei_xml_file_path)
```
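
As a rough illustration of the idea behind this conversion, the sketch below maps an annotation target of the form `...#xywh=x,y,w,h` to a TEI `<zone>` with `ulx`, `uly`, `lrx` and `lry` corner attributes. It is an independent example under stated assumptions, not the code used by `convert_manifest3_annotations_to_zones`: targets are assumed to be plain strings with an `xywh` fragment, and the helper names and `zone_N` ids are made up for illustration.

``` python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize TEI elements with a default namespace instead of an ns0: prefix
ET.register_namespace("", TEI_NS)


def xywh_to_zone_attrs(target):
    """Turn '...#xywh=x,y,w,h' into upper-left / lower-right corner attributes."""
    if "#xywh=" not in target:
        raise ValueError(f"no xywh fragment in target: {target}")
    fragment = target.split("#xywh=", 1)[1]
    x, y, w, h = (int(float(v)) for v in fragment.split(","))
    return {"ulx": str(x), "uly": str(y), "lrx": str(x + w), "lry": str(y + h)}


def annotations_to_surface(annotations):
    """Build a TEI <surface> element holding one <zone> per annotation."""
    surface = ET.Element(f"{{{TEI_NS}}}surface")
    for i, anno in enumerate(annotations, start=1):
        zone = ET.SubElement(
            surface, f"{{{TEI_NS}}}zone", xywh_to_zone_attrs(anno["target"])
        )
        # Hypothetical id scheme, so that text lines could point at the zone
        zone.set(f"{{{XML_NS}}}id", f"zone_{i}")
    return surface


surface = annotations_to_surface(
    [{"target": "https://example.org/canvas/1#xywh=10,20,300,40"}]
)
print(ET.tostring(surface, encoding="unicode"))
```

The lower-right corner is computed as `x + w` and `y + h`, because an `xywh` fragment gives a width and height while a TEI zone records two corners.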