https://github.com/defi0x1/google-vision-api
Text Annotation Using Google Vision API
https://github.com/defi0x1/google-vision-api
google-vision-api text-annotation
Last synced: 12 months ago
JSON representation
Text Annotation Using Google Vision API
- Host: GitHub
- URL: https://github.com/defi0x1/google-vision-api
- Owner: defi0x1
- License: mit
- Created: 2020-04-07T08:31:25.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2023-10-03T00:31:23.000Z (over 2 years ago)
- Last Synced: 2025-06-13T04:45:39.595Z (12 months ago)
- Topics: google-vision-api, text-annotation
- Language: Python
- Homepage:
- Size: 493 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 4
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Use Google Vision API to extract Text Annotations
## Description
Here Script use `Google Vision api` to extract `Text Annotations` in images.
## Requirement
* Python 3.x
* Credentials
## Setup
To install necessary library, simply use pip:
```bash
pip install google-cloud-vision
```
or,
```bash
pip install -r requirements.txt
```
Next, set up to authenticate with the Cloud Vision API using your project's service account credentials. See the [Vision API Client Libraries](https://cloud.google.com/vision/docs/libraries) for more information. Then, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to your downloaded service account credentials:
```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/credentials-key.json
```
## Quick Start: Running script
Text Detection
```bash
~ python vision.py --images ./images
```
Test Result
```bash
~ python test.py --gt ./ --output ./image_test --number_test 2
```
Convert to Pascal VOC data format
```bash
~ python convert_pascal_format.py --output output --input images --gt_path gt.pkl
```
# Result
## Words Annotations

## Character Annotations

## Image Suggestion Resizing
To enable accurate image detection within the Google Cloud Vision API, images should generally be a minimum of 640 x 480 pixels (about 300k pixels). Full details for different types of Vision API Feature requests are shown below:
| Vision API Feature | Recommended Size | Notes |
|---|---|---|
| FACE_DETECTION | 1600 x 1200 | Distance between eyes is most important |
| LANDMARK_DETECTION | 640 x 480 | |
| LOGO_DETECTION | 640 x 480 | |
| LABEL_DETECTION | 640 x 480 | |
| TEXT_DETECTION | 1024 x 768 | OCR requires more resolution to detect characters |
| SAFE_SEARCH_DETECTION | 640 x 480 | |