https://github.com/simonw/textract-cli
CLI for running files through AWS Textract
- Host: GitHub
- URL: https://github.com/simonw/textract-cli
- Owner: simonw
- License: apache-2.0
- Created: 2024-03-29T17:23:10.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-31T16:42:19.000Z (about 2 years ago)
- Last Synced: 2025-11-19T12:27:57.082Z (7 months ago)
- Language: Python
- Size: 13.7 KB
- Stars: 54
- Watchers: 1
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# textract-cli
[PyPI](https://pypi.org/project/textract-cli/)
[Releases](https://github.com/simonw/textract-cli/releases)
[Tests](https://github.com/simonw/textract-cli/actions/workflows/test.yml)
[License](https://github.com/simonw/textract-cli/blob/master/LICENSE)
CLI for running files through AWS Textract
## Installation
Install this tool using `pip`:
```bash
pip install textract-cli
```
## Configuration
Any of the [methods for configuring](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html) `boto3` will work with this tool. Environment variables or a `~/.aws/config` file are good options here.
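As a quick sanity check before running the tool, the sketch below reports whether either of the two options mentioned above is present. The helper name `describe_aws_config` is hypothetical and not part of `textract-cli`; the environment variable names are the standard ones `boto3` reads. It does not cover every source `boto3` consults (for example `~/.aws/credentials`), so treat a negative result as a hint rather than a verdict.

```python
import os
from pathlib import Path

# Standard environment variable names that boto3 reads.
CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
REGION_VAR = "AWS_DEFAULT_REGION"


def describe_aws_config(env=None, home=None):
    """Report which configuration options are present (hypothetical helper)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    return {
        # True only if both credential variables are set and non-empty
        "env_credentials": all(env.get(name) for name in CREDENTIAL_VARS),
        "env_region": bool(env.get(REGION_VAR)),
        # Default config location; AWS_CONFIG_FILE overrides are not checked
        "config_file": (home / ".aws" / "config").is_file(),
    }


if __name__ == "__main__":
    for source, present in describe_aws_config().items():
        print(f"{source}: {'found' if present else 'not found'}")
```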
## Usage
To run Textract OCR against a JPEG or PNG file (must be smaller than 5MB):
```bash
textract-cli image.jpeg
```
This writes the extracted text to standard output. To save it to a file, redirect the output:
```bash
textract-cli image.jpeg > output.txt
```
Or use the `-o/--output` option like this:
```bash
textract-cli image.jpeg -o output.txt
```
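To process several images, one option is to call the command once per file from a script. The following is a minimal sketch, assuming `textract-cli` is on your `PATH`; the function names and the choice to write each result next to its image with a `.txt` suffix are illustrative, not something the tool does itself.

```python
import subprocess
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_command(image):
    """Build the textract-cli invocation for one image (hypothetical helper)."""
    image = Path(image)
    # Uses only the -o option documented above
    return ["textract-cli", str(image), "-o", str(image.with_suffix(".txt"))]


def ocr_directory(directory):
    """Run textract-cli on every JPEG or PNG in a directory, in name order."""
    for image in sorted(Path(directory).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            # check=True stops at the first failing file
            subprocess.run(build_command(image), check=True)
```

Each call is a separate Textract request, so AWS charges and rate limits apply per file.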
For help, run:
```bash
textract-cli --help
```
You can also use:
```bash
python -m textract_cli --help
```
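For readers who want to see what running OCR on a single image involves, here is a rough sketch of the underlying Textract call made directly with `boto3`. It is not taken from the `textract-cli` source, and the real tool may behave differently. The helper names are hypothetical, and reading "5MB" as 5 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

MAX_BYTES = 5 * 1024 * 1024  # assumption: "5MB" interpreted as 5 MiB


def check_image(path):
    """Return the image bytes, or raise ValueError if the file looks unsuitable."""
    path = Path(path)
    if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
        raise ValueError(f"{path.name}: expected a JPEG or PNG file")
    data = path.read_bytes()
    if len(data) >= MAX_BYTES:
        raise ValueError(f"{path.name}: {len(data)} bytes is not smaller than 5MB")
    return data


def lines_from_response(response):
    """Join the text of every LINE block in a DetectDocumentText response."""
    return "\n".join(
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    )


def ocr_image(path):
    """Send one image to Textract and return the detected text."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("textract")
    response = client.detect_document_text(Document={"Bytes": check_image(path)})
    return lines_from_response(response)
```

Keeping the response parsing in its own function makes it possible to test against a saved or hand-written response without calling AWS.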
## Alternatives
[amazon-textract-textractor](https://aws-samples.github.io/amazon-textract-textractor/commandline.html) is an Amazon project offering a similar but much more comprehensive CLI.
## Development
To contribute to this tool, first check out the code. Then create a new virtual environment:
```bash
cd textract-cli
python -m venv venv
source venv/bin/activate
```
Now install the dependencies and test dependencies:
```bash
pip install -e '.[test]'
```
To run the tests:
```bash
pytest
```