Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/houqp/leptess
Productive and safe Rust binding for leptonica and tesseract
https://github.com/houqp/leptess
Last synced: 12 days ago
JSON representation
Productive and safe Rust binding for leptonica and tesseract
- Host: GitHub
- URL: https://github.com/houqp/leptess
- Owner: houqp
- License: mit
- Created: 2019-05-28T04:04:13.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-01-19T22:44:10.000Z (10 months ago)
- Last Synced: 2024-10-17T06:55:11.455Z (24 days ago)
- Language: Rust
- Homepage: https://houqp.github.io/leptess/leptess/index.html
- Size: 50.1 MB
- Stars: 258
- Watchers: 9
- Forks: 28
- Open Issues: 12
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Leptess
[![Test](https://github.com/houqp/leptess/actions/workflows/test.yml/badge.svg)](https://github.com/houqp/leptess/actions/workflows/test.yml)
[![Crates.io](https://img.shields.io/crates/v/leptess.svg)](https://crates.io/crates/leptess)
[![Docs](https://img.shields.io/badge/rust-docs-blue.svg)](https://houqp.github.io/leptess/leptess/index.html)Productive and safe Rust bindings/wrappers for Tesseract and Leptonica.
## Build dependencies
Make sure you have clang, Leptonica and Tesseract installed.
Tesseract should be version 4.0.0 or above.
### Ubuntu
```bash
sudo apt-get install libleptonica-dev libtesseract-dev clang
```You will also need to install tesseract language data based on your OCR needs:
```bash
sudo apt-get install tesseract-ocr-eng
```### Mac
```bash
brew install tesseract leptonica pkg-config
```### Windows
On Windows, this library uses Microsoft's [vcpkg](https://github.com/microsoft/vcpkg) to provide tesseract.
Please install [vcpkg](https://github.com/microsoft/vcpkg) and **set up user wide integration** or [vcpkg crate](https://crates.io/crates/vcpkg) won't be able to find the library.
To install tesseract:
```cmd
REM from the vcpkg directoryREM 32 bit
.\vcpkg install tesseract:x86-windowsREM 64 bit
.\vcpkg install tesseract:x64-windows
```To run the tests configure vcpkg-crate to find the tesseract library:
```cmd
SET VCPKGRS_DYNAMIC=true
cargo test
```## Usage
```rust
let mut lt = leptess::LepTess::new(None, "eng").unwrap();
lt.set_image("path/to/page.bmp");
println!("{}", lt.get_utf8_text().unwrap());
```For more examples, see [docs](https://houqp.github.io/leptess/leptess/index.html) and `examples` directory.
To run demos in `examples` directory, try:
```bash
cargo run --example low_level_ocr_full_page
```## Development
To run tests, you will need at Tesseract 4.x or 5.x to match what we have in
`tests/tessdata/eng.traineddata`. See GitHub config actions to see how to replicate
the setup.