https://github.com/sped0n/texifast
Fast LaTeX OCR powered by texify, but without bloated dependencies like torch or transformers.
latex-ocr ocr onnxruntime texify
- Host: GitHub
- URL: https://github.com/sped0n/texifast
- Owner: Sped0n
- License: mit
- Created: 2024-10-11T16:52:16.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-22T12:03:51.000Z (over 1 year ago)
- Last Synced: 2025-12-01T22:11:58.249Z (6 months ago)
- Topics: latex-ocr, ocr, onnxruntime, texify
- Language: Python
- Homepage: https://pypi.org/project/texifast/
- Size: 1.06 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# texifast
LaTeX and markdown OCR powered by [texify](https://github.com/VikParuchuri/texify), without bloated dependencies like torch or transformers.
## Features
- Minimal dependency graph
- Compared to [Optimum](https://github.com/huggingface/optimum), texifast is faster (~20%) and has a smaller memory footprint (~20%). For details, see [benchmark](https://github.com/Sped0n/texifast/tree/main/benchmark).
- Supports IOBinding features of ONNXRuntime and optimizes for CUDAExecutionProvider.
- Supports quantized/mixed precision models.
## Installation
You must explicitly specify the required dependencies by choosing an extra:
```
# quotes keep shells such as zsh from expanding the square brackets
pip install "texifast[cpu]"
# or if you want to use CUDAExecutionProvider
pip install "texifast[gpu]"
```
> ⚠️⚠️⚠️
>
> **Do not install with** `pip install texifast` **!!!**
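
After installing one of the extras, it can be useful to confirm which execution providers the installed ONNX Runtime actually exposes. The following is a minimal sketch, not part of texifast: `installed_providers` and `pick_providers` are hypothetical helper names, and the preference order (CUDA first, CPU as fallback) is an assumption. Only `onnxruntime.get_available_providers()` is a real ONNX Runtime call.

```python
# Assumed preference order: try CUDA first, then fall back to CPU.
PREFERRED = ["CUDAExecutionProvider", "CPUExecutionProvider"]


def installed_providers() -> list[str]:
    """Return the providers reported by ONNX Runtime, or [] if it is not installed."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []
    return ort.get_available_providers()


def pick_providers(available: list[str]) -> list[str]:
    """Keep only the preferred providers that are available, in preference order."""
    return [p for p in PREFERRED if p in available]


# Prints an empty list when ONNX Runtime is missing (for example after a bare install).
print(pick_providers(installed_providers()))
```

If `CUDAExecutionProvider` is absent after installing the `gpu` extra, the CUDA setup on the machine is the likely place to look. Whether texifast selects providers in this order internally is not stated in this README.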
## Quickstart
This quickstart uses the [image in the tests folder](https://raw.githubusercontent.com/Sped0n/texifast/main/tests/latex.png), but you can use any image you like.
```python
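from texifast.model import TxfModel
from texifast.pipeline import TxfPipeline

# Load the ONNX encoder and decoder (the quantized variants here).
model = TxfModel(
    encoder_model_path="./encoder_model_quantized.onnx",
    decoder_model_path="./decoder_model_merged_quantized.onnx",
)

# Build the OCR pipeline from the model and its tokenizer file.
texifast = TxfPipeline(model=model, tokenizer="./tokenizer.json")

# Run OCR on the image and print the result.
print(texifast("./latex.png"))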
```
> You can download the quantized ONNX model [here](https://huggingface.co/Spedon/texify-quantized-onnx/tree/main) and the FP16 ONNX model [here](https://huggingface.co/Spedon/texify-fp16-onnx/tree/main).
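
The files used in the quickstart can also be fetched with the standard library alone, which fits the project's minimal-dependency goal. This is a rough sketch under assumptions: it assumes the three files sit at the root of the `Spedon/texify-quantized-onnx` repository under the names used in the quickstart (including `tokenizer.json`), and `resolve_url` and `download_all` are hypothetical helpers, not part of texifast.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: these files live at the repository root under the quickstart's names.
REPO = "Spedon/texify-quantized-onnx"
FILES = [
    "encoder_model_quantized.onnx",
    "decoder_model_merged_quantized.onnx",
    "tokenizer.json",
]


def resolve_url(repo: str, filename: str, revision: str = "main") -> str:
    """Build a Hugging Face direct-download URL for a single file."""
    return f"https://huggingface.co/{repo}/resolve/{revision}/{filename}"


def download_all(dest: Path = Path(".")) -> list[Path]:
    """Download every entry of FILES into dest, skipping files already present."""
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in FILES:
        target = dest / name
        if not target.exists():
            urlretrieve(resolve_url(REPO, name), target)
        paths.append(target)
    return paths
```

Calling `download_all()` from the working directory would place the files where the quickstart paths expect them. If a download fails with an HTTP error, check the actual file names on the repository page linked above.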
## API
The full Python API documentation can be found [here](https://github.com/Sped0n/texifast/tree/main/docs).
## Credits
- https://github.com/VikParuchuri/texify
- https://github.com/MosRat/MixTex-rs
- https://github.com/xenova/transformers.js
- https://onnxruntime.ai/docs/api/python/api_summary.html
- https://github.com/ml-tooling/lazydocs