https://github.com/sivakumar-mahalingam/fastmrz
⚡Extracting the Machine Readable Zone (MRZ) from passport or any document images
identity-document mrz mrz-scanner ocr opencv opencv-python passport passport-mrz python tesseract tesseract-ocr text-recognition
- Host: GitHub
- URL: https://github.com/sivakumar-mahalingam/fastmrz
- Owner: sivakumar-mahalingam
- License: agpl-3.0
- Created: 2024-03-17T11:13:21.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-07T14:38:57.000Z (4 months ago)
- Last Synced: 2025-03-29T19:01:26.370Z (3 months ago)
- Topics: identity-document, mrz, mrz-scanner, ocr, opencv, opencv-python, passport, passport-mrz, python, tesseract, tesseract-ocr, text-recognition
- Language: Python
- Homepage: https://pypi.org/project/fastmrz
- Size: 67.7 MB
- Stars: 59
- Watchers: 4
- Forks: 13
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
# Fast MRZ
FastMRZ is an open-source Python package that extracts the Machine Readable Zone (MRZ) from passports and other documents. It accepts several input types: an image file, a Base64 string, an MRZ string, or a NumPy array.

[Features](#features) •
[Built With](#built-with) •
[Prerequisites](#prerequisites) •
[Installation](#installation) •
[Example](#example) •
[Wiki](#wiki) •
[ToDo](#todo) •
[Contributing](#contributing)

## ✨Features
- 👁️Detects and extracts the MRZ region from document images
- 🔍Uses contour detection to accurately identify the MRZ area
- 🎨Uses custom-trained models in ONNX format
- 🆗Includes check-digit logic for data validation
- 📤Outputs the extracted MRZ as text or JSON

## 🛠️Built With



## 🚨Prerequisites
- Install the [Tesseract OCR](https://tesseract-ocr.github.io/tessdoc/Installation.html) engine, add its executable to your `PATH` variable, and make sure `tesseract` can be run from the command line.

## ⚙️Installation
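Before installing the package, it can help to confirm that the `tesseract` executable is actually reachable from `PATH`. The following is a minimal sketch using only the Python standard library; `find_tesseract` is a hypothetical helper name and is not part of FastMRZ.

```python
import shutil
from typing import Optional


def find_tesseract(executable: str = "tesseract") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH")
    else:
        print(f"tesseract found at {path}")
```

If the executable is not found, the Example section shows a `tesseract_path` argument that can be passed to `FastMRZ` instead of changing `PATH`.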
1. Install `fastmrz`:
```bash
pip install fastmrz
```
If you prefer conda, the following creates and activates an environment named `fastmrz` with Tesseract installed from conda-forge:
```bash
conda create -n fastmrz tesseract -c conda-forge
conda activate fastmrz
```
2. Copy the `mrz.traineddata` file from the `tessdata` folder of the [repository](https://github.com/sivakumar-mahalingam/fastmrz/raw/main/tessdata/mrz.traineddata) into the `tessdata` folder of the Tesseract installation on **YOUR MACHINE**.
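
Step 2 can also be scripted. The sketch below assumes you have already downloaded `mrz.traineddata` and know where your Tesseract `tessdata` folder is; that location differs between operating systems and installers, so both paths in the usage comment are placeholders. `install_traineddata` is a hypothetical helper, not a FastMRZ function.

```python
import shutil
from pathlib import Path


def install_traineddata(source: Path, tessdata_dir: Path) -> Path:
    """Copy a .traineddata file into a Tesseract tessdata directory."""
    if not source.is_file():
        raise FileNotFoundError(f"traineddata file not found: {source}")
    if not tessdata_dir.is_dir():
        raise NotADirectoryError(f"tessdata directory not found: {tessdata_dir}")
    # The copied file keeps its original name inside the tessdata directory.
    return Path(shutil.copy2(source, tessdata_dir / source.name))


# Usage (placeholder paths, adjust to your machine):
# install_traineddata(Path("mrz.traineddata"), Path("/path/to/tessdata"))
```

Writing into the `tessdata` folder may require administrator rights, depending on how Tesseract was installed.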
## 💡Example
```Python
from fastmrz import FastMRZ
import json

fast_mrz = FastMRZ()
# Pass the path of the installed Tesseract executable in case it is not on the PATH variable
# fast_mrz = FastMRZ(tesseract_path=r'/opt/homebrew/Cellar/tesseract/5.3.4_1/bin/tesseract') # Default path in Mac
# fast_mrz = FastMRZ(tesseract_path=r'C:\Program Files\Tesseract-OCR\tesseract.exe') # Default path in Windows

# Parse the MRZ and print the result as JSON
passport_mrz = fast_mrz.get_details("../data/passport_uk.jpg", include_checkdigit=False)
print("JSON:")
print(json.dumps(passport_mrz, indent=4))

print("\n")

# Skip parsing and print the MRZ as text
passport_mrz = fast_mrz.get_details("../data/passport_uk.jpg", ignore_parse=True)
print("TEXT:")
print(passport_mrz)
```

**OUTPUT:**
```Console
JSON:
{
    "mrz_type": "TD3",
    "document_code": "P",
    "issuer_code": "GBR",
    "surname": "PUDARSAN",
    "given_name": "HENERT",
    "document_number": "707797979",
    "document_number_checkdigit": "2",
    "nationality_code": "GBR",
    "birth_date": "1995-05-20",
    "sex": "M",
    "expiry_date": "2017-04-22",
    "optional_data": "",
    "mrz_text": "P
```

## MRZ Types & Format
The MRZ format is strictly regulated and must comply with [Doc 9303](https://www.icao.int/publications/pages/publication.aspx?docnum=9303), *Machine Readable Travel Documents*, published by the International Civil Aviation Organization (ICAO).
There are currently several types of ICAO standard machine-readable zones, which vary in the number of lines and characters in each line:
- TD-1 (e.g. citizen’s identification card, EU ID card, US Green Card): 3 lines of 30 characters each.
- TD-2 (e.g. Romania ID, old type of German ID) and MRV-B (machine-readable visas type B, e.g. Schengen visa): 2 lines of 36 characters each.
- TD-3 (all international passports, also known as MRP) and MRV-A (machine-readable visas type A, issued by the USA, Japan, China, and others): 2 lines of 44 characters each.

Now, based on the example of a national passport, let us take a closer look at the MRZ composition.
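
The shapes listed above, together with the check digits that Doc 9303 defines (weights 7, 3, 1 repeating, sum taken modulo 10, with `A`-`Z` valued 10-35 and `<` valued 0), can be sketched in a few lines of Python. This is an illustration only and not FastMRZ's own implementation: `mrz_type` and `check_digit` are hypothetical helper names, the returned labels are chosen for this sketch, and treating a leading `V` as the visa marker is a simplification.

```python
from typing import List

# (number of lines, characters per line) -> format name
MRZ_SHAPES = {
    (3, 30): "TD1",
    (2, 36): "TD2",
    (2, 44): "TD3",
}


def mrz_type(lines: List[str]) -> str:
    """Guess the MRZ format from the number of lines and their length."""
    lengths = {len(line) for line in lines}
    if len(lengths) != 1:
        raise ValueError("all MRZ lines must have the same length")
    shape = (len(lines), lengths.pop())
    if shape not in MRZ_SHAPES:
        raise ValueError(f"unrecognised MRZ shape: {shape}")
    name = MRZ_SHAPES[shape]
    # Visas share the TD2/TD3 shapes; their document code starts with "V".
    if lines[0].startswith("V"):
        name = {"TD2": "MRVB", "TD3": "MRVA"}.get(name, name)
    return name


def check_digit(field: str) -> int:
    """Compute a check digit: weights 7, 3, 1 repeating, sum modulo 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10
        elif ch == "<":
            value = 0
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10
```

For the document number in the example output, `check_digit("707797979")` returns `2`, which matches the `document_number_checkdigit` field shown there.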


## ✅ToDo
- [x] Include mrva and mrvb documents
- [x] Add wiki page
- [x] Support numpy array as input
- [x] Support mrz text as input
- [x] Support base64 as input
- [ ] Support pdf as input
- [x] Function to return mrz text as output
- [ ] Bulk process
- [ ] Add function parameter - Image Enhancement Model
- [ ] Add function parameter - Text Image Enhancement Model
- [ ] Train Tesseract model with additional data
- [x] Add function parameter - include_checkdigit
- [ ] Add function - get_mrz_image
- [x] Add function - validate_mrz
- [ ] Add function - generate_mrz
- [ ] Extract face image
- [ ] Add documentation page
- [ ] Add all checkdigit status

## 🤝 Contributing
Contributions are welcome! Here's how you can help:
1. Fork the repository
2. Create a new branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Commit your changes (`git commit -m 'feat: add amazing feature'`)
5. Push to the branch (`git push origin feature/amazing-feature`)
6. Open a Pull Request

## ⚖️License
Distributed under the AGPL-3.0 License. See `LICENSE` for more information.
## 🙏Show your support
Give a ⭐️ if this project helped you!
## 🚀Who's Using It?
We’d love to know who’s using **fastmrz**! If your company or project uses this package, feel free to share your story. You can:
- Open an issue with the title "We are using fastmrz!" and include your project or company name.
Thank you for supporting **fastmrz**! 🤟