https://github.com/abhirooptalasila/rc-text-detection
Detect text on a processed image of a RC using AWS Rekognition
- Host: GitHub
- URL: https://github.com/abhirooptalasila/rc-text-detection
- Owner: abhirooptalasila
- Created: 2020-05-01T10:29:38.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-05-01T10:30:43.000Z (about 5 years ago)
- Last Synced: 2025-01-20T06:42:09.012Z (5 months ago)
- Topics: aws, clahe, ocr, regex, rekognition, text-detection, text-processing
- Language: Jupyter Notebook
- Size: 243 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Text Detection on a Registration Certificate
A registration certificate (RC) is an official document stating that your vehicle is registered with the Indian government. I've used [AWS Rekognition](https://aws.amazon.com/rekognition/) to detect text after performing some preprocessing steps.
The task is to recognize the following fields from a picture of an RC:
- License plate number or Regn number
- VIN or Chassis number (typically 17 characters long)
- Name
- Engine number
- Registration date
- Mfg. date

### Unprocessed and Processed Picture
![]()
### Output
```python
{'license_plate': 'DL2CAU7997',
'reg_date': '02/05/2015',
'chass_num': 'MA3FLEB1S00309631',
'eng_num': 'D13A2554860',
'name': 'MANJEET SINGH'}
```

---
[```pre_process.py```](pre_process.py) contains the script to process images: I first convert the image into the LAB color space and then apply the **Contrast Limited Adaptive Histogram Equalization (CLAHE)** method to it.

The LAB color space defines colors differently from RGB. Whereas RGB defines a color as a combination of red, green, and blue intensities, LAB uses three channels: Lightness (L), the A channel (the green-red axis), and the B channel (the blue-yellow axis). The name LAB is simply these three channels abbreviated.
Adaptive histogram equalization (AHE) is a computer image processing technique used to improve contrast in images. It differs from ordinary histogram equalization in the respect that the adaptive method computes several histograms, each corresponding to a distinct section of the image, and uses them to redistribute the lightness values of the image. It is therefore suitable for improving the local contrast and enhancing the definitions of edges in each region of an image.
However, **AHE has a tendency to overamplify noise** in relatively homogeneous regions of an image. A variant of adaptive histogram equalization called contrast limited adaptive histogram equalization **(CLAHE) prevents this by limiting the amplification**.
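
To make the effect of the clip limit concrete, the sketch below equalizes a single tile in plain Python, with and without clipping the histogram. It is a simplification under stated assumptions: real CLAHE also interpolates between neighboring tiles, and the name `equalize_tile`, the `clip_limit` value, and the toy tile are made up for the example.

```python
def equalize_tile(pixels, clip_limit=None, levels=256):
    """Histogram-equalize one tile of integer pixel values.

    If clip_limit is given, histogram counts above it are cut off and the
    excess is spread evenly over all bins before the mapping is built.
    """
    hist = [0.0] * levels
    for value in pixels:
        hist[value] += 1

    if clip_limit is not None:
        excess = sum(max(0.0, count - clip_limit) for count in hist)
        bonus = excess / levels  # uniform redistribution of clipped counts
        hist = [min(count, clip_limit) + bonus for count in hist]

    # The cumulative histogram becomes the mapping to new intensities.
    cdf = []
    running = 0.0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    return [round((levels - 1) * cdf[value] / total) for value in pixels]


# A nearly homogeneous tile: two gray levels that differ by 1 (noise).
tile = [100] * 50 + [101] * 50
plain = equalize_tile(tile)
limited = equalize_tile(tile, clip_limit=4)
print("gap without clipping:", max(plain) - min(plain))
print("gap with clipping:   ", max(limited) - min(limited))
```

Without clipping, the one-level difference is stretched across a large part of the intensity range. With the clip limit, the gap stays much smaller, which is the noise-limiting behavior described above.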
These images are then uploaded to AWS.
```bash
aws lambda invoke --function-name "mynewtext" --log-type Tail --payload file://~/Downloads/RC/data.json ~/Downloads/RC/output.json
```

[```Get Details.ipynb```](Get%20Details.ipynb) uses **Regex** to extract the required fields from the JSON file and stores them in a dictionary as key-value pairs.
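
The regex step could look roughly like the sketch below. It is not the notebook's code. It assumes the JSON has the shape of a Rekognition `DetectText` response (a `TextDetections` list whose items carry `DetectedText` and `Type`). The patterns, the printed labels such as `E. NO` and `NAME`, and the helper names `detected_lines` and `extract_fields` are hypothetical and chosen only to reproduce the sample output shown earlier.

```python
import re

# Illustrative patterns only; the notebook's actual expressions may differ.
PATTERNS = {
    # Two-letter state code, district digits, series letters, four digits.
    "license_plate": re.compile(r"\b([A-Z]{2}\d{1,2}[A-Z]{1,3}\d{4})\b"),
    # First dd/mm/yyyy date found in the text.
    "reg_date": re.compile(r"\b(\d{2}/\d{2}/\d{4})\b"),
    # 17-character VIN; the letters I, O and Q are not used in VINs.
    "chass_num": re.compile(r"\b([A-HJ-NPR-Z0-9]{17})\b"),
    # Hypothetical printed labels "E. NO" and "NAME".
    "eng_num": re.compile(r"\bE\.?\s*NO\.?\s*:?\s*([A-Z0-9]+)"),
    "name": re.compile(r"\bNAME\s*:?\s*([A-Z][A-Z ]*[A-Z])"),
}


def detected_lines(response):
    """Return the text of every LINE-level detection in the response."""
    return [d["DetectedText"] for d in response.get("TextDetections", [])
            if d.get("Type") == "LINE"]


def extract_fields(lines):
    """Search the joined text with each pattern and keep the first match."""
    text = "\n".join(lines).upper()
    fields = {}
    for key, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[key] = match.group(1)
    return fields


# Made-up response fragment in the shape of a DetectText result.
sample_response = {"TextDetections": [
    {"DetectedText": "REGN. NO: DL2CAU7997", "Type": "LINE"},
    {"DetectedText": "REGN. DT: 02/05/2015", "Type": "LINE"},
    {"DetectedText": "CH. NO: MA3FLEB1S00309631", "Type": "LINE"},
    {"DetectedText": "E. NO: D13A2554860", "Type": "LINE"},
    {"DetectedText": "NAME: MANJEET SINGH", "Type": "LINE"},
    {"DetectedText": "NAME:", "Type": "WORD"},
]}

print(extract_fields(detected_lines(sample_response)))
```

Filtering on `Type == "LINE"` avoids matching the same text twice, since Rekognition reports both whole lines and their individual words. A real RC may print several dates, so a production version would likely need to anchor the date pattern to its label as well.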