https://github.com/geo-y20/standard-ocr-

Explore the Standard OCR Project: a deep learning-based character recognition system leveraging advanced computer vision techniques. Detect characters in images using ResNet, Xception, Inception, and MobileNet models. Train, evaluate, and contribute to this cutting-edge technology.



Host: GitHub
URL: https://github.com/geo-y20/standard-ocr-
Owner: Geo-y20
Created: 2023-12-31T20:02:54.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2023-12-31T20:41:14.000Z (about 1 year ago)
Last Synced: 2024-11-10T21:35:52.630Z (3 months ago)
Topics: css, deep-learning, deep-neural-networks, flask, html, javascript, jupyter-notebook, keras, ocr-recognition, ocr-text-reader, python, resnet, tensorflow, xception
Language: Jupyter Notebook
Size: 4.32 MB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md

README

Standard OCR Project

Introduction

This project aims to build an accurate and efficient model for character recognition in images. The dataset used in this project can be found here. The dataset consists of two sections, Data and Data2, each with training and testing directories that contain 36 subdirectories, one per character class. The training data contains 573 images per class, while the testing data includes approximately 88 images per class. Knowing this layout is important for loading, organizing, and analyzing the data correctly.
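The per-class counts quoted above can be checked with a short script. This is a minimal sketch, assuming each split directory holds one subdirectory per class; the file extensions and the example path are assumptions, not taken from the repository.

```python
from pathlib import Path

# Assumed image file types; adjust to match the actual dataset.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for one split directory.

    Every immediate subdirectory of split_dir is treated as a class.
    """
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files next to the class folders
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `count_images_per_class("data/training_data")` (a hypothetical path) should return 36 entries if the layout matches the description, which makes class imbalance or missing folders easy to spot.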

Problem Statement

The task is a computer vision problem: detecting the characters in input images using image processing, image analysis, and modern deep learning techniques. The project treats it as a general computer vision task rather than a conventional Optical Character Recognition (OCR) pipeline, and uses models such as ResNet, Xception, Inception, and MobileNet to process the dataset and make predictions.

GitHub Repository

The code for this project is hosted on GitHub. You can clone the repository using the following command:

gh repo clone Geo-y20/Standard-OCR-

Solution Framework

Set Up: Importing the necessary modules and defining hyperparameters and constants.

Data Loading: Loading the dataset into memory for processing.

Data Processing: Converting raw data into model-ready input using techniques such as data augmentation, normalization, and image resizing.

Data Visualization: Inspecting the dataset for insights and potential issues.

Backbone Comparison: Comparing different pre-trained backbones to identify the best performer.

Model Building: Constructing a model architecture using selected backbones.

Model Predictions: Evaluating model performance on unseen data, analyzing predictions, and identifying areas for improvement.
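The backbone comparison and model building steps above could look roughly like the following Keras sketch. It is not the notebook's actual code: the specific backbone variants (ResNet50, InceptionV3, MobileNetV2), the 224x224 input size, the frozen backbone, and the integer-label loss are all assumptions made for illustration.

```python
# Assumed Keras application classes for the four backbone families named
# in this README; the README does not say which variants were used.
BACKBONES = {
    "resnet": "ResNet50",
    "xception": "Xception",
    "inception": "InceptionV3",
    "mobilenet": "MobileNetV2",
}
NUM_CLASSES = 36          # one output per character class in the dataset
IMAGE_SIZE = (224, 224)   # assumed input resolution


def build_classifier(backbone_key, weights="imagenet"):
    """Build a frozen pre-trained backbone with a small classification head."""
    # Imported lazily so the configuration above can be read without TensorFlow.
    import tensorflow as tf

    backbone_cls = getattr(tf.keras.applications, BACKBONES[backbone_key])
    backbone = backbone_cls(
        include_top=False,              # drop the ImageNet classifier
        weights=weights,
        input_shape=IMAGE_SIZE + (3,),
    )
    backbone.trainable = False          # train only the new head at first

    model = tf.keras.Sequential([
        backbone,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    return model
```

To compare backbones, one option is to call `build_classifier(key)` for each key in `BACKBONES`, train each model for the same small number of epochs, and keep the one with the best validation accuracy. Note that each Keras backbone expects its own input preprocessing, which this sketch leaves out.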

Folder Structure

app.py: Contains Flask web application code for image prediction.

templates/: Directory for HTML templates.

static/: Directory for static files (CSS, JS, images).

Getting Started

Ensure your Python environment is compatible.

Run pip install -r requirements.txt to install the necessary libraries and dependencies.

Train and save your model using the provided dataset.

Update app.py with the path to your trained model.

Run the Flask application (python app.py) and navigate to http://localhost:5000 in your browser.
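For orientation, a prediction endpoint of the kind described in these steps might be structured as below. This is a hypothetical sketch, not the contents of the repository's app.py: the `/predict` route, the `image` form field, the label order (digits 0-9 followed by A-Z), the 224x224 input size, and the division by 255 are all assumptions that must be matched to how the model was actually trained.

```python
import string

# Assumption: the 36 classes are the digits 0-9 followed by the letters A-Z.
CLASS_LABELS = list(string.digits + string.ascii_uppercase)
IMAGE_SIZE = (224, 224)  # assumed; must match the size used during training


def top_label(probabilities, labels=CLASS_LABELS):
    """Return the label whose predicted probability is highest."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best_index]


def create_app(model_path):
    """Create a Flask app that serves predictions from a saved Keras model."""
    # Imported lazily so the helpers above work without Flask or TensorFlow.
    import io

    import numpy as np
    import tensorflow as tf
    from flask import Flask, jsonify, request
    from PIL import Image

    app = Flask(__name__)
    model = tf.keras.models.load_model(model_path)

    @app.route("/predict", methods=["POST"])  # hypothetical route name
    def predict():
        upload = request.files.get("image")   # hypothetical form field name
        if upload is None:
            return jsonify(error="no image uploaded"), 400
        image = Image.open(io.BytesIO(upload.read())).convert("RGB")
        image = image.resize(IMAGE_SIZE)
        # Add a batch dimension and scale pixels to [0, 1] (assumed scaling).
        batch = np.asarray(image, dtype="float32")[np.newaxis, ...] / 255.0
        probabilities = model.predict(batch)[0]
        return jsonify(prediction=top_label(probabilities.tolist()))

    return app
```

With a trained model saved locally, `create_app("model.h5").run(port=5000)` would start the server, where `model.h5` stands in for the path to your own model file.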

Usage

Access the application through the browser.

Upload an image containing characters.

Get predictions for the characters present in the image.

Download the Model

To download the H5 model file, click here.

Sample Images

Sample 1
Sample 2