{"id":17153709,"url":"https://github.com/vinodbaste/paddleocr_rec_dec","last_synced_at":"2025-08-31T22:15:21.537Z","repository":{"id":184876104,"uuid":"672196558","full_name":"vinodbaste/paddleOCR_rec_dec","owner":"vinodbaste","description":"Optical Character Recognition (OCR) is a powerful technology that enables machines to recognize and extract text from images or scanned documents. OCR finds applications in various fields, including document digitization, text extraction from images, and text-based data analysis.","archived":false,"fork":false,"pushed_at":"2023-07-30T19:51:32.000Z","size":51,"stargazers_count":19,"open_issues_count":0,"forks_count":4,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-11T13:38:21.749Z","etag":null,"topics":["detection","image-processing","ocr","paddleocr","recognition"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vinodbaste.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-07-29T08:41:20.000Z","updated_at":"2025-03-30T16:33:15.000Z","dependencies_parsed_at":null,"dependency_job_id":"f700f5fc-e553-4a36-9b26-870dd21ea99a","html_url":"https://github.com/vinodbaste/paddleOCR_rec_dec","commit_stats":null,"previous_names":["vinodbaste/paddleocr_rec_dec"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vinodbaste/paddleOCR_rec_dec","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinodbaste%2FpaddleOCR_rec_dec","tags_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/vinodbaste%2FpaddleOCR_rec_dec/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinodbaste%2FpaddleOCR_rec_dec/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinodbaste%2FpaddleOCR_rec_dec/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vinodbaste","download_url":"https://codeload.github.com/vinodbaste/paddleOCR_rec_dec/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vinodbaste%2FpaddleOCR_rec_dec/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273047261,"owners_count":25036322,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-31T02:00:09.071Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["detection","image-processing","ocr","paddleocr","recognition"],"created_at":"2024-10-14T21:47:05.012Z","updated_at":"2025-08-31T22:15:21.492Z","avatar_url":"https://github.com/vinodbaste.png","language":"Python","funding_links":["https://www.buymeacoffee.com/bastevinod"],"categories":[],"sub_categories":[],"readme":"# paddleOCR for Det and Rec\nOptical Character Recognition (OCR) is a powerful technology that enables machines to recognize and extract text from images or scanned documents. 
OCR finds applications in various fields, including document digitization, text extraction from images, and text-based data analysis. In this article, we will explore how to use PaddleOCR, an advanced OCR toolkit based on deep learning, for text detection and recognition tasks. We will walk through a code snippet that demonstrates the process step-by-step.\n# Prerequisites\nBefore we dive into the code, make sure the following prerequisites are installed on your machine:\n- Python (3.6 or higher)\n- PaddleOCR library\n- Other necessary dependencies (e.g., NumPy, pandas)\n\nAll the required packages and their versions are listed in the requirements.txt file. Install them with:\n```shell\npip install -r requirements.txt\n```\n# Getting started\nThe provided `main.py` contains an example usage of the RecMain class, which performs text recognition (OCR) on a folder of images and generates an output Excel file with evaluation metrics:\n```python\n# Example usage: Replace with your image folder path, label file, and desired output file name\nimage_folder = \"path/to/image_folder\"\nlabel_file = \"path/to/txt_file\"\noutput_file = \"raw_output.xlsx\"\n\nif __name__ == \"__main__\":\n    RecMain(image_folder=image_folder, rec_file=label_file, output_file=output_file).run_rec()\n```\n\nThe same can be done for detection with the DecMain class:\n```python\n# Detection (DecMain)\n# Example usage: Replace with your image folder path, label file, and desired output file name\nimage_folder = \"path/to/image_folder\"\nlabel_file = \"path/to/txt_file\"\noutput_file = \"raw_output.xlsx\"\n\nif __name__ == \"__main__\":\n    DecMain(image_folder_path=image_folder, label_file_path=label_file, output_file=output_file).run_dec()\n```\n\n# 1. 
Text Detection\nThe DecMain class is designed for OCR evaluation against ground truth data. It uses PaddleOCR to extract text from images and then calculates metrics like precision, recall, and Character Error Rate (CER) to evaluate the performance of the OCR system.\n```python\nclass DecMain:\n    def __init__(self, image_folder_path, label_file_path, output_file):\n        self.image_folder_path = image_folder_path\n        self.label_file_path = label_file_path\n        self.output_file = output_file\n\n    def run_dec(self):\n        # Check and update the ground truth file\n        CheckAndUpdateGroundTruth(self.label_file_path).check_and_update_ground_truth_file()\n\n        # Run detection and recognition on every image and collect the results in a DataFrame\n        df = OcrToDf(image_folder=self.image_folder_path, label_file=self.label_file_path, det=True, rec=True, cls=False).ocr_to_df()\n\n        # Load the ground truth annotations from the label file\n        ground_truth_data = ReadGroundTruthFile(self.label_file_path).read_ground_truth_file()\n\n        # Get the extracted text as a list of dictionaries (representing the OCR results)\n        ocr_results = df.to_dict(orient=\"records\")\n\n        # Calculate precision and recall, along with the total number of samples\n        precision, recall, total_samples = CalculateMetrics(ground_truth_data, ocr_results).calculate_precision_recall()\n\n        # Write the results and metrics to the output file\n        CreateSheet(dataframe=df, precision=precision, recall=recall, total_samples=total_samples,\n                    file_name=self.output_file).create_sheet()\n```\n# Note: Format of the DET Ground Truth Label File\nTo perform OCR evaluation using the DecMain class, it's crucial to format the ground truth label file correctly. Each line of the file represents one image's OCR ground truth: the filename of the image, followed by the annotations for that image as a JSON list with one object per text region, as shown below:\n```\nimage_name.jpg [{\"transcription\": \"215mm 18\", \"points\": [[199, 6], [357, 6], [357, 33], [199, 33]], \"difficult\": False, \"key_cls\": \"digits\"}, {\"transcription\": \"XZE SA\", \"points\": [[15, 6], [140, 6], [140, 
36], [15, 36]], \"difficult\": False, \"key_cls\": \"text\"}]\n```\nEach object should have the following keys:\n- `transcription`: The ground truth text of the text region.\n- `points`: A list of four points representing the bounding box coordinates of the text region in the image.\n- `difficult`: A boolean value indicating whether the text region is difficult to recognize.\n- `key_cls`: The class label of the OCR result, e.g., \"digits\" or \"text\".\n\nMake sure to follow this format while creating the ground truth label file for accurate OCR evaluation.\n# 2. Text Recognition\nThe RecMain class runs text recognition (OCR) with a pre-trained OCR model on a folder of images and generates an evaluation Excel sheet.\n```python\nclass RecMain:\n    def __init__(self, image_folder, rec_file, output_file):\n        self.image_folder = image_folder\n        self.rec_file = rec_file\n        self.output_file = output_file\n\n    def run_rec(self):\n        image_paths = GetImagePathsFromFolder(self.image_folder, self.rec_file). 
\\\n            get_image_paths_from_folder()\n\n        # Load the pre-trained recognition model\n        ocr_model = LoadRecModel().load_model()\n\n        # Run recognition on every image\n        results = ProcessImages(ocr=ocr_model, image_paths=image_paths).process_images()\n\n        # Parse the ground truth text file into a dictionary\n        ground_truth_data = ConvertTextToDict(self.rec_file).convert_txt_to_dict()\n\n        # Compare the predictions with the ground truth\n        (model_predictions, ground_truth_texts, image_names, precision, recall,\n         overall_model_precision, overall_model_recall, cer_data_list) = EvaluateRecModel(\n            results, ground_truth_data).evaluate_model()\n\n        # Create Excel sheet\n        CreateMetricExcel(image_names, model_predictions, ground_truth_texts,\n                          precision, recall, cer_data_list, overall_model_precision, overall_model_recall,\n                          self.output_file).create_excel_sheet()\n```\n# Note: Format of the Ground Truth Text File\nTo perform OCR evaluation using the RecMain class, it's essential to format the ground truth (GT) text file correctly. Each line of the file represents one image's GT text: the filename of the image, followed by a tab character (\\t), and then the GT text for that image:\n```\nimage_name.jpg text\n```\nEnsure that the GT text file contains an entry for every image in the image folder passed to the RecMain class. The GT text should match the actual text content present in the images. 
This format is necessary for accurate evaluation of the OCR model's performance.\n\n**If you find this library useful, please consider starring this repository from the top of this page.**\n[![](https://i.imgur.com/oSLuE0e.png)](#)\n\n# Support my work\n\u003ca href=\"https://www.buymeacoffee.com/bastevinod\" target=\"_blank\"\u003e\u003cimg src=\"https://cdn.buymeacoffee.com/buttons/default-orange.png\" alt=\"Buy Me A Coffee\" height=\"41\" width=\"174\"\u003e\u003c/a\u003e\n\n# License\n```\nCopyright [2023] [Vinod Baste]\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvinodbaste%2Fpaddleocr_rec_dec","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvinodbaste%2Fpaddleocr_rec_dec","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvinodbaste%2Fpaddleocr_rec_dec/lists"}