https://github.com/inquilabee/tablecv
TableCV: Table extraction from images made easy.
- Host: GitHub
- URL: https://github.com/inquilabee/tablecv
- Owner: inquilabee
- License: MIT
- Created: 2023-09-14T20:52:58.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-09-14T21:38:37.000Z (over 1 year ago)
- Last Synced: 2024-03-21T11:48:06.320Z (10 months ago)
- Topics: opencv, opencv-python, opencv-table, opencv-table-extraction, python, table, table-extract, table-extract-python, table-extraction
- Language: Python
- Homepage:
- Size: 107 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# TableCV
**TableCV** is a Python package for extracting tables from images. It offers two approaches: a built-in pipeline based on PaddleOCR, and a function that accepts OCR results produced by any tool you choose.
## Installation
You can easily install **TableCV** using pip:
```bash
pip install tablecv
```
## Usage
### Approach 1 (using PaddleOCR)
**TableCV** offers a straightforward method to extract tables using PaddleOCR. This approach returns a pandas DataFrame object:
```python
from tablecv import extract_table

# Replace "your_image.png" with the path to your image
print(extract_table(image_path="your_image.png"))
```
### Approach 2 (OCR with Your Preferred Tool)
If you prefer using a different OCR tool like EasyOCR, KerasOCR, or any other OCR solution, you can still use **TableCV**. First, perform OCR on your image using your chosen tool. The OCR results should be structured as a list of tuples, each containing a bounding box and corresponding text:
```python
# List of tuples: (bounding box as (x, y, w, h), text)
ocr_results = [
    ((1, 2, 3, 4), "a"),
    ((4, 5, 6, 7), "b"),
    # Add more tuples as needed
]
```
After obtaining your OCR results, you can extract tables from them using **TableCV**:
```python
from tablecv import extract_table_from_ocr

# Replace "ocr_results" with your OCR results list
print(extract_table_from_ocr(ocr_results))
```
With these two approaches, **TableCV** provides flexibility for table extraction from images, whether you prefer using PaddleOCR or another OCR tool of your choice.
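For Approach 2, many OCR tools report each detection as a four-corner polygon plus text and a confidence score, rather than as an `(x, y, w, h)` box. The sketch below shows one way to reshape such output into the list of `(bounding box, text)` tuples described above. The helper names `polygon_to_xywh` and `to_tablecv_format` are hypothetical and not part of **TableCV**. The sample detections are hand-written, and the sketch assumes that `(x, y)` refers to the top-left corner of the box.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Sequence[float]
Box = Tuple[int, int, int, int]


def polygon_to_xywh(points: Iterable[Point]) -> Box:
    """Convert corner points [(x, y), ...] to an axis-aligned (x, y, w, h) box.

    Assumption: (x, y) is the top-left corner of the box.
    """
    xs, ys = zip(*points)
    x_min, y_min = min(xs), min(ys)
    return (int(x_min), int(y_min), int(max(xs) - x_min), int(max(ys) - y_min))


def to_tablecv_format(detections) -> List[Tuple[Box, str]]:
    """Map (polygon, text, confidence, ...) detections to (bbox, text) tuples."""
    return [(polygon_to_xywh(polygon), text) for polygon, text, *_ in detections]


# Hand-written sample detections (hypothetical data, no OCR library required)
sample_detections = [
    ([[10, 20], [60, 20], [60, 40], [10, 40]], "Name", 0.98),
    ([[80, 20], [130, 20], [130, 40], [80, 40]], "Age", 0.95),
]

ocr_results = to_tablecv_format(sample_detections)
print(ocr_results)
```

The resulting `ocr_results` list has the same shape as the example above, so it could then be passed to `extract_table_from_ocr`.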