{"id":38049168,"url":"https://github.com/monchin/tablers","last_synced_at":"2026-03-01T08:04:34.117Z","repository":{"id":331341319,"uuid":"1112852394","full_name":"monchin/tablers","owner":"monchin","description":"A blazingly fast PDF table extraction library with python API powered by Rust","archived":false,"fork":false,"pushed_at":"2026-02-23T13:43:59.000Z","size":13783,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-02-23T20:50:41.571Z","etag":null,"topics":["pdf","python","rust","table-extraction"],"latest_commit_sha":null,"homepage":"https://monchin.github.io/tablers/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/monchin.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-12-09T07:27:48.000Z","updated_at":"2026-02-23T12:32:09.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/monchin/tablers","commit_stats":null,"previous_names":["monchin/tablers"],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/monchin/tablers","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monchin%2Ftablers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monchin%2Ftablers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monchin%2Ftablers/releases","manifests_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/repositories/monchin%2Ftablers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/monchin","download_url":"https://codeload.github.com/monchin/tablers/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monchin%2Ftablers/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29964203,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-01T06:55:38.174Z","status":"ssl_error","status_checked_at":"2026-03-01T06:53:04.810Z","response_time":124,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pdf","python","rust","table-extraction"],"created_at":"2026-01-16T20:03:00.326Z","updated_at":"2026-03-01T08:04:34.109Z","avatar_url":"https://github.com/monchin.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Rust-000000?style=for-the-badge\u0026logo=rust\u0026logoColor=white\" alt=\"Rust\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Python-3776AB?style=for-the-badge\u0026logo=python\u0026logoColor=white\" alt=\"Python\"\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003e⚡ Tablers\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eA blazingly fast PDF table 
extraction library with python API powered by Rust\u003c/strong\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/monchin/tablers/blob/main/LICENSE\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/license-MIT-blue.svg\" alt=\"License: MIT\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/tablers/\"\u003e\n    \u003cimg src=\"https://img.shields.io/pypi/v/tablers.svg\" alt=\"PyPI version\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/tablers/\"\u003e\n    \u003cimg src=\"https://img.shields.io/pypi/pyversions/tablers.svg\" alt=\"Python versions\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://pdm-project.org\"\u003e\n    \u003cimg src=\"https://img.shields.io/endpoint?url=https%3A%2F%2Fcdn.jsdelivr.net%2Fgh%2Fpdm-project%2F.github%2Fbadge.json\" alt=\"pdm-managed\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n## Features\n\n- 🚀 **Blazingly Fast** - Core algorithms written in Rust for maximum performance\n- 🐍 **Pythonic API** - Easy-to-use Python interface with full type hints\n- 📄 **Edge Detection** - Accurate table detection using line and rectangle edge analysis\n- 📝 **Text Extraction** - Extract text content from table cells with configurable settings\n- 📤 **Multiple Export Formats** - Export tables to CSV, Markdown, and HTML\n- 🔐 **Encrypted PDFs** - Support for password-protected PDF documents\n- 💾 **Memory Efficient** - Lazy page loading for handling large PDF files\n- 🖥️ **Cross-Platform** - Works on Windows, Linux, and macOS\n\n## Why Tablers?\n\nThis project draws significant inspiration from the table extraction modules of [pdfplumber](https://github.com/jsvine/pdfplumber) and [PyMuPDF](https://github.com/pymupdf/PyMuPDF). 
Compared to `pdfplumber` and `PyMuPDF`, `tablers` has the following advantages:\n\n- **High Performance**: Utilizes Rust for high-performance PDF processing\n- **More Configurable**: Supports customizable table filter settings such as `min_rows`, `min_columns`, and `include_single_cell` (see, e.g., [this issue](https://github.com/pymupdf/PyMuPDF/issues/3987))\n- **Clean Python Dependencies**: No external Python dependencies required\n\n## Benchmark\n\nPerformance comparison of tablers, PyMuPDF, and pdfplumber for PDF table extraction:\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/monchin/tablers-benchmark/master/table_extraction_benchmark.png\" alt=\"Table Extraction Benchmark\"\u003e\n\u003c/p\u003e\n\nFor more details, please refer to the [tablers-benchmark](https://github.com/monchin/tablers-benchmark) repository.\n\n## Note\n\nThis solution is primarily designed for text-based PDFs and does not support scanned PDFs.\n\n## Installation\n\n```bash\npip install tablers\n```\n\n## Quick Start\n\n### Basic Table Extraction\n\n```python\nfrom tablers import Document, find_tables\n\n# Open a PDF document\ndoc = Document(\"example.pdf\")\n\n# Extract tables from each page\nfor page in doc.pages():\n    tables = find_tables(page, extract_text=True)\n    for table in tables:\n        print(f\"Found table with {len(table.cells)} cells\")\n        for cell in table.cells:\n            print(f\"  Cell: {cell.text} at {cell.bbox}\")\n\ndoc.close()\n```\n\n### Using Context Manager\n\n```python\nfrom tablers import Document, find_tables\n\nwith Document(\"example.pdf\") as doc:\n    page = doc.get_page(0)  # Get first page\n    tables = find_tables(page, extract_text=True)\n\n    for table in tables:\n        print(f\"Table bbox: {table.bbox}\")\n```\n\nFor more advanced usage, please refer to the [documentation](https://monchin.github.io/tablers/).\n\n## Requirements\n\n- Python \u003e= 3.10\n- Supported platforms: Windows (x64), Linux (x64) with 
glibc \u003e= 2.28, macOS (ARM64)\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](https://github.com/monchin/tablers/blob/master/LICENSE) file for details.\n\n## Acknowledgments\n\n- [pdfplumber](https://github.com/jsvine/pdfplumber) - PDF parsing library\n- [PyMuPDF](https://github.com/pymupdf/PyMuPDF) - PDF parsing library\n- [pdfium-render](https://github.com/ajrcarey/pdfium-render) - Rust bindings for PDFium\n- [PyO3](https://github.com/PyO3/pyo3) - Rust bindings for Python\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmonchin%2Ftablers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmonchin%2Ftablers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmonchin%2Ftablers/lists"}