{"id":13469262,"url":"https://github.com/atlanhq/camelot","last_synced_at":"2026-01-14T19:59:43.768Z","repository":{"id":41176322,"uuid":"61431113","full_name":"atlanhq/camelot","owner":"atlanhq","description":"Camelot: PDF Table Extraction for Humans","archived":true,"fork":false,"pushed_at":"2023-01-05T15:25:42.000Z","size":16826,"stargazers_count":3712,"open_issues_count":118,"forks_count":361,"subscribers_count":78,"default_branch":"master","last_synced_at":"2025-12-19T16:38:42.501Z","etag":null,"topics":["extract","for-humans","pdf","table"],"latest_commit_sha":null,"homepage":"https://camelot-py.readthedocs.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/atlanhq.png","metadata":{"files":{"readme":"README.md","changelog":"HISTORY.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-06-18T11:48:49.000Z","updated_at":"2025-12-19T12:18:02.000Z","dependencies_parsed_at":"2023-02-04T07:00:54.869Z","dependency_job_id":null,"html_url":"https://github.com/atlanhq/camelot","commit_stats":null,"previous_names":["socialcopsdev/camelot"],"tags_count":18,"template":false,"template_full_name":null,"purl":"pkg:github/atlanhq/camelot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atlanhq%2Fcamelot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atlanhq%2Fcamelot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atlanhq%2Fcamelot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atlanhq%2Fcamelot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/atlanhq","download_url":"https://codeload.github
.com/atlanhq/camelot/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atlanhq%2Fcamelot/sbom","scorecard":{"id":214705,"data":{"date":"2025-08-11","repo":{"name":"github.com/atlanhq/camelot","commit":"cd8ac7979fe3631866fe439f07e9d6aaa5b1e5c6"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":1.8,"checks":[{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Maintained","score":0,"reason":"project is archived","details":["Warn: Repository is archived."],"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":1,"reason":"Found 3/22 approved changesets -- score normalized to 1","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous 
patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":9,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Warn: project license file does not contain an FSF or OSI license."],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no 
releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":0,"reason":"19 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: PYSEC-2021-856 / GHSA-5545-2q6w-2gh6","Warn: Project is vulnerable to: GHSA-6p56-wp2h-9hxr","Warn: Project is vulnerable to: PYSEC-2019-108 / GHSA-9fq2-x9r6-wfmf","Warn: Project is vulnerable to: PYSEC-2021-857 / GHSA-f7c7-j99h-c22f","Warn: Project is vulnerable to: GHSA-fpfv-jqm9-f5jm","Warn: Project is vulnerable to: GHSA-3448-vrgh-85xr","Warn: Project is vulnerable to: GHSA-8849-5h85-98qw","Warn: Project is vulnerable to: GHSA-fm39-cw8h-3p63","Warn: Project is vulnerable to: GHSA-fw99-f933-rgh8","Warn: Project is vulnerable to: GHSA-hxfw-jm98-v4mq","Warn: Project is vulnerable to: GHSA-jggw-2q6g-c3m6","Warn: Project is vulnerable to: GHSA-m6vm-8g8v-xfjh","Warn: Project is vulnerable to: GHSA-q799-q27x-vp7w","Warn: Project is vulnerable to: GHSA-qr4w-53vh-m672","Warn: Project is vulnerable to: GHSA-x3rm-644h-67m8","Warn: Project is vulnerable 
to: PYSEC-2023-183","Warn: Project is vulnerable to: PYSEC-2020-73","Warn: Project is vulnerable to: GHSA-jrm6-h9cq-8gqw","Warn: Project is vulnerable to: PYSEC-2022-194 / GHSA-xcjx-m2pj-8g79"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 12 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-17T01:26:26.556Z","repository_id":41176322,"created_at":"2025-08-17T01:26:26.556Z","updated_at":"2025-08-17T01:26:26.556Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28434059,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T18:57:19.464Z","status":"ssl_error","status_checked_at":"2026-01-14T18:52:48.501Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["extract","for-humans","pdf","table"],"created_at":"2024-07-31T15:01:31.074Z","updated_at":"2026-01-14T19:59:43.752Z","avatar_url":"https://github.com/atlanhq.png","language":"Python","funding_links":[],"categories":["Python","Table detection","数据读写与提取","Data Extraction"],"sub_categories":["Form Segmentation"],"readme":"\u003cp align=\"center\"\u003e\n   \u003cimg src=\"https://raw.githubusercontent.com/camelot-dev/camelot/master/docs/_static/camelot.png\" width=\"200\"\u003e\n\u003c/p\u003e\n\n# Camelot: PDF Table Extraction for Humans\n\n[![Build Status](https://travis-ci.org/camelot-dev/camelot.svg?branch=master)](https://travis-ci.org/camelot-dev/camelot) [![Documentation Status](https://readthedocs.org/projects/camelot-py/badge/?version=master)](https://camelot-py.readthedocs.io/en/master/)\n [![codecov.io](https://codecov.io/github/camelot-dev/camelot/badge.svg?branch=master\u0026service=github)](https://codecov.io/github/camelot-dev/camelot?branch=master)\n [![image](https://img.shields.io/pypi/v/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/l/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/pyversions/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![Gitter chat](https://badges.gitter.im/camelot-dev/Lobby.png)](https://gitter.im/camelot-dev/Lobby)\n[![image](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)\n\n\n**Camelot** is a Python library 
that makes it easy for *anyone* to extract tables from PDF files!\n\n**Note:** You can also check out [Excalibur](https://github.com/camelot-dev/excalibur), which is a web interface for Camelot!\n\n---\n\n**Here's how you can extract tables from PDF files.** Check out the PDF used in this example [here](https://github.com/camelot-dev/camelot/blob/master/docs/_static/pdf/foo.pdf).\n\n\u003cpre\u003e\n\u003e\u003e\u003e import camelot\n\u003e\u003e\u003e tables = camelot.read_pdf('foo.pdf')\n\u003e\u003e\u003e tables\n\u0026lt;TableList n=1\u0026gt;\n\u003e\u003e\u003e tables.export('foo.csv', f='csv', compress=True) # json, excel, html, sqlite\n\u003e\u003e\u003e tables[0]\n\u0026lt;Table shape=(7, 7)\u0026gt;\n\u003e\u003e\u003e tables[0].parsing_report\n{\n    'accuracy': 99.02,\n    'whitespace': 12.24,\n    'order': 1,\n    'page': 1\n}\n\u003e\u003e\u003e tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_sqlite\n\u003e\u003e\u003e tables[0].df # get a pandas DataFrame!\n\u003c/pre\u003e\n\n| Cycle Name | KI (1/km) | Distance (mi) | Percent Fuel Savings |                 |                 |                |\n|------------|-----------|---------------|----------------------|-----------------|-----------------|----------------|\n|            |           |               | Improved Speed       | Decreased Accel | Eliminate Stops | Decreased Idle |\n| 2012_2     | 3.30      | 1.3           | 5.9%                 | 9.5%            | 29.2%           | 17.4%          |\n| 2145_1     | 0.68      | 11.2          | 2.4%                 | 0.1%            | 9.5%            | 2.7%           |\n| 4234_1     | 0.59      | 58.7          | 8.5%                 | 1.3%            | 8.5%            | 3.3%           |\n| 2032_2     | 0.17      | 57.8          | 21.7%                | 0.3%            | 2.7%            | 1.2%           |\n| 4171_1     | 0.07      | 173.9         | 58.1%                | 1.6%            | 2.1%            | 0.5%           |\n\nThere's a 
[command-line interface](https://camelot-py.readthedocs.io/en/master/user/cli.html) too!\n\n**Note:** Camelot only works with text-based PDFs and not scanned documents. (As Tabula [explains](https://github.com/tabulapdf/tabula#why-tabula), \"If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based\".)\n\n## Why Camelot?\n\n- **You are in control.** Unlike other libraries and tools that either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)\n- *Bad* tables can be discarded based on **metrics** like accuracy and whitespace, without ever having to manually look at each table.\n- Each table is a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873).\n- **Export** to multiple formats, including JSON, Excel, HTML and SQLite.\n\nSee [comparison with other PDF table extraction libraries and tools](https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).\n\n## Installation\n\n### Using conda\n\nThe easiest way to install Camelot is with [conda](https://conda.io/docs/), which is a package manager and environment management system for the [Anaconda](http://docs.continuum.io/anaconda/) distribution.\n\n\u003cpre\u003e\n$ conda install -c conda-forge camelot-py\n\u003c/pre\u003e\n\n### Using pip\n\nAfter [installing the dependencies](https://camelot-py.readthedocs.io/en/master/user/install-deps.html) ([tk](https://packages.ubuntu.com/trusty/python-tk) and [ghostscript](https://www.ghostscript.com/)), you can simply use pip to install Camelot:\n\n\u003cpre\u003e\n$ pip install camelot-py[cv]\n\u003c/pre\u003e\n\n### From the source code\n\nAfter [installing the 
dependencies](https://camelot-py.readthedocs.io/en/master/user/install.html#using-pip), clone the repo using:\n\n\u003cpre\u003e\n$ git clone https://www.github.com/camelot-dev/camelot\n\u003c/pre\u003e\n\nand install Camelot using pip:\n\n\u003cpre\u003e\n$ cd camelot\n$ pip install \".[cv]\"\n\u003c/pre\u003e\n\n## Documentation\n\nGreat documentation is available at [http://camelot-py.readthedocs.io/](http://camelot-py.readthedocs.io/).\n\n## Development\n\nThe [Contributor's Guide](https://camelot-py.readthedocs.io/en/master/dev/contributing.html) has detailed information about contributing code, documentation, tests and more. We've included some basic information in this README.\n\n### Source code\n\nYou can check the latest sources with:\n\n\u003cpre\u003e\n$ git clone https://www.github.com/camelot-dev/camelot\n\u003c/pre\u003e\n\n### Setting up a development environment\n\nYou can install the development dependencies easily, using pip:\n\n\u003cpre\u003e\n$ pip install camelot-py[dev]\n\u003c/pre\u003e\n\n### Testing\n\nAfter installation, you can run tests using:\n\n\u003cpre\u003e\n$ python setup.py test\n\u003c/pre\u003e\n\n## Versioning\n\nCamelot uses [Semantic Versioning](https://semver.org/). For the available versions, see the tags on this repository. For the changelog, you can check out [HISTORY.md](https://github.com/camelot-dev/camelot/blob/master/HISTORY.md).\n\n## License\n\nThis project is licensed under the MIT License, see the [LICENSE](https://github.com/camelot-dev/camelot/blob/master/LICENSE) file for details.\n\n\u003cimg src=\"https://user-images.githubusercontent.com/408863/66741678-a78ab780-ee93-11e9-8d90-b274af222339.png\" align=\"centre\" /\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatlanhq%2Fcamelot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fatlanhq%2Fcamelot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatlanhq%2Fcamelot/lists"}