{"id":26279692,"url":"https://github.com/alephdata/pdflib","last_synced_at":"2025-05-07T03:04:17.436Z","repository":{"id":51388345,"uuid":"129079165","full_name":"alephdata/pdflib","owner":"alephdata","description":"Binary Python bindings for poppler utils for content extraction","archived":false,"fork":false,"pushed_at":"2021-05-12T12:23:17.000Z","size":2442,"stargazers_count":42,"open_issues_count":5,"forks_count":5,"subscribers_count":18,"default_branch":"master","last_synced_at":"2025-05-07T03:03:38.300Z","etag":null,"topics":["pdflib","poppler","python-bindings"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alephdata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-04-11T10:51:29.000Z","updated_at":"2024-01-28T15:55:58.000Z","dependencies_parsed_at":"2022-09-02T06:44:31.172Z","dependency_job_id":null,"html_url":"https://github.com/alephdata/pdflib","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alephdata%2Fpdflib","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alephdata%2Fpdflib/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alephdata%2Fpdflib/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alephdata%2Fpdflib/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alephdata","download_url":"https://codeload.github.com/alephdata/pdflib/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"
github","repositories_count":252804206,"owners_count":21806769,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pdflib","poppler","python-bindings"],"created_at":"2025-03-14T14:16:00.402Z","updated_at":"2025-05-07T03:04:17.393Z","avatar_url":"https://github.com/alephdata.png","language":"Python","readme":"pdflib\n-------\n\n[![Build Status](https://travis-ci.org/alephdata/pdflib.svg?branch=master)](https://travis-ci.org/alephdata/pdflib)\n\nPython binding for poppler.\n\n## Installation\n\nUsing pip: `pip install pdflib`\n\nFrom source:\n\n- Clone the poppler source code and compile it:\n\n```\ngit clone --branch poppler-0.63.0 --depth 1 https://anongit.freedesktop.org/git/poppler/poppler.git poppler_src\ncd poppler_src/\ncmake -DENABLE_SPLASH=OFF -DBUILD_GTK_TESTS=OFF -DENABLE_UTILS=OFF -DENABLE_LIBOPENJPEG=none .\nmake\n```\n\n- Set the `POPPLER_ROOT` environment variable\n\n```\nexport POPPLER_ROOT=/pdflib/poppler_src/\n```\n\n- Install cython\n\n```\npip install cython\n```\n\n- Build the extension\n\n```\npython setup.py build_ext --inplace\n```\n\n## Usage\n\n```\n\u003e\u003e\u003e from pdflib import Document\n\u003e\u003e\u003e doc = Document(\"path/to/file.pdf\")\n```\n\nGetting metadata\n\n```\n\u003e\u003e\u003e print(doc.metadata)\n\u003e\u003e\u003e print(doc.xmp_metadata)\n```\n\nGetting text content of each page\n\n```\n\u003e\u003e\u003e for page in doc:\n...     print(' \\n'.join(page.lines).strip())\n```\n\nGetting images from each page\n\n```\n\u003e\u003e\u003e for page in doc:\n...     page.extract_images(path='images', prefix='img')\n```\n\nLICENSE\n-------\npdflib 
is available under GPL v3 (poppler is GPL).","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falephdata%2Fpdflib","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falephdata%2Fpdflib","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falephdata%2Fpdflib/lists"}