{"id":26333038,"url":"https://github.com/astrabert/pdfitdown","last_synced_at":"2026-01-31T22:10:27.221Z","repository":{"id":270408673,"uuid":"910297017","full_name":"AstraBert/PdfItDown","owner":"AstraBert","description":"Convert Everything to PDF","archived":false,"fork":false,"pushed_at":"2025-12-31T16:01:03.000Z","size":9127,"stargazers_count":214,"open_issues_count":0,"forks_count":24,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-26T10:46:12.860Z","etag":null,"topics":["csv","docx","html","json","markdown","package","pdf","pdf-conversion","powerpoint","pypi","python","xml"],"latest_commit_sha":null,"homepage":"https://pdfitdown.eu","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AstraBert.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-12-30T23:06:01.000Z","updated_at":"2026-01-25T05:54:58.000Z","dependencies_parsed_at":"2024-12-30T23:47:56.437Z","dependency_job_id":"17a15885-9a45-4f48-b49c-f257ba1c4689","html_url":"https://github.com/AstraBert/PdfItDown","commit_stats":null,"previous_names":["astrabert/pdfitdown"],"tags_count":23,"template":false,"template_full_name":null,"purl":"pkg:github/AstraBert/PdfItDown","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraBert%2FPdfItDown","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraBert%2FPdfItDown/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/AstraBert%2FPdfItDown/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraBert%2FPdfItDown/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AstraBert","download_url":"https://codeload.github.com/AstraBert/PdfItDown/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraBert%2FPdfItDown/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28957155,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-31T18:30:42.805Z","status":"ssl_error","status_checked_at":"2026-01-31T18:30:19.593Z","response_time":128,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","docx","html","json","markdown","package","pdf","pdf-conversion","powerpoint","pypi","python","xml"],"created_at":"2025-03-15T23:37:43.390Z","updated_at":"2026-01-31T22:10:27.216Z","avatar_url":"https://github.com/AstraBert.png","language":"Python","funding_links":["https://github.com/sponsors/AstraBert"],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\u003ch1\u003ePdfItDown\u003c/h1\u003e\n\u003ch2\u003eConvert Everything to PDF\u003c/h2\u003e\n\u003c/div\u003e\n\u003cbr\u003e\n\u003cdiv align=\"center\"\u003e\n    \u003ca href=\"https://discord.gg/AXcVf269\"\u003e\u003cimg 
src=\"https://img.shields.io/badge/Discord-%235865F2.svg?style=for-the-badge\u0026logo=discord\u0026logoColor=white\" alt=\"Join Discord Server\" width=200 height=60\u003e\u003c/a\u003e\n\u003c/div\u003e\n\u003cbr\u003e\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"https://raw.githubusercontent.com/AstraBert/PdfItDown/main/img/logo.png\" alt=\"PdfItDown Logo\"\u003e\n\u003c/div\u003e\n\n**PdfItDown** is a python package that relies on [`markitdown` by Microsoft](https://github.com/microsoft/markitdown/), [`markdown_pdf`](https://github.com/vb64/markdown-pdf) and [img2pdf](https://pypi.org/project/img2pdf/). Visit us on our [documentation website](https://pdfitdown.eu)!\n\n### Applicability\n\n**PdfItDown** is applicable to the following file formats:\n\n- Markdown\n- PowerPoint\n- Word\n- Excel\n- HTML\n- Text-based formats (CSV, XML, JSON)\n- ZIP files (iterates over contents)\n- Image files (PNG, JPG)\n\nThe format-specific support needs to be evaluated for the specific reader you are using.\n\n### How does it work?\n\n**PdfItDown** works in a very simple way:\n\n- From **markdown** to PDF (default)\n\n```mermaid\ngraph LR\n2(Input File) --\u003e 3[Markdown content]\n3[Markdown content] --\u003e 4[markdown-pdf]\n4[markdown-pdf] --\u003e 5(PDF file)\n```\n\n- From **image** to PDF (default)\n\n```mermaid\ngraph LR\n2(Input File) --\u003e 3[Bytes]\n3[Bytes] --\u003e 4[img2pdf]\n4[img2pdf] --\u003e 5(PDF file)\n```\n\n- From other **text-based** file formats or **unstructured** file formats to PDF (default)\n\n```mermaid\ngraph LR\n2(Input File) --\u003e  3[MarkitDown]\n3[MarkitDown] --\u003e  4[Markdown content]\n4[Markdown content] --\u003e 5[markdown-pdf]\n5[markdown-pdf] --\u003e 6(PDF file)\n```\n\n- Using a **custom conversion callback**\n\n```mermaid\ngraph LR\n2(Input File) --\u003e  3[Conversion Callback]\n3[Conversion Callback] --\u003e 4(PDF file)\n```\n\n### Installation and Usage\n\nTo install **PdfItDown**, just run:\n\n```bash\npip install 
pdfitdown\n```\n\nYou can now use the **command line tool**:\n\n```\nUsage: pdfitdown [OPTIONS]\n\n  Convert (almost) everything to PDF\n\nOptions:\n  -i, --inputfile TEXT   Path to the input file(s) that need to be converted\n                         to PDF. Can be used multiple times.\n  -o, --outputfile TEXT  Path to the output PDF file(s). If more than one\n                         input file is provided, you should provide an equal\n                         number of output files.\n  -t, --title TEXT       Title to include in the PDF metadata. Default: 'File\n                         Converted with PdfItDown'. If more than one file is\n                         provided, it will be ignored.\n  -d, --directory TEXT   Directory whose files you want to bulk-convert to\n                         PDF. If `--inputfile` is also provided, this option\n                         will be ignored. Defaults to None.\n  --help                 Show this message and exit.\n```\n\nFor example:\n\n```bash\npdfitdown -i README.md -o README.pdf -t \"README\"\n```\n\nOr you can use it **inside your Python scripts**:\n\n```python\nfrom pdfitdown.pdfconversion import Converter\n\nconverter = Converter()\nconverter.convert(file_path = \"business_growth.md\", output_path = \"business_growth.pdf\", title=\"Business Growth for Q3 in 2024\")\nconverter.convert(file_path = \"logo.png\", output_path = \"logo.pdf\")\nconverter.convert(file_path = \"users.xlsx\", output_path = \"users.pdf\")\n```\n\nYou can also convert **multiple files at once**:\n\n- In the CLI:\n\n```bash\n# with custom output paths\npdfitdown -i test0.png -i test1.md -o testoutput0.pdf -o testoutput1.pdf\n# with inferred output paths\npdfitdown -i test0.png -i test1.csv\n```\n\n- In the Python API:\n\n```python\nfrom pdfitdown.pdfconversion import Converter\n\nconverter = Converter()\n# with custom output paths\nconverter.multiple_convert(file_paths = [\"business_growth.md\", \"logo.png\"], output_paths = 
[\"business_growth.pdf\", \"logo.pdf\"])\n# with inferred output paths\nconverter.multiple_convert(file_paths = [\"business_growth.md\", \"logo.png\"])\n```\n\nYou can bulk-convert **all the files in a directory**:\n\n- In the CLI:\n\n```bash\npdfitdown -d tests/data/testdir\n```\n\n- In the Python API:\n\n```python\nfrom pdfitdown.pdfconversion import Converter\n\nconverter = Converter()\noutput_paths = converter.convert_directory(directory_path = \"tests/data/testdir\")\nprint(output_paths)\n```\n\nIn the Python API you can also define a **custom callback for the conversion**. In this example, we use Google Gemini to summarize a file and save the summary as a PDF:\n\n```python\nfrom pathlib import Path\nfrom pdfitdown.pdfconversion import Converter\nfrom markdown_pdf import MarkdownPdf, Section\nfrom google import genai\n\nclient = genai.Client()\n\ndef conversion_callback(input_file: str, output_file: str, title: str | None = None, overwrite: bool = True):\n    uploaded_file = client.files.upload(file=Path(input_file))\n    response = client.models.generate_content(\n        model=\"gemini-2.0-flash\",\n        contents=[\"Give me a summary of this file.\", uploaded_file],\n    )\n    content = response.text\n    pdf = MarkdownPdf(toc_level=0)\n    pdf.add_section(Section(content))\n    pdf.meta[\"title\"] = title or \"Summary by Gemini\"\n    pdf.save(output_file)\n    return output_file\n\nconverter = Converter(conversion_callback=conversion_callback)\nconverter.convert(file_path = \"business_growth.md\", output_path = \"business_growth.pdf\", title=\"Business Growth for Q3 in 2024\")\n```\n\nThe Python API also lets you mount PdfItDown conversion features into a backend server built with Starlette and Starlette-compatible frameworks (such as FastAPI):\n\n```python\nfrom starlette.applications import Starlette\nfrom starlette.requests import Request\nfrom starlette.responses import PlainTextResponse\nfrom starlette.routing import Route\nfrom pdfitdown.pdfconversion import Converter\nfrom pdfitdown.server import mount\n\nasync def hello_world(request: Request) -\u003e PlainTextResponse:\n    return PlainTextResponse(content=\"hello world!\")\n\n# Starlette expects a list of routes\nroutes = [Route(\"/helloworld\", hello_world)]\napp = Starlette(routes=routes)\n\napp = mount(app, converter=Converter(), path=\"/conversions/pdf\", name=\"pdfitdown\")\n```\n\nNow you can send file payloads to the `/conversions/pdf` endpoint through POST requests and receive the converted PDF in the response body:\n\n```python\nimport httpx\n\nwith open(\"file.txt\", \"rb\") as f:\n    content = f.read()\n\nfiles = {\"file_upload\": (\"file.txt\", content, \"text/plain\")}\n\nwith httpx.Client() as client:\n    response = client.post(\"http://localhost:80/conversions/pdf\", files=files)\n\n    assert response.status_code == 200\n    with open(\"file.pdf\", \"wb\") as f:\n        f.write(response.content)\n```\n\n\n### Contributing\n\nContributions are always welcome!\n\nFind contribution guidelines at [CONTRIBUTING.md](https://github.com/AstraBert/PdfItDown/tree/main/CONTRIBUTING.md).\n\n### License and Funding\n\nThis project is open-source and is provided under an [MIT License](https://github.com/AstraBert/PdfItDown/tree/main/LICENSE).\n\nIf you found it useful, please consider [funding it](https://github.com/sponsors/AstraBert).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrabert%2Fpdfitdown","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fastrabert%2Fpdfitdown","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrabert%2Fpdfitdown/lists"}