{"id":41619643,"url":"https://github.com/l2ysho/afpp","last_synced_at":"2026-02-01T14:02:00.598Z","repository":{"id":257780595,"uuid":"859272020","full_name":"l2ysho/afpp","owner":"l2ysho","description":"A fast, efficient, and minimal PDF parser for Node.js. Zero bloat. One dependency. Production-ready.","archived":false,"fork":false,"pushed_at":"2026-01-24T11:09:41.000Z","size":2958,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-24T22:39:08.902Z","etag":null,"topics":["pdf","pdfjs","pdfparser","pdftoimage","pdftoimg","pdftotext"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/l2ysho.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-09-18T11:38:52.000Z","updated_at":"2026-01-24T11:09:42.000Z","dependencies_parsed_at":"2024-09-19T18:14:01.063Z","dependency_job_id":"68d3e3a2-801e-43af-bc05-d39c28636b34","html_url":"https://github.com/l2ysho/afpp","commit_stats":null,"previous_names":["l2ysho/afpp"],"tags_count":36,"template":false,"template_full_name":null,"purl":"pkg:github/l2ysho/afpp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/l2ysho%2Fafpp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/l2ysho%2Fafpp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/l2ysho%2Fafpp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/l2ysho%2Fafpp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/l2ysho","download_url":"https://codeload.github.com/l2ysho/afpp/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/l2ysho%2Fafpp/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28980159,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-01T13:38:33.235Z","status":"ssl_error","status_checked_at":"2026-02-01T13:38:32.912Z","response_time":56,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pdf","pdfjs","pdfparser","pdftoimage","pdftoimg","pdftotext"],"created_at":"2026-01-24T13:28:23.305Z","updated_at":"2026-02-01T14:02:00.592Z","avatar_url":"https://github.com/l2ysho.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# afpp\n\n![Version](https://img.shields.io/github/v/release/l2ysho/afpp)\n[![codecov](https://codecov.io/github/l2ysho/afpp/graph/badge.svg?token=2PE32I4M9K)](https://codecov.io/github/l2ysho/afpp)\n![Node](https://img.shields.io/badge/node-%3E%3D%2022.14.0-brightgreen.svg)\n![npm Downloads](https://img.shields.io/npm/dt/afpp.svg)\n![Repo Size](https://img.shields.io/github/repo-size/l2ysho/afpp)\n![Last Commit](https://img.shields.io/github/last-commit/l2ysho/afpp.svg)\n\n\u003e **afpp** — A modern, dependency-light PDF parser for Node.js.\n\u003e\n\u003e Built for performance, reliability, and developer sanity.\n\n---\n\n## Overview\n\n`afpp` (Another PDF Parser, Properly) is a Node.js library for extracting text and images from PDF files without heavyweight native dependencies, event-loop blocking, or fragile runtime assumptions.\n\nThe project was created to address recurring problems encountered with existing PDF tooling in the Node.js ecosystem:\n\n- Excessive bundle sizes and transitive dependencies\n- Native build steps (canvas, ImageMagick, Ghostscript)\n- Browser-specific assumptions (`window`, DOM, canvas)\n- Poor TypeScript support\n- Unreliable handling of encrypted PDFs\n- Performance and memory inefficiencies\n\n`afpp` focuses on **predictable behavior**, **explicit APIs**, and **production-ready defaults**.\n\n---\n\n## Key Features\n\n- Zero native build dependencies\n- Fully asynchronous, non-blocking architecture\n- First-class TypeScript support\n- Supports local files, buffers, and remote URLs\n- Handles encrypted PDFs\n- Configurable concurrency and rendering scale\n- Minimal and auditable dependency graph\n\n---\n\n## Requirements\n\n- **Node.js** \u003e= 22.14.0\n\n---\n\n## Installation\n\nInstall using your preferred package manager:\n\n```bash\nnpm install afpp\n# or\nyarn add afpp\n# or\npnpm add afpp\n```\n\n---\n\n## Quick Start\n\nAll parsing functions accept the same input types:\n\n- `string` (file path)\n- `Buffer`\n- `URL`\n\n### Extract Text from a PDF\n\n```ts\nimport { readFile } from 'fs/promises';\nimport path from 'path';\n\nimport { pdf2string } from 'afpp';\n\n(async () =\u003e {\n  const filePath = path.join('..', 'test', 'example.pdf');\n  const buffer = await readFile(filePath);\n\n  const pages = await pdf2string(buffer);\n  console.log(pages); // ['Page 1 text', 'Page 2 text', ...]\n})();\n```\n\n---\n\n### Render PDF Pages as Images\n\n```ts\nimport { pdf2image } from 'afpp';\n\n(async () =\u003e {\n  const url = new URL('https://pdfobject.com/pdf/sample.pdf');\n  const images = await pdf2image(url);\n\n  console.log(images); // [Buffer, Buffer, ...]\n})();\n```\n\n---\n\n### Streaming API (Large PDFs)\n\nFor large PDFs, use streaming functions to process pages incrementally without loading all results into memory:\n\n```ts\nimport { writeFile } from 'fs/promises';\n\nimport { streamPdf2image, streamPdf2string } from 'afpp';\n\n// Stream images - process each page as it's rendered\nfor await (const { pageNumber, pageCount, data } of streamPdf2image(\n  './large.pdf',\n)) {\n  await writeFile(`page-${pageNumber}.png`, data);\n  console.log(`Processed ${pageNumber}/${pageCount}`);\n}\n\n// Stream text - process each page as it's extracted\nfor await (const { pageNumber, data } of streamPdf2string('./large.pdf')) {\n  console.log(`Page ${pageNumber}: ${data.substring(0, 100)}...`);\n}\n```\n\n**Benefits:**\n\n- Lower peak memory usage\n- Faster time-to-first-result\n- Built-in progress tracking via `pageNumber` and `pageCount`\n\n---\n\n### Low-Level Parsing API\n\nFor advanced use cases, `parsePdf` exposes page-level control and transformation.\n\n```ts\nimport { parsePdf } from 'afpp';\n\n(async () =\u003e {\n  const response = await fetch('https://pdfobject.com/pdf/sample.pdf');\n  const buffer = Buffer.from(await response.arrayBuffer());\n\n  const result = await parsePdf(buffer, {}, (pageContent) =\u003e pageContent);\n  console.log(result);\n})();\n```\n\n---\n\n## Configuration\n\nAll public APIs accept a shared options object.\n\n```ts\nconst result = await parsePdf(buffer, {\n  concurrency: 5,\n  imageEncoding: 'jpeg',\n  password: 'STRONG_PASS',\n  scale: 4,\n});\n```\n\n### AfppParseOptions\n\n| Option          | Type                                  | Default | Description                                   |\n| --------------- | ------------------------------------- | ------- | --------------------------------------------- |\n| `concurrency`   | `number`                              | `1`     | Number of pages processed in parallel         |\n| `imageEncoding` | `'png' \\| 'jpeg' \\| 'webp' \\| 'avif'` | `'png'` | Output format for rendered images             |\n| `password`      | `string`                              | —       | Password for encrypted PDFs                   |\n| `scale`         | `number`                              | `1.0`   | Rendering scale (1.0 = 72 DPI, 2.0 = 144 DPI) |\n\n---\n\n## Design Principles\n\n- **Node-first**: No browser globals or DOM assumptions\n- **Explicit over implicit**: No magic configuration\n- **Fail fast**: Clear errors instead of silent corruption\n- **Production-oriented**: Optimized for long-running processes\n\n---\n\n## License\n\nMIT © Richard Solár\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fl2ysho%2Fafpp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fl2ysho%2Fafpp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fl2ysho%2Fafpp/lists"}