{"id":25900852,"url":"https://github.com/pixelastic/pietro","last_synced_at":"2025-08-04T16:36:42.323Z","repository":{"id":33164839,"uuid":"153744141","full_name":"pixelastic/pietro","owner":"pixelastic","description":"π  Utilities to split PDF files into smaller files, generate thumbnails and extract textual content.","archived":false,"fork":false,"pushed_at":"2025-02-26T13:10:39.000Z","size":3745,"stargazers_count":7,"open_issues_count":3,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-26T14:24:17.144Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pixelastic.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-10-19T07:35:00.000Z","updated_at":"2025-02-05T21:38:20.000Z","dependencies_parsed_at":"2024-02-23T20:24:36.691Z","dependency_job_id":"e4161cc8-57d6-4f04-8e20-1f4810ecc2c7","html_url":"https://github.com/pixelastic/pietro","commit_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pixelastic%2Fpietro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pixelastic%2Fpietro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pixelastic%2Fpietro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pixelastic%2Fpietro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pix
elastic","download_url":"https://codeload.github.com/pixelastic/pietro/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241596325,"owners_count":19988060,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-03T02:19:26.739Z","updated_at":"2025-03-03T02:19:27.242Z","avatar_url":"https://github.com/pixelastic.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# π Pietro\n\nA set of utilities for working with PDF files: splitting them into individual pages\nand extracting text and images.\n\nThis module relies on CLI binaries (qpdf, imagemagick, poppler-utils, etc.). To\nprovide good interoperability, it runs every command from inside a Docker\ncontainer.\n\nIt will be slower than other implementations, but it is portable.\n\n_This package was created out of the need to centralize common patterns of\nPDF handling I needed for various projects. 
The way it names files and folders\nmakes sense to me, but it might not suit you._\n\n```js\nimport { extractPages, extractImages, extractText } from 'pietro';\n\n// Extract each page of the PDF into its own file\nawait extractPages('./bigFile.pdf', './pages');\n// ./pages/\n// ./pages/001.pdf\n// ./pages/...\n// ./pages/401.pdf\n\n// Extract the text of a given PDF into a txt file\nawait extractText('./pages/001.pdf', './text/001.txt');\n// ./text/\n// ./text/001.txt\n\n// Extract all images from a given PDF file.\n// ./raw contains ALL images, including presentational ones and masks\n// ./illustrations only contains \"real\" illustration images\nawait extractImages('./pages/001.pdf', './images');\n// ./images/\n// ./images/001/\n// ./images/001/raw/\n// ./images/001/raw/001.png\n// ./images/001/raw/...\n// ./images/001/raw/227.png\n// ./images/001/illustrations/\n// ./images/001/illustrations/1.png\n// ./images/001/illustrations/...\n// ./images/001/illustrations/5.png\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpixelastic%2Fpietro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpixelastic%2Fpietro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpixelastic%2Fpietro/lists"}