{"id":17125226,"url":"https://github.com/baltpeter/scanprep","last_synced_at":"2025-05-07T04:44:20.131Z","repository":{"id":57464167,"uuid":"328184917","full_name":"baltpeter/scanprep","owner":"baltpeter","description":"Small utility to prepare scanned documents. Supports separating PDF files by separator pages and removing blank pages.","archived":false,"fork":false,"pushed_at":"2024-08-13T00:10:15.000Z","size":524,"stargazers_count":32,"open_issues_count":4,"forks_count":11,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-05-01T00:00:00.203Z","etag":null,"topics":["hacktoberfest","image-processing","pdf","scanned-documents","scanning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/baltpeter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-01-09T15:26:17.000Z","updated_at":"2024-11-27T18:12:46.000Z","dependencies_parsed_at":"2022-09-05T05:30:38.005Z","dependency_job_id":null,"html_url":"https://github.com/baltpeter/scanprep","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baltpeter%2Fscanprep","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baltpeter%2Fscanprep/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baltpeter%2Fscanprep/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baltpeter%2Fscanprep/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/baltpeter","download_url":"https://codeload.github.com/baltpeter/scanprep/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252231029,"owners_count":21715469,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hacktoberfest","image-processing","pdf","scanned-documents","scanning"],"created_at":"2024-10-14T18:44:29.140Z","updated_at":"2025-05-07T04:44:20.108Z","avatar_url":"https://github.com/baltpeter.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# scanprep – Prepare scanned PDF documents\n\n\u003e Small utility to prepare scanned documents. Supports separating PDF files by separator pages and removing blank pages.\n\n\u003c!-- TODO: GIF showing how to use scanprep --\u003e\n\nScanprep can be used to prepare scanned documents for further processing with existing tools (like the great [OCRmyPDF](https://github.com/jbarlow83/OCRmyPDF)) or directly for archival. It allows splitting multiple documents that were scanned in a single batch into multiple files. In addition, it can also remove blank pages from the output (this is especially helpful if using a duplex scanner).\n\nFor document separation, separator pages need to be inserted between the different documents before scanning. These pages tell the program where to split. You can either use the [included separator page](/separator-page.pdf) or create your own. The separator page simply needs to have a barcode that encodes the text `SCANPREP_SEP` (you can use any [barcode type supported by zbar](http://zbar.sourceforge.net/about.html)).\n\n## Installation\n\n### Via Snap\n\nYou can install scanprep from the [Snap Store](https://snapcraft.io/scanprep):\n\n```sh\nsnap install scanprep\n\nscanprep -h\n```\n\n### Via PyPI\n\nYou can install scanprep using `pip` (consider doing that in a venv):\n\n```sh\npip3 install scanprep\n\n# If you see an error like \"ImportError: Unable to find zbar shared library\", you need to install zbar yourself. See: https://pypi.org/project/pyzbar/\nscanprep -h\n```\n\n### From source\n\nTo install scanprep from source, clone this repository and install the dependencies:\n\n```sh\ngit clone https://github.com/baltpeter/scanprep.git\ncd scanprep\npip3 install -r requirements.txt # You may want to do this in a venv.\n# You may also need to install the zbar shared library. See: https://pypi.org/project/pyzbar/\n\npython3 scanprep/scanprep.py -h\n```\n\n## Usage\n\nMost simply, you can run scanprep via `scanprep \u003cfilename.pdf\u003e`. This will process the input file and output the results into your current working directory. To specify a different output directory, use `scanprep \u003cfilename.pdf\u003e \u003coutput_directory\u003e`.  \nThe output files will be called `0-\u003cfilename.pdf\u003e`, `1-\u003cfilename.pdf\u003e`, and so on.\n\nBy default, both page separation and blank page removal will be performed. To turn them off, use `--no-page-separation` or `--no-blank-removal`, respectively.\n\nUse `scanprep -h` to show the help:\n\n```\nusage: scanprep [-h] [--page-separation] [--blank-removal] input_pdf [output_dir]\n\npositional arguments:\n  input_pdf             The PDF document to process.\n  output_dir            The directory where the output documents will be saved. (defaults to the\n                        current directory)\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --page-separation, --no-page-separation\n                        Do (or do not) split document into separate files by the included\n                        separator pages. (default yes)\n  --blank-removal, --no-blank-removal\n                        Do (or do not) remove empty pages from the output. (default yes)\n```\n\n## License\n\nScanprep is licensed under the MIT license, see the [`LICENSE`](/LICENSE) file for details. Issues and pull requests are welcome!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaltpeter%2Fscanprep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbaltpeter%2Fscanprep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaltpeter%2Fscanprep/lists"}