https://github.com/ad-si/awesome-scanning

A curated list of awesome projects to simplify and improve paper and document scanning.

# Awesome Scanning

A curated list of awesome projects to simplify and improve paper scanning.

> [!TIP]
> Sponsored by: \
> **Perspec - Desktop app to correct the perspective of images.** \
> 🌐 [Get Perspec](https://feram.gumroad.com/l/perspec) \
> 🖥️ [github.com/ad-si/Perspec](https://github.com/ad-si/Perspec)

---

**Table Of Contents**

- [Websites](#websites)
- [Apps](#apps)
  - [Desktop](#desktop)
    - [Cross Platform](#cross-platform)
    - [macOS](#macos)
  - [Mobile](#mobile)
    - [iOS](#ios)
    - [Android](#android)
- [Posts](#posts)
- [Discussions](#discussions)
- [Software Libraries](#software-libraries)
- [Document Management](#document-management)
- [Research](#research)
  - [Ishikawa Watanabe Laboratory - High-speed digital archiving](#ishikawa-watanabe-laboratory---high-speed-digital-archiving)
- [Devices](#devices)

## Websites

- [DIY Book Scanner] - Community of people who build book scanners.
- [Docutain] - SDK for document & barcode scanning and data capturing.
- [Eagle Doc] - Invoice and receipt recognition as a service.

[DIY Book Scanner]: https://diybookscanner.org
[Docutain]: https://docutain.com
[Eagle Doc]: https://www.eagle-doc.com

## Apps

### Desktop

#### Cross Platform

- [Book Scanning] - Book scanning software for home-made scanners (no license).
- [BookDrive Editor Pro] -
Software for post-processing images of books (commercial).
- [Voussoir] - Single-camera solution for book scanning (open source).
- [Booksorber] - Processes camera images of book pages (commercial).
- [Decapod] - Web application frontend for image processing and capture tools.
- [DxO Viewpoint] - Correct perspective distortions in images (commercial).
- [Easy Scan] - Scanning software for book2net scanners (commercial).
- [LIMB] -
Project inventory, image processing, quality control, OCR,
document structuring and multiple format exporting
for long-term archiving (commercial).
- [Nidaba] - Expandable and scalable OCR pipeline.
- [OpenCV-Document-Scanner] -
Interactive document scanner built with Python and OpenCV.
- [Page Improver] - Automatic image enhancing software for page scanning.
- [Perspec] - Manually correct the perspective of images.
- [Readiris 17] - OCR software to digitize papers, images, or PDF files.
- [ScanGate LWF] - Stand-alone software for book digitization (commercial).
- [ScanTailor] -
Interactive post-processing tool for scanned pages (open source).
- [ScanTailor Advanced] -
Merges features of forks, adds new features, and includes fixes.
- [Skarynka] - Software to scan and process images to build books.
- [YASW] - Yet Another Scan Wizard (open source).
- [scanner] - Document scanner for the web built in Rust.

[Book Scanning]: https://github.com/Canta/book-scanning
[BookDrive Editor Pro]: https://atiz.com/bookdrive-editor-pro/
[Voussoir]: https://github.com/jglev/voussoir
[Booksorber]: http://booksorber.com
[Decapod]: https://github.com/Decapod/decapod
[DxO Viewpoint]: https://www.dxo.com/dxo-viewpoint/
[Easy Scan]: https://book2net.net/en/2021/06/30/easy-scan/
[LIMB]: https://www.limbsuite.com
[Nidaba]: https://github.com/openphilology/nidaba
[OpenCV-Document-Scanner]: https://github.com/andrewdcampbell/OpenCV-Document-Scanner
[Page Improver]: http://4digitalbooks.com/_soft_imaget.php
[Perspec]: https://github.com/ad-si/Perspec
[Readiris 17]: https://iriscorporate.com/softwares/readiris-17/
[ScanGate LWF]: https://www.treventus.com/software/image-processing-automation
[ScanTailor Advanced]: https://github.com/4lex4/scantailor-advanced
[ScanTailor]: https://scantailor.org
[Skarynka]: https://github.com/alex73/Skarynka
[YASW]: https://sourceforge.net/projects/yascanw/
[scanner]: https://github.com/101arrowz/scanner

#### macOS

- [Plumb-Bob] - Perspective rectifier (macOS app).
- [Prizmo] - Turn photos into scans by adjusting perspective, cropping, etc. (macOS app).

[Plumb-Bob]: https://fitplot.it/plumb-bob/
[Prizmo]: https://creaceed.com/prizmo

### Mobile

#### iOS

- [CamScanner] - Scan any kind of document.
- [Doc OCR] - PDF scanner with document image dewarping.
- [Doc Scan] - Turn your iPhone / iPad into a portable scanner and PDF editor.
- [Genius Scan] - A scanner in your pocket.
- [IRIScan] - Scan documents with your iPhone or iPad.
Trims, enhances and makes pictures of whiteboards and docs readable.
- [Quick Scan] - Scan, Recognize, Automate.
- [Scanbot] - High quality scans with one tap.
- [Scannable] - Scan contracts, receipts and business cards.
- [Scanner Pro] - Scan paper documents into PDFs.
- [vFlat] - Capture documents, forms, receipts, books and convert them into high-quality PDFs.

[CamScanner]: https://www.camscanner.com/
[Doc OCR]: https://ifunplay.com/dococr.html
[Doc Scan]: https://ifunplay.com/docscan.html
[Genius Scan]: https://thegrizzlylabs.com/genius-scan
[IRIScan]: https://www.irislink.com/EN-ROW/c1102/IRIScan-iOS---OCR-App-for-iOS.aspx
[Quick Scan]: https://www.quickscanapp.com/
[Scanbot]: https://scanbot.io
[Scannable]: https://apps.apple.com/us/app/evernote-scannable/id883338188
[Scanner Pro]: https://readdle.com/scannerpro
[vFlat]: https://www.vflat.com/en

#### Android

- [Adobe Scan] - Scan, OCR, and edit documents (account required).
- [Google Drive] - Use the camera to scan documents (does not support loading existing photos).
- [Microsoft Lens] - Trims, enhances, and makes photos of whiteboards and documents readable.
- [Open Camera] - Extensive open source camera app.
- [PDF-Doc-Scan] - Open source Android PDF document scanning app.
- [Stack] - PDF scanner, document organizer, and detail finder by Google's Area 120.

[Adobe Scan]: https://play.google.com/store/apps/details?id=com.adobe.scan.android
[Google Drive]: https://support.google.com/drive/answer/3145835?co=GENIE.Platform%3DAndroid
[Microsoft Lens]: https://play.google.com/store/apps/details?id=com.microsoft.office.officelens
[Open Camera]: https://opencamera.org.uk
[PDF-Doc-Scan]: https://github.com/LittleTrickster/PDF-Doc-Scan
[Stack]: https://play.google.com/store/apps/details?id=com.area120.paperwork

## Posts

- [Dewarping pages]
- [Document scanner] -
How to Build a Kick-Ass Mobile Document Scanner in Just 5 Minutes.
- [Genetic programming in the cloud]
- [Keypoint Detection with Transfer Learning][keypoint]
- [math.stackexchange] -
Compute ratio of a rectangle seen from an unknown perspective.
- [Noteshrink] - Compressing and enhancing hand-written notes.
- [Page dewarping] - Flattening images of curled pages.
- [Perspective transform] - 4 Point OpenCV getPerspective Transform Example.
- [Stackoverflow] - Proportions of a perspective-deformed rectangle.
- [Unpaper] - Post-processing tool for scanned sheets of paper.
- [pdfsandwich] - CLI tool using OCR to add text to image PDFs.
- [rbgg] - Remove background from images of paper.
- [Unprojecting text with ellipses] -
Using transformed ellipses to estimate perspective transformations of text.
- [Document-Dewarping with Control-Points] - Dewarping of document images using control points.
- [Building an image processing pipeline with Python]

[Dewarping pages]: https://halfbakedmaker.org/blog/366
[Document scanner]: https://pyimagesearch.com/2014/09/01/build-kick-ass-mobile-document-scanner-just-5-minutes/
[Genetic programming in the cloud]: https://halfbakedmaker.org/blog/382
[keypoint]: https://keras.io/examples/vision/keypoint_detection/
[math.stackexchange]: https://math.stackexchange.com/questions/1339924/compute-ratio-of-a-rectangle-seen-from-an-unknown-perspective
[Noteshrink]: https://mzucker.github.io/2016/09/20/noteshrink.html
[Page dewarping]: https://mzucker.github.io/2016/08/15/page-dewarping.html
[Perspective transform]: https://pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/
[Stackoverflow]: https://stackoverflow.com/questions/1194352/proportions-of-a-perspective-deformed-rectangle
[Unpaper]: https://github.com/Flameeyes/unpaper
[pdfsandwich]: http://www.tobias-elze.de/pdfsandwich/
[rbgg]: https://github.com/fogleman/rbgg
[Unprojecting text with ellipses]: https://mzucker.github.io/2016/10/11/unprojecting-text-with-ellipses.html
[Document-Dewarping with Control-Points]: https://github.com/gwxie/Document-Dewarping-with-Control-Points
[Building an image processing pipeline with Python]:
https://pyvideo.org/pycon-us-2013/building-an-image-processing-pipeline-with-python.html

## Discussions

- [Methods To Sense The 3D Surface/Structure Of A Book](
https://diybookscanner.org/forum/viewtopic.php?f=17&t=788)
- [Robust Reading Competition] - Detection and recognition challenges for text in scene images.
- [What are the most common algorithms for adaptive thresholding?][alg-thresh]

[alg-thresh]: https://dsp.stackexchange.com/questions/2411/what-are-the-most-common-algorithms-for-adaptive-thresholding
[Robust Reading Competition]: https://rrc.cvc.uab.es

## Software Libraries

- [CornerAPI] - Detect torn corners and edges in document images.
- [doc2text] - Bulk detect text blocks and OCR scanned PDFs.
- [Empty_training] - Train neural network to detect empty pages in document images.
- [EmptyAPI] - Detect empty pages in document images.
- [FaultyImageAPI] - Combines [CornerAPI], [EmptyAPI], [PostitAPI], and [WritingtypeAPI].
- [imgwarp-js] - Warp images using JavaScript.
- [Laser Book Scanning] - Experimental methods for dewarping document images based on the use of lasers.
- [LCNN] - End-to-End Wireframe Parsing.
- [Pixelnetica] - Document Scanning SDK for business apps.
- [PostitAPI] - Detect post-it/sticky notes in document images.
- [PyThreshold] - Implementations of state-of-the-art image thresholding algorithms.
- [Segment Anything] - AI model that can cut out any object in any image.
- [Table_segmentation] - Segment table structures and detect text content in document images.
- [Train_document_classification] - Train a neural network to classify input documents based on the type/format.
- [Train_fault_detection] - Train a neural network to detect faults (e.g. folded corners, sticky notes, …) in document images.
- [Train_writing_type] - Train a neural network to classify document images by writing type (handwritten, typewritten).
- [WeScan] - Library to add scanning functionalities to an iOS app.
- [WritingtypeAPI] - Classify document based on the writing type (handwritten, typewritten).

[CornerAPI]: https://github.com/DALAI-project/CornerAPI
[doc2text]: https://github.com/jlsutherland/doc2text
[Empty_training]: https://github.com/DALAI-project/Empty_training
[EmptyAPI]: https://github.com/DALAI-project/EmptyAPI
[FaultyImageAPI]: https://github.com/DALAI-project/FaultyImageAPI
[imgwarp-js]: https://github.com/cxcxcxcx/imgwarp-js
[Laser Book Scanning]: https://github.com/duerig/laser-dewarp
[LCNN]: https://github.com/zhou13/lcnn
[Pixelnetica]: https://www.pixelnetica.com/
[PostitAPI]: https://github.com/DALAI-project/PostitAPI
[PyThreshold]: https://github.com/manuelaguadomtz/pythreshold
[Segment Anything]: https://github.com/facebookresearch/segment-anything
[Table_segmentation]: https://github.com/DALAI-project/Table_segmentation
[Train_fault_detection]: https://github.com/DALAI-project/Train_fault_detection
[Train_writing_type]: https://github.com/DALAI-project/Train_writing_type
[WeScan]: https://github.com/WeTransfer/WeScan
[WritingtypeAPI]: https://github.com/DALAI-project/WritingtypeAPI

## Document Management

- [Docspell] - Document management system for private and small business use.
- [Hermes] - Open source document management system by HashiCorp.
- [Mayan EDMS] - Libre document management system.
- [OpenPaper.work] - Scan and import personal documents.
- [Paperless NGX] - Scan, index, and archive paper documents.
- [Papermerge] - Open source document management system for digital archives.
- [Polar] - Knowledge manager for web pages, textbooks, and PDFs.
- [TagSpaces] - Offline & open source document manager with tagging support.
- [Teedy] - Lightweight document management system for individuals & businesses.

[Docspell]: https://github.com/eikek/docspell
[Hermes]: https://github.com/hashicorp-forge/hermes
[Mayan EDMS]: https://github.com/mayan-edms/Mayan-EDMS
[OpenPaper.work]: https://www.openpaper.work/en/
[Paperless NGX]: https://github.com/paperless-ngx/paperless-ngx
[Paperless]: https://github.com/the-paperless-project/paperless
[Papermerge]: https://github.com/ciur/papermerge
[Polar]: https://getpolarized.io
[TagSpaces]: https://github.com/tagspaces/tagspaces
[Teedy]: https://github.com/sismics/docs

## Research

- [Whiteboard scanning] -
Whiteboard scanning and image enhancement by Zhengyou Zhang and Li-Wei He
- [Cam params] -
Determining camera parameters from the perspective projection of a rectangle
by Robert M. Haralick. (PDF)
- [Dewarping of document images] -
Two-Step Dewarping of Camera Document Images
by N. Stamatopoulos, B. Gatos, I. Pratikakis & S. J. Perantonis
- [Dewarping of Document Images using Coupled-Snakes]
- [DewarpNet: Single-Image Document Unwarping
With Stacked 3D and 2D Regression Networks][DewarpNet]
- [Image and Depth from a Conventional Camera with a Coded Aperture]
- [The IUPR Dataset of Camera-Captured Document Images][iupr-dataset]
- [Image processing via level set curvature flow]
- [OCR Datasets] - Collection of OCR-related datasets.

[Whiteboard scanning]: https://www.microsoft.com/en-us/research/publication/whiteboard-scanning-image-enhancement/
[Cam params]: https://www.haralick.org/journals/determining_camera_parameters_rectangle.pdf
[Dewarping of document images]: https://users.iit.demokritos.gr/~bgat/3337a209.pdf
[Dewarping of Document Images using Coupled-Snakes]: https://www.semanticscholar.org/paper/Dewarping-of-Document-Images-using-Coupled-Snakes-Bukhari-Shafait/3865964b607a1ecfb0979b0fb30c5aec4a2cfcf2
[DewarpNet]: https://github.com/cvlab-stonybrook/DewarpNet
[Image and Depth from a Conventional Camera with a Coded Aperture]:
https://groups.csail.mit.edu/graphics/CodedAperture/
[iupr-dataset]:
https://www.dfki.de/en/web/research/projects-and-publications/publication/5681/
[Image processing via level set curvature flow]:
https://math.berkeley.edu/~sethian/2006/Papers/sethian.imageprocessinglevelset.pnas.pdf
[OCR Datasets]: https://github.com/xinke-wang/OCRDatasets

### Ishikawa Watanabe Laboratory - High-speed digital archiving

[Vision architecture overview]

- [Book Flipping Scanning]
- [BFS-Auto: High Speed & High Definition Book Scanner]
- [Real-time 3D Page Tracking and Book Status Recognition]
- [High-speed and High-definition Document Digitalization System based on Adaptive Scanning using Real-time 3D Sensing]
- [Automatic page turner machine for Book Flipping Scanning]
- [Document Digitization and its Quality Improvement using a Multi-camera Array]
- [Digitization of Deformed Documents using a High-speed Multi-camera Array]
- [BFS-Solo: High Speed Book Digitization using Monocular Video]
- [Reconstruction of 3D Surface and Restoration of Flat document Image from Monocular Image Sequence]
- [High-accuracy rectification technique of deformed document image using Tiled Rectangle Fragments (TRFs)]
- [Document Image Rectification using Advance Knowledge of 3D Deformation]
- [Estimation of Non-rigid Surface Deformation using Developable Surface Model]
- [Proof-of-concept prototype for Book Flipping Scanning]

[Automatic page turner machine for Book Flipping Scanning]: https://ishikawa-vision.org/vision/BFSflipper/index-e.html
[BFS-Auto: High Speed & High Definition Book Scanner]: https://ishikawa-vision.org/vision/BFS-Auto/index-e.html
[BFS-Solo: High Speed Book Digitization using Monocular Video]: https://ishikawa-vision.org/vision/BFS-Solo/index-e.html
[Book Flipping Scanning]: https://ishikawa-vision.org/vision/BFS/index-e.html
[Digitization of Deformed Documents using a High-speed Multi-camera Array]: https://ishikawa-vision.org/vision/MultiBFS/index-e.html
[Document Digitization and its Quality Improvement using a Multi-camera Array]: https://ishikawa-vision.org/vision/MultiBFS_boundary/index-e.html
[Document Image Rectification using Advance Knowledge of 3D Deformation]: https://ishikawa-vision.org/vision/BFS_learn/index-e.html
[Estimation of Non-rigid Surface Deformation using Developable Surface Model]: https://ishikawa-vision.org/vision/developable/index-e.html
[High-accuracy rectification technique of deformed document image using Tiled Rectangle Fragments (TRFs)]: https://ishikawa-vision.org/vision/TRF/index-e.html
[High-speed and High-definition Document Digitalization System based on Adaptive Scanning using Real-time 3D Sensing]: https://ishikawa-vision.org/vision/HybridBFS/index-e.html
[Proof-of-concept prototype for Book Flipping Scanning]: https://ishikawa-vision.org/vision/BookFlipScan/index-e.html
[Real-time 3D Page Tracking and Book Status Recognition]: https://ishikawa-vision.org/vision/BFSPageTracking/index-e.html
[Reconstruction of 3D Surface and Restoration of Flat document Image from Monocular Image Sequence]: https://ishikawa-vision.org/vision/MonoBFS/index-e.html
[Vision architecture overview]: https://ishikawa-vision.org/vision/index-e.html

## Devices

- [Archivist] - V-shaped platen-based book scanner (open source).
- [book2net] - Book scanners for libraries and archives (commercial).
- [Linear Book Scanner] - Low-cost page-turning book scanner (open source).
- [Portable Scanners] - Several portable scanning devices (commercial).

[Archivist]: https://diybookscanner.org/archivist/
[book2net]: https://book2net.net/en/products/
[Linear Book Scanner]: https://linearbookscanner.org
[Portable Scanners]: https://www.irislink.com/EN-ROW/c1080/IRIScan---Portable-scanners---Discover-our-range.aspx