Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.
https://github.com/ad-si/awesome-scanning

A curated list of awesome projects to simplify and improve paper and document scanning.
https://github.com/ad-si/awesome-scanning
List: awesome-scanning
Host: GitHub
URL: https://github.com/ad-si/awesome-scanning
Owner: ad-si
License: isc
Created: 2016-05-22T08:43:52.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2024-04-19T19:04:03.000Z (7 months ago)
Last Synced: 2024-05-02T05:41:04.443Z (7 months ago)
Topics: book-digitization, book-scanner, book-scanning, digitization, dms, document-scanner, page-scanning, scanned-documents, scanner, scanning
Homepage:
Size: 83 KB
Stars: 347
Watchers: 13
Forks: 21
Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: license

README

# Awesome Scanning

A curated list of awesome projects to simplify and improve paper scanning.

> [!TIP]
> ^{Sponsored by:} \
> **Perspec - Desktop app to correct the perspective of images.** \
> _{🌐 [Get Perspec](https://feram.gumroad.com/l/perspec)} \
> _{🖥️ [github.com/ad-si/Perspec](https://github.com/ad-si/Perspec)}

---

**Table Of Contents**

- [Websites](#websites)

- [Apps](#apps)

  - [Desktop](#desktop)

    - [Cross Platform](#cross-platform)

    - [MacOS](#macos)

  - [Mobile](#mobile)

    - [iOS](#ios)

    - [Android](#android)

- [Posts](#posts)

- [Discussions](#discussions)

- [Software Libraries](#software-libraries)

- [Document Management](#document-management)

- [Research](#research)

  - [Ishikawa Watanabe Laboratory - High-speed digital archiving](#ishikawa-watanabe-laboratory---high-speed-digital-archiving)

- [Devices](#devices)

## Websites

- [DIY Book Scanner] - Community of people who build book scanners.

- [Docutain] - SDK for document & barcode scanning and data capturing.

- [Eagle Doc] - Invoice and receipt recognition as a service.

[DIY Book Scanner]: https://diybookscanner.org

[Docutain]: https://docutain.com

[Eagle Doc]: https://www.eagle-doc.com

## Apps

### Desktop

#### Cross Platform

- [Book Scanning] - Book scanner software for home-made scanners (no license).

- [BookDrive Editor Pro] -

    Software for post-processing images of books (commercial).

- [Voussoir] - Single-camera solution for book scanning (open source).

- [Booksorber] - Processes camera images of book pages (commercial).

- [Decapod] - Web application frontend for image processing and capture tools.

- [DxO Viewpoint] - Correct perspective distortions in images (commercial).

- [Easy Scan] - Scanning software for book2net scanners (commercial).

- [LIMB] -

    Project inventory, image processing, quality control, OCR,

    document structuring and multiple format exporting

    for long-term archiving (commercial).

- [Nidaba] - Expandable and scalable OCR pipeline.

- [OpenCV-Document-Scanner] -

    Interactive document scanner built with Python and OpenCV.

- [Page Improver] - Automatic image enhancing software for page scanning.

- [Perspec] - Manually correct the perspective of images.

- [Readiris 17] - OCR software to digitize papers, images, or PDF files.

- [ScanGate LWF] - Stand-alone software for book digitization (commercial).

- [ScanTailor] -

    Interactive post-processing tool for scanned pages (open source).

- [ScanTailor Advanced] -

    Merges features of forks, adds new features, and includes fixes.

- [Skarynka] - Software to scan and process images to build books.

- [YASW] - Yet Another Scan Wizard (open source).

- [scanner] - Document scanner for the web built in Rust.

[Book Scanning]: https://github.com/Canta/book-scanning

[BookDrive Editor Pro]: https://atiz.com/bookdrive-editor-pro/

[Voussoir]: https://github.com/jglev/voussoir

[Booksorber]: http://booksorber.com

[Decapod]: https://github.com/Decapod/decapod

[DxO Viewpoint]: https://www.dxo.com/dxo-viewpoint/

[Easy Scan]: https://book2net.net/en/2021/06/30/easy-scan/

[LIMB]: https://www.limbsuite.com

[Nidaba]: https://github.com/openphilology/nidaba

[OpenCV-Document-Scanner]: https://github.com/andrewdcampbell/OpenCV-Document-Scanner

[Page Improver]: http://4digitalbooks.com/_soft_imaget.php

[Perspec]: https://github.com/ad-si/Perspec

[Readiris 17]: https://iriscorporate.com/softwares/readiris-17/

[ScanGate LWF]: https://www.treventus.com/software/image-processing-automation

[ScanTailor Advanced]: https://github.com/4lex4/scantailor-advanced

[ScanTailor]: https://scantailor.org

[Skarynka]: https://github.com/alex73/Skarynka

[YASW]: https://sourceforge.net/projects/yascanw/

[scanner]: https://github.com/101arrowz/scanner

#### MacOS

- [Plumb-Bob] - Perspective rectifier (macOS app).

- [Prizmo] - Turn photos into scans by adjusting perspective, cropping, etc. (macOS app).

[Plumb-Bob]: https://fitplot.it/plumb-bob/

[Prizmo]: https://creaceed.com/prizmo

### Mobile

#### iOS

- [CamScanner] - Scan any kind of document.

- [Doc OCR] - PDF scanner with document image dewarping.

- [Doc Scan] - Turn your iPhone / iPad into a portable scanner and PDF editor.

- [Genius Scan] - A scanner in your pocket.

- [IRIScan] - Scan documents with your iPhone or iPad.

    Trims, enhances and makes pictures of whiteboards and docs readable.

- [Quick Scan] - Scan, Recognize, Automate.

- [Scanbot] - High quality scans with one tap.

- [Scannable] - Scan contracts, receipts and business cards.

- [Scanner Pro] - Scan paper documents into PDFs.

- [vFlat] - Capture documents, forms, receipts, books and convert them into high-quality PDFs.

[CamScanner]: https://www.camscanner.com/

[Doc OCR]: https://ifunplay.com/dococr.html

[Doc Scan]: https://ifunplay.com/docscan.html

[Genius Scan]: https://thegrizzlylabs.com/genius-scan

[IRIScan]: https://www.irislink.com/EN-ROW/c1102/IRIScan-iOS---OCR-App-for-iOS.aspx

[Quick Scan]: https://www.quickscanapp.com/

[Scanbot]: https://scanbot.io

[Scannable]: https://apps.apple.com/us/app/evernote-scannable/id883338188

[Scanner Pro]: https://readdle.com/scannerpro

[vFlat]: https://www.vflat.com/en

#### Android

- [Adobe Scan] - Scan, OCR, and edit documents (account required).

- [Google Drive] - Use the camera to scan documents (does not support loading existing photos).

- [Microsoft Lens] - Trim, enhance, and make photos of whiteboards and documents readable.

- [Open Camera] - Extensive open source camera app.

- [PDF-Doc-Scan] - Open source Android PDF document scanning app.

- [Stack] - PDF scanner, document organizer, and detail finder by Google's Area 120.

[Adobe Scan]: https://play.google.com/store/apps/details?id=com.adobe.scan.android

[Google Drive]: https://support.google.com/drive/answer/3145835?co=GENIE.Platform%3DAndroid

[Microsoft Lens]: https://play.google.com/store/apps/details?id=com.microsoft.office.officelens

[Open Camera]: https://opencamera.org.uk

[PDF-Doc-Scan]: https://github.com/LittleTrickster/PDF-Doc-Scan

[Stack]: https://play.google.com/store/apps/details?id=com.area120.paperwork

## Posts

- [Dewarping pages]

- [Document scanner] -

    How to Build a Kick-Ass Mobile Document Scanner in Just 5 Minutes.

- [Genetic programming in the cloud]

- [Keypoint Detection with Transfer Learning][keypoint]

- [math.stackexchange] -

    Compute ratio of a rectangle seen from an unknown perspective.

- [Noteshrink] - Compressing and enhancing hand-written notes.

- [Page dewarping] - Flattening images of curled pages.

- [Perspective transform] - 4 Point OpenCV getPerspective Transform Example.

- [Stackoverflow] - Proportions of a perspective-deformed rectangle.

- [Unpaper] - Post-processing tool for scanned sheets of paper.

- [pdfsandwich] - CLI tool using OCR to add text to image PDFs.

- [rbgg] - Remove background from images of paper.

- [Unprojecting text with ellipses] -

    Using transformed ellipses to estimate perspective transformations of text.

- [Document-Dewarping with Control-Points] - Dewarping of document images using control points.

- [Building an image processing pipeline with Python]

[Dewarping pages]: https://halfbakedmaker.org/blog/366

[Document scanner]: https://pyimagesearch.com/2014/09/01/build-kick-ass-mobile-document-scanner-just-5-minutes/

[Genetic programming in the cloud]: https://halfbakedmaker.org/blog/382

[keypoint]: https://keras.io/examples/vision/keypoint_detection/

[math.stackexchange]: https://math.stackexchange.com/questions/1339924/compute-ratio-of-a-rectangle-seen-from-an-unknown-perspective

[Noteshrink]: https://mzucker.github.io/2016/09/20/noteshrink.html

[Page dewarping]: https://mzucker.github.io/2016/08/15/page-dewarping.html

[Perspective transform]: https://pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/

[Stackoverflow]: https://stackoverflow.com/questions/1194352/proportions-of-a-perspective-deformed-rectangle

[Unpaper]: https://github.com/Flameeyes/unpaper

[pdfsandwich]: http://www.tobias-elze.de/pdfsandwich/

[rbgg]: https://github.com/fogleman/rbgg

[Unprojecting text with ellipses]: https://mzucker.github.io/2016/10/11/unprojecting-text-with-ellipses.html

[Document-Dewarping with Control-Points]: https://github.com/gwxie/Document-Dewarping-with-Control-Points

[Building an image processing pipeline with Python]: https://pyvideo.org/pycon-us-2013/building-an-image-processing-pipeline-with-python.html
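
Several of the posts above deal with mapping the four corners of a photographed page onto an upright rectangle. The following is a minimal sketch of that idea using only NumPy. It is not the code from any of the linked posts; the function names and the sample corner coordinates are made up for illustration.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Solve for the 3x3 matrix H that maps four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each point pair gives two linear equations in the eight unknowns of H
        # (the ninth entry is fixed to 1).
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, points):
    """Map an (N, 2) list of points through H using homogeneous coordinates."""
    pts = np.hstack([np.asarray(points, dtype=float), np.ones((len(points), 1))])
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical page corners in a photo (clockwise from top left)
# and the rectangle they should land on.
photo_corners = [(12, 30), (410, 18), (440, 560), (5, 590)]
flat_corners = [(0, 0), (400, 0), (400, 550), (0, 550)]
H = homography_from_corners(photo_corners, flat_corners)
```

To rectify a whole image, the inverse of `H` would be evaluated for every output pixel to look up the matching source pixel. Image libraries such as OpenCV provide ready-made routines for that resampling step.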

## Discussions

- [Methods To Sense The 3D Surface/Structure Of A Book](https://diybookscanner.org/forum/viewtopic.php?f=17&t=788)

- [Robust Reading Competition] - Detection and recognition challenges for text in scene images.

- [What are the most common algorithms for adaptive thresholding?][alg-thresh]

[alg-thresh]: https://dsp.stackexchange.com/questions/2411/what-are-the-most-common-algorithms-for-adaptive-thresholding

[Robust Reading Competition]: https://rrc.cvc.uab.es
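
As a rough illustration of the adaptive thresholding topic raised above, here is a sketch of one simple scheme: comparing each pixel to the mean of its neighborhood. It is not taken from the linked discussion. The function name, the default `window` and `offset` values, and the synthetic test page are assumptions chosen for this example, and `window` is assumed to be odd.

```python
import numpy as np


def adaptive_mean_threshold(gray, window=15, offset=10):
    """Binarize a grayscale image by comparing each pixel to its local mean.

    A pixel becomes white (255) if it is brighter than the mean of the
    surrounding window x window block minus offset, otherwise black (0).
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    pad = window // 2
    # Repeat edge pixels so that border pixels also get a full window.
    padded = np.pad(gray, pad, mode="edge")
    # Integral image with a leading row and column of zeros.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    # Sum over every window via four lookups in the integral image.
    total = (
        integral[window:window + h, window:window + w]
        - integral[:h, window:window + w]
        - integral[window:window + h, :w]
        + integral[:h, :w]
    )
    local_mean = total / (window * window)
    return np.where(gray > local_mean - offset, 255, 0).astype(np.uint8)


# Synthetic "page": light background with one dark horizontal stroke.
page = np.full((40, 40), 200.0)
page[18:22, 5:35] = 50.0
binary = adaptive_mean_threshold(page)
```

Because the threshold follows the local brightness, this kind of approach tends to cope better with uneven lighting than a single global cutoff, at the cost of having to tune the window size to the stroke width.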

## Software Libraries

- [CornerAPI] - Detect torn corners and edges in document images.

- [doc2text] - Bulk detect text blocks and OCR scanned PDFs.

- [Empty_training] - Train neural network to detect empty pages in document images.

- [EmptyAPI] - Detect empty pages in document images.

- [FaultyImageAPI] - Combines [CornerAPI], [EmptyAPI], [PostitAPI], and [WritingtypeAPI].

- [imgwarp-js] - Warp images using JavaScript.

- [Laser Book Scanning] - Experimental methods for dewarping document images based on the use of lasers.

- [LCNN] - End-to-End Wireframe Parsing.

- [Pixelnetica] - Document Scanning SDK for business apps.

- [PostitAPI] - Detect post-it/sticky notes in document images.

- [PyThreshold] - Implementations of state-of-the-art image thresholding algorithms.

- [Segment Anything] - AI model that can cut out any object in any image.

- [Table_segmentation] - Segment table structures and detect text content in document images.

- [Train_document_classification] - Train a neural network to classify input documents based on the type/format.

- [Train_fault_detection] - Train a neural network to detect faults (e.g. folded corners, sticky notes, …) in document images.

- [Train_writing_type] - Train a neural network to classify document images by writing type (handwritten, typewritten).

- [WeScan] - Library to add scanning functionalities to an iOS app.

- [WritingtypeAPI] - Classify document based on the writing type (handwritten, typewritten).

[CornerAPI]: https://github.com/DALAI-project/CornerAPI

[doc2text]: https://github.com/jlsutherland/doc2text

[Empty_training]: https://github.com/DALAI-project/Empty_training

[EmptyAPI]: https://github.com/DALAI-project/EmptyAPI

[FaultyImageAPI]: https://github.com/DALAI-project/FaultyImageAPI

[imgwarp-js]: https://github.com/cxcxcxcx/imgwarp-js

[Laser Book Scanning]: https://github.com/duerig/laser-dewarp

[LCNN]: https://github.com/zhou13/lcnn

[Pixelnetica]: https://www.pixelnetica.com/

[PostitAPI]: https://github.com/DALAI-project/PostitAPI

[PyThreshold]: https://github.com/manuelaguadomtz/pythreshold

[Segment Anything]: https://github.com/facebookresearch/segment-anything

[Table_segmentation]: https://github.com/DALAI-project/Table_segmentation

[Train_fault_detection]: https://github.com/DALAI-project/Train_fault_detection

[Train_writing_type]: https://github.com/DALAI-project/Train_writing_type

[WeScan]: https://github.com/WeTransfer/WeScan

[WritingtypeAPI]: https://github.com/DALAI-project/WritingtypeAPI

## Document Management

- [Docspell] - Document management system for private and small business use.

- [Hermes] - Open source document management system by HashiCorp.

- [Mayan EDMS] - Libre document management system.

- [OpenPaper.work] - Scan and import personal documents.

- [Paperless NGX] - Scan, index, and archive paper documents.

- [Papermerge] - Open source document management system for digital archives.

- [Polar] - Knowledge manager for web pages, textbooks, and PDFs.

- [TagSpaces] - Offline & open source document manager with tagging support.

- [Teedy] - Lightweight document management system for individuals & businesses.

[Docspell]: https://github.com/eikek/docspell

[Hermes]: https://github.com/hashicorp-forge/hermes

[Mayan EDMS]: https://github.com/mayan-edms/Mayan-EDMS

[OpenPaper.work]: https://www.openpaper.work/en/

[Paperless NGX]: https://github.com/paperless-ngx/paperless-ngx

[Paperless]: https://github.com/the-paperless-project/paperless

[Papermerge]: https://github.com/ciur/papermerge

[Polar]: https://getpolarized.io

[TagSpaces]: https://github.com/tagspaces/tagspaces

[Teedy]: https://github.com/sismics/docs

## Research

- [Whiteboard scanning] -

    Whiteboard scanning and image enhancement by Zhengyou Zhang and Li-Wei He.

- [Cam params] -

    Determining camera parameters from the perspective projection of a rectangle

    by Robert M. Haralick. (PDF)

- [Dewarping of document images] -

    Two-Step Dewarping of Camera Document Images

    by N. Stamatopoulos, B. Gatos, I. Pratikakis & S. J. Perantonis

- [Dewarping of Document Images using Coupled-Snakes]

- [DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks][DewarpNet]

- [Image and Depth from a Conventional Camera with a Coded Aperture]

- [The IUPR Dataset of Camera-Captured Document Images][iupr-dataset]

- [Image processing via level set curvature flow]

- [OCR Datasets] - Collection of OCR-related datasets.

[Whiteboard scanning]: https://www.microsoft.com/en-us/research/publication/whiteboard-scanning-image-enhancement/

[Cam params]: https://www.haralick.org/journals/determining_camera_parameters_rectangle.pdf

[Dewarping of document images]: https://users.iit.demokritos.gr/~bgat/3337a209.pdf

[Dewarping of Document Images using Coupled-Snakes]: https://www.semanticscholar.org/paper/Dewarping-of-Document-Images-using-Coupled-Snakes-Bukhari-Shafait/3865964b607a1ecfb0979b0fb30c5aec4a2cfcf2

[DewarpNet]: https://github.com/cvlab-stonybrook/DewarpNet

[Image and Depth from a Conventional Camera with a Coded Aperture]: https://groups.csail.mit.edu/graphics/CodedAperture/

[iupr-dataset]: https://www.dfki.de/en/web/research/projects-and-publications/publication/5681/

[Image processing via level set curvature flow]: https://math.berkeley.edu/~sethian/2006/Papers/sethian.imageprocessinglevelset.pnas.pdf

[OCR Datasets]: https://github.com/xinke-wang/OCRDatasets

### Ishikawa Watanabe Laboratory - High-speed digital archiving

[Vision architecture overview]

- [Book Flipping Scanning]

- [BFS-Auto: High Speed & High Definition Book Scanner]

- [Real-time 3D Page Tracking and Book Status Recognition]

- [High-speed and High-definition Document Digitalization System based on Adaptive Scanning using Real-time 3D Sensing]

- [Automatic page turner machine for Book Flipping Scanning]

- [Document Digitization and its Quality Improvement using a Multi-camera Array]

- [Digitization of Deformed Documents using a High-speed Multi-camera Array]

- [BFS-Solo: High Speed Book Digitization using Monocular Video]

- [Reconstruction of 3D Surface and Restoration of Flat document Image from Monocular Image Sequence]

- [High-accuracy rectification technique of deformed document image using Tiled Rectangle Fragments (TRFs)]

- [Document Image Rectification using Advance Knowledge of 3D Deformation]

- [Estimation of Non-rigid Surface Deformation using Developable Surface Model]

- [Proof-of-concept prototype for Book Flipping Scanning]

[Automatic page turner machine for Book Flipping Scanning]: https://ishikawa-vision.org/vision/BFSflipper/index-e.html

[BFS-Auto: High Speed & High Definition Book Scanner]: https://ishikawa-vision.org/vision/BFS-Auto/index-e.html

[BFS-Solo: High Speed Book Digitization using Monocular Video]: https://ishikawa-vision.org/vision/BFS-Solo/index-e.html

[Book Flipping Scanning]: https://ishikawa-vision.org/vision/BFS/index-e.html

[Digitization of Deformed Documents using a High-speed Multi-camera Array]: https://ishikawa-vision.org/vision/MultiBFS/index-e.html

[Document Digitization and its Quality Improvement using a Multi-camera Array]:  https://ishikawa-vision.org/vision/MultiBFS_boundary/index-e.html

[Document Image Rectification using Advance Knowledge of 3D Deformation]: https://ishikawa-vision.org/vision/BFS_learn/index-e.html

[Estimation of Non-rigid Surface Deformation using Developable Surface Model]: https://ishikawa-vision.org/vision/developable/index-e.html

[High-accuracy rectification technique of deformed document image using Tiled Rectangle Fragments (TRFs)]: https://ishikawa-vision.org/vision/TRF/index-e.html

[High-speed and High-definition Document Digitalization System based on Adaptive Scanning using Real-time 3D Sensing]: https://ishikawa-vision.org/vision/HybridBFS/index-e.html

[Proof-of-concept prototype for Book Flipping Scanning]: https://ishikawa-vision.org/vision/BookFlipScan/index-e.html

[Real-time 3D Page Tracking and Book Status Recognition]: https://ishikawa-vision.org/vision/BFSPageTracking/index-e.html

[Reconstruction of 3D Surface and Restoration of Flat document Image from Monocular Image Sequence]: https://ishikawa-vision.org/vision/MonoBFS/index-e.html

[Vision architecture overview]: https://ishikawa-vision.org/vision/index-e.html

## Devices

- [Archivist] - V-shaped platen based book scanner (open source).

- [book2net] - Book scanners for libraries and archives (commercial).

- [Linear Book Scanner] - Low-cost page-turning book scanner (open source).

- [Portable Scanners] - Several portable scanning devices (commercial).

[Archivist]: https://diybookscanner.org/archivist/

[book2net]: https://book2net.net/en/products/

[Linear Book Scanner]: https://linearbookscanner.org

[Portable Scanners]: https://www.irislink.com/EN-ROW/c1080/IRIScan---Portable-scanners---Discover-our-range.aspx