= Auto-scan docs and send them
:icons: font
ifdef::env-github[]
:tip-caption: :bulb:
:note-caption: :information_source:
:important-caption: :heavy_exclamation_mark:
:caution-caption: :fire:
:warning-caption: :warning:
endif::[]

== Description
Auto-scan is a tool that automatically scans documents from an HP printer/scanner, runs OCR on the files, optimizes and rotates them, and finally sends them by email as an attachment.

== Architectures

Based on the Python 3.6 slim-buster image, it supports:

* [x] `linux/386`
* [x] `linux/amd64`
* [x] `linux/arm/v5`
* [x] `linux/arm/v7`
* [x] `linux/arm64/v8`
* [x] `linux/ppc64le`
* [x] `linux/s390x`

== Prerequisites
To work properly, the container provided here needs access to DBus. The current workaround is to run it as a privileged Docker container and map the host socket into it to gain access to DBus (see the launch script below).
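
Before launching the container, it can help to confirm that the host actually exposes the DBus socket that will be mapped. The following is a minimal sketch, not part of the project: the helper name `dbus_socket_available` is hypothetical, and the path is the one used in the launch script of this README.

```python
import os
import stat

# Socket path taken from the launch script in this README; adjust if your host differs.
DBUS_SOCKET = "/var/run/dbus/system_bus_socket"


def dbus_socket_available(path: str = DBUS_SOCKET) -> bool:
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing path, permission problem, etc.
        return False


if __name__ == "__main__":
    if dbus_socket_available():
        print("DBus socket found")
    else:
        print("DBus socket missing")
```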

== How to get started
Clone this repository:
```bash
git clone https://github.com/matbgn/auto-scan.git
```

Shallow-clone the scan-pdf repository adapted for HP scanners:
```bash
cd auto-scan
git clone --depth=1 https://github.com/matbgn/scan-pdf.git
```

Build the Docker image locally, or run it in development mode (see below):
```bash
docker build -t matbgn/auto_scan .
```

=== Build for other platforms
_source: https://medium.com/@artur.klauser/building-multi-architecture-docker-images-with-buildx-27d80f7e2408_

. Make sure your Linux machine runs kernel 4.8 or later

    $ uname -r
    4.15.0

. Make sure your Docker version is 19.03 or later

    $ docker --version
    Docker version 19.03.5, build 633a0ea83

. Make sure buildx is enabled (if not, run `export DOCKER_CLI_EXPERIMENTAL=enabled`)

    $ docker buildx version
    github.com/docker/buildx v0.6.3-docker 266c0eac611d64fcc0c72d80206aa364e826758d

. Use a Docker-image-based installation to keep QEMU and binfmt-support up to date

    $ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

NOTE: You will need to re-run that Docker image after every system reboot.
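
The version requirements above (kernel 4.8 or later, Docker 19.03 or later) can be checked programmatically. This is a small sketch rather than anything shipped with the project; the helper names `parse_version` and `meets_minimum` are hypothetical, and the sample strings are the outputs shown in the steps above.

```python
import re


def parse_version(text: str) -> tuple:
    """Extract the first dotted number in `text` as a tuple of ints.

    Works on raw command output such as 'Docker version 19.03.5, build ...'
    or '4.15.0-72-generic'.
    """
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_text: str, minimum: tuple) -> bool:
    """Return True if the version found in `version_text` is >= `minimum`."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    # Sample outputs copied from the steps above.
    print(meets_minimum("4.15.0", (4, 8)))
    print(meets_minimum("Docker version 19.03.5, build 633a0ea83", (19, 3)))
```

In practice the input strings would come from `uname -r` and `docker --version`, for example captured with `subprocess.run(..., capture_output=True, text=True)`.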

[start=5]
. Create a buildx builder

    $ docker buildx create --name mybuilder
    mybuilder
    $ docker buildx use mybuilder

.. You can check your newly created `mybuilder` with:

    $ docker buildx ls
    NAME/NODE    DRIVER/ENDPOINT             STATUS  PLATFORMS
    mybuilder *  docker-container
      mybuilder0 unix:///var/run/docker.sock running linux/amd64, linux/arm64, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
    default      docker
      default    default                     running linux/amd64, linux/386, linux/arm64, linux/riscv64, linux/ppc64le, linux/s390x, linux/arm/v7, linux/arm/v6

WARNING: If it only reports support for `linux/amd64` and `linux/386`, either you still have not met all software requirements, or you created the builder before meeting them. In the latter case, remove it with `docker buildx rm` and recreate it.
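
The check described in the warning can be automated by parsing the `docker buildx ls` output. The sketch below assumes the column layout shown in the listing above; the function name `platforms_by_node` is hypothetical, and newer buildx releases may format the listing differently.

```python
def platforms_by_node(buildx_ls_output: str) -> dict:
    """Map each node name to the set of platforms listed on its line.

    Lines without any 'linux/...' entry (header, builder summary lines)
    are skipped.
    """
    result = {}
    for line in buildx_ls_output.splitlines():
        tokens = line.replace(",", " ").split()
        # Some buildx versions append '*' to a platform name; strip it.
        platforms = {t.rstrip("*") for t in tokens if t.startswith("linux/")}
        if platforms:
            result[tokens[0]] = platforms
    return result


if __name__ == "__main__":
    sample = """\
NAME/NODE    DRIVER/ENDPOINT             STATUS  PLATFORMS
mybuilder *  docker-container
  mybuilder0 unix:///var/run/docker.sock running linux/amd64, linux/arm64, linux/386
"""
    nodes = platforms_by_node(sample)
    # True when the builder can target the platform used in the build step.
    print("linux/arm64" in nodes.get("mybuilder0", set()))
```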

[start=6]
. Finally, build the image locally and export it as a tar archive to be loaded later on the target machine

    docker buildx build -t matbgn/auto_scan --platform linux/arm64 -o type=docker,dest=- . > auto_scan_arm64.tar

.. To load the image on the target machine, run

    docker load < auto_scan_arm64.tar

... Then double-check the list of your pulled images with `docker images`

== Usage

Set the printer address in the following format:
```
PRINTER_ADDRESS="hpaio:/net/OfficeJet_Pro_7740_series?ip=192.168.14.113"
```

Supported scanning modes:
```
# ADF|Duplex|Flatbed
SCAN_MODE=ADF
```

Run multiple batches and merge them into a single PDF:
```
# This argument is optional
BATCH_TOTAL=2
```

Change the paper format used for scanning (A3, A4 and A5 are supported):
```
# This argument is optional (A4 is selected by default)
PAPER_FORMAT="A3"
```
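
The options above could be validated along these lines before a scan starts. This is a sketch and not the project's actual code: the function name `read_scan_settings` is hypothetical, treating `SCAN_MODE` and `PRINTER_ADDRESS` as required is an assumption, and so is the default of `1` for `BATCH_TOTAL`. Only the A4 default is stated in this README.

```python
import os

SCAN_MODES = {"ADF", "Duplex", "Flatbed"}
PAPER_FORMATS = {"A3", "A4", "A5"}


def read_scan_settings(env=None) -> dict:
    """Read and validate the scan options from a mapping (default: os.environ)."""
    env = os.environ if env is None else env

    printer = env.get("PRINTER_ADDRESS", "")
    if not printer:
        raise ValueError("PRINTER_ADDRESS must be set")  # assumed to be required

    mode = env.get("SCAN_MODE", "")
    if mode not in SCAN_MODES:
        raise ValueError(f"SCAN_MODE must be one of {sorted(SCAN_MODES)}, got {mode!r}")

    paper = env.get("PAPER_FORMAT", "A4")  # A4 is the documented default
    if paper not in PAPER_FORMATS:
        raise ValueError(f"PAPER_FORMAT must be one of {sorted(PAPER_FORMATS)}, got {paper!r}")

    batch_total = int(env.get("BATCH_TOTAL", "1"))  # default of 1 is an assumption
    if batch_total < 1:
        raise ValueError("BATCH_TOTAL must be at least 1")

    return {
        "printer_address": printer,
        "scan_mode": mode,
        "paper_format": paper,
        "batch_total": batch_total,
    }
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to try out, for example `read_scan_settings({"SCAN_MODE": "ADF", "PRINTER_ADDRESS": "hpaio:/net/OfficeJet_Pro_7740_series?ip=192.168.14.113"})`.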

Run the script:
```bash
sudo docker run -it --rm --net=host --privileged \
-v /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket \
-e [email protected] \
-e SMTPSERVER_PASS=your_gmail_app_secret \
-e EMAIL_RECIPIENTS="[email protected];[email protected]" \
-e SCAN_MODE=ADF \
-e SUBJECT="Test scan" \
-e BATCH_TOTAL=1 \
-e PAPER_FORMAT="A4" \
-e PRINTER_ADDRESS="hpaio:/net/OfficeJet_Pro_7740_series?ip=192.168.14.113" \
matbgn/auto_scan
```
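
When the launch command is generated from a script, building it as an argument list avoids shell-quoting problems, such as the `;` separator in `EMAIL_RECIPIENTS`. The sketch below mirrors the flags of the command above; the function name `docker_run_command` and the example addresses are hypothetical placeholders, and `sudo` is left out.

```python
import shlex

# Socket path and image name as used in the command above.
DBUS_SOCKET = "/var/run/dbus/system_bus_socket"


def docker_run_command(settings: dict, image: str = "matbgn/auto_scan") -> list:
    """Build the `docker run` argument list, one `-e NAME=value` pair per setting."""
    args = [
        "docker", "run", "-it", "--rm", "--net=host", "--privileged",
        "-v", f"{DBUS_SOCKET}:{DBUS_SOCKET}",
    ]
    for name, value in settings.items():
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


if __name__ == "__main__":
    command = docker_run_command({
        "SCAN_MODE": "ADF",
        "PAPER_FORMAT": "A4",
        # Placeholder addresses, not taken from the project.
        "EMAIL_RECIPIENTS": "alice@example.com;bob@example.com",
    })
    # Print a copy-pasteable, shell-quoted version of the command.
    print(" ".join(shlex.quote(arg) for arg in command))
```

The resulting list can be handed to `subprocess.run(command)` directly, in which case no shell is involved and no quoting is needed.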

== Development mode

. Make sure the following packages are installed:

    sudo apt-get install -y sane-utils libsane-hpaio \
    imagemagick ocrmypdf \
    tesseract-ocr-fra tesseract-ocr-deu

. (Optionally, inside a virtual environment) install the requirements with:

    pip install -r requirements.txt

. Fetch the scan-pdf dependency:

    cd auto-scan
    git clone --depth=1 https://github.com/matbgn/scan-pdf.git
    cd ..

. Put your variables in a `.env` file; the available options are:

    [email protected]
    SMTPSERVER_PASS=your_gmail_app_secret
    SMTPSERVER_HOST=smtp.gmail.com
    EMAIL_RECIPIENTS="[email protected];[email protected]"
    SCAN_MODE=Flatbed
    SUBJECT="Test scan"
    BATCH_TOTAL=1
    PAPER_FORMAT="A4"
    PRINTER_ADDRESS="hpaio:/net/OfficeJet_Pro_7740_series?ip=192.168.14.113"
    PAPERLESS_LOCATION="~/paperless/consume"

. Then run the script directly with:

    python ./main.py
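
To illustrate how a `.env` file like the one in the steps above is typically interpreted, here is a minimal standard-library sketch. It is not the loader used by `main.py`, which may rely on a dedicated package; the function name `parse_env_file` is hypothetical.

```python
import os


def parse_env_file(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '=' themselves
        # (for example the '?ip=' part of PRINTER_ADDRESS).
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    sample = (
        'SCAN_MODE=Flatbed\n'
        'SUBJECT="Test scan"\n'
        'PRINTER_ADDRESS="hpaio:/net/OfficeJet_Pro_7740_series?ip=192.168.14.113"\n'
        'PAPERLESS_LOCATION="~/paperless/consume"\n'
    )
    settings = parse_env_file(sample)
    # A quoted '~' is not expanded by a shell, so expand it explicitly.
    print(os.path.expanduser(settings["PAPERLESS_LOCATION"]))
```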

=== In case of error

If you encounter the following error, proceed as described below:

[WARNING]
=====================
*Error during converting jpg to pdf*
convert-im6.q16: attempt to perform an operation not allowed by the security policy `PDF' @ error/constitute.c/IsCoderAuthorized/408.
=====================

As a temporary fix, edit `/etc/ImageMagick-6/policy.xml` and change the rights of the PDF policy, near the end of the file, from _none_ to _read|write_.
Or simply run the following (root privileges are needed to modify the file):
```
sed -i 's/rights="none" pattern="PDF"/rights="read|write" pattern="PDF"/' /etc/ImageMagick-6/policy.xml
```

The security implications of this change have not been assessed here, but it at least gets the conversion working.
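
To see what the substitution does without touching the real file, the same replacement can be tried on a sample string. This is only an illustration: the function name `allow_pdf` is hypothetical, and the sample policy lines are assumed to match what a Debian-style `policy.xml` typically contains.

```python
def allow_pdf(policy_xml: str) -> str:
    """Apply the same literal replacement as the sed command above."""
    return policy_xml.replace(
        'rights="none" pattern="PDF"',
        'rights="read|write" pattern="PDF"',
    )


if __name__ == "__main__":
    # Assumed sample lines; check your own policy.xml for the exact content.
    sample = (
        '<policy domain="coder" rights="none" pattern="PS" />\n'
        '<policy domain="coder" rights="none" pattern="PDF" />\n'
    )
    # Only the PDF line changes; other coders stay restricted.
    print(allow_pdf(sample))
```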

== Deploy on a server

Follow the first two steps of the Development mode section above on your server (or just locally).

Then run the following (on any port you like):

    waitress-serve --port=4040 app:app &

To run it as a systemd service:

_See https://www.devdungeon.com/content/creating-systemd-service-files_

```
# /etc/systemd/system/auto_scan.service
[Unit]
Description=Auto scan server
After=network.target

[Service]
Type=simple
User=USER_TO_BE_USED
WorkingDirectory=/home/USER_TO_BE_USED/auto-scan
ExecStart=/path/to/venv/bin/waitress-serve --port=4040 app:app
Restart=always

[Install]
WantedBy=multi-user.target
```
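
Once the service is started, a quick way to confirm that the server accepts connections is to try opening a TCP connection to its port. The following sketch assumes the port `4040` from the unit file above; the function name `is_listening` is hypothetical.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 4040, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, etc.
        return False


if __name__ == "__main__":
    if is_listening():
        print("auto_scan server reachable")
    else:
        print("auto_scan server not reachable")
```

This only shows that something is listening on the port; it does not verify that the application behind it responds correctly.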