Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/janis91/ocr
Nextcloud OCR (optical character recognition) processing for images with tesseract-js
nextcloud nextcloud-app nextcloud-ocr ocr-processing tesseract-js tesseract-ocr
- Host: GitHub
- URL: https://github.com/janis91/ocr
- Owner: janis91
- License: agpl-3.0
- Created: 2016-07-14T08:31:28.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2024-04-14T02:05:27.000Z (8 months ago)
- Last Synced: 2024-04-14T17:24:50.173Z (8 months ago)
- Topics: nextcloud, nextcloud-app, nextcloud-ocr, ocr-processing, tesseract-js, tesseract-ocr
- Language: JavaScript
- Homepage:
- Size: 3.38 MB
- Stars: 105
- Watchers: 8
- Forks: 17
- Open Issues: 22
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Authors: AUTHORS.md
Awesome Lists containing this project
- awesome-nextcloud - ocr - OCR processing for images and PDF in NC (Apps / Unofficial)
- awesome-ocr - Nextcloud OCR (optical character recognition) processing for images and PDF with tesseract-ocr, OCRmyPDF and php-native message queueing for asynchronous purpose. http://janis91.github.io/ocr/
README
# OCR
[![Build Status](https://travis-ci.org/janis91/ocr.svg?branch=master)](https://travis-ci.org/janis91/ocr) [![Total alerts](https://img.shields.io/lgtm/alerts/g/janis91/ocr.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/janis91/ocr/alerts/) [![Codacy Badge](https://api.codacy.com/project/badge/Grade/96e643bf329d473e9968b20ba4f11a50)](https://www.codacy.com/app/janis91/ocr?utm_source=github.com&utm_medium=referral&utm_content=janis91/ocr&utm_campaign=Badge_Grade) [![Codacy Badge](https://api.codacy.com/project/badge/Coverage/96e643bf329d473e9968b20ba4f11a50)](https://www.codacy.com/manual/janis91/ocr?utm_source=github.com&utm_medium=referral&utm_content=janis91/ocr&utm_campaign=Badge_Coverage) [![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](http://www.gnu.org/licenses/agpl-3.0)

Nextcloud OCR (optical character recognition) processing for images with tesseract-js brings OCR capability to your Nextcloud.
The app uses [tesseract-js](https://tesseract.projectnaptha.com/) by [@jeromewu](https://github.com/jeromewu) in the browser to process images (png, jpeg, tiff, bmp) and saves the output PDF file to the source folder in Nextcloud. This makes it possible, for example, to search within the PDF.

## Prerequisites, Requirements and Dependencies
The OCR app has some prerequisites:
- [Nextcloud 16 and up](https://nextcloud.com/)
- A recent version of a modern web browser (Chrome, Edge, Firefox, Opera, Safari*)
- Tesseract traineddata needs about 200 MB of space on your server (it will be installed automatically).

_* On Safari there is currently a problem with the [Content-Security-Policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP) that requires an administrator to set 'script-src' to 'unsafe-eval' for the app to work properly. Because this is quite insecure, the app does not set it itself; decide at your own risk whether to enable it (please make sure that you know what CSP is and what e.g. unsafe-eval causes)._
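Before changing anything for Safari, an administrator may want to see whether the policy the server currently sends already permits `'unsafe-eval'` for scripts. The following is a minimal sketch of such a check, not part of the OCR app: `allowsUnsafeEval` is a hypothetical helper name, and it only looks at `script-src` with a fallback to `default-src`, ignoring other CSP features.

```javascript
// Hypothetical helper (not part of the OCR app): reports whether a
// Content-Security-Policy header value permits 'unsafe-eval' for scripts.
// Simplified: only script-src and its default-src fallback are considered.
function allowsUnsafeEval(cspHeader) {
  const directives = new Map();
  for (const part of cspHeader.split(';')) {
    const tokens = part.trim().split(/\s+/).filter(Boolean);
    if (tokens.length === 0) continue; // skip empty segments
    const name = tokens[0].toLowerCase();
    // When a directive is repeated, only its first occurrence is used.
    if (!directives.has(name)) directives.set(name, tokens.slice(1));
  }
  // script-src falls back to default-src when it is not present.
  const sources =
    directives.get('script-src') ?? directives.get('default-src') ?? [];
  return sources.some((source) => source.toLowerCase() === "'unsafe-eval'");
}

// Example header values (made up for illustration):
console.log(allowsUnsafeEval("default-src 'self'; script-src 'self'")); // false
console.log(allowsUnsafeEval("script-src 'self' 'unsafe-eval'")); // true
```

The header value itself can be copied from the browser's developer tools (network tab, response headers of any Nextcloud page) and pasted into the function as a string.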
## Installation
Install the app from the [Nextcloud AppStore](https://apps.nextcloud.com/apps/ocr) or download the release package from GitHub (**NOT** the sources) and place the content in **nextcloud/apps/ocr/**.

## Disclaimer
The software is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND, either express or implied.

## Note
Version 3 and earlier versions are no longer supported or maintained by the author. For asynchronous background processing, please fork the repository and use the "not-maintained" branch to work on improvements. The author was unable to keep supporting it because it required too much effort.
Moreover, this project is based on a WebAssembly port of Tesseract. The maintainer has stopped working on PDF processing in this app and will start working on a separate app for PDF handling.