{"id":23854056,"url":"https://github.com/lewangdev/paddlewebocr","last_synced_at":"2026-05-26T07:30:17.278Z","repository":{"id":44781688,"uuid":"431872806","full_name":"lewangdev/PaddleWebOCR","owner":"lewangdev","description":"Open-source offline OCR for Chinese and English, built on PaddleOCR, with a simple web page and API","archived":false,"fork":false,"pushed_at":"2022-05-24T02:55:23.000Z","size":13892,"stargazers_count":119,"open_issues_count":1,"forks_count":30,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-01-02T23:51:55.580Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Vue","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lewangdev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-11-25T14:23:41.000Z","updated_at":"2024-12-17T02:52:46.000Z","dependencies_parsed_at":"2022-08-12T11:21:52.052Z","dependency_job_id":null,"html_url":"https://github.com/lewangdev/PaddleWebOCR","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lewangdev%2FPaddleWebOCR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lewangdev%2FPaddleWebOCR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lewangdev%2FPaddleWebOCR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lewangdev%2FPaddleWebOCR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lewangdev","download_url":"https://codeload.github.com/lewangdev/PaddleWebOCR/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kin
d":"github","repositories_count":240145107,"owners_count":19755017,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-02T23:51:38.511Z","updated_at":"2026-05-26T07:30:17.142Z","avatar_url":"https://github.com/lewangdev.png","language":"Vue","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PaddleWebOCR\n\nAn open-source offline OCR system for Chinese and English, built on [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), that ships with a simple web page and a RESTful API.\n\n## Introduction\n\n**This project uses the open-source PaddleOCR and bundles several models, so it can run fully offline; PaddleOCR is also well documented, which makes it easier to train your own models. PaddleOCR itself supports many languages, including Simplified and Traditional Chinese, English, and Korean, but this project bundles only the Chinese (Simplified and Traditional) and English models. To recognize other languages, adapt the models following this project's setup.**\n\n\n![Web page](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/webui.png?raw=true)  \n\n\n## Features\n\n* Multilingual recognition, including Simplified/Traditional Chinese and English\n\n## Requirements  \n \n### Supported platforms  \n\n* ✔ Python 3.7+  \n* ✔ Windows 10/11\n* ✔ CentOS 7   \n* ✔ macOS Big Sur \n* ✔ Docker   \n\nOn Windows, CentOS, and macOS, the project runs directly once the dependencies are installed. Only the CPU version of paddlepaddle has been built so far; GPU is not supported. You can also build a Docker image yourself or pull a prebuilt image from DockerHub.\n\n### Minimum hardware  \n\n* CPU:    2 cores  \n* Memory:    4GB  \n\n## Installation  \n\n### Server deployment\n\n0. After installing nodejs, enter the webui directory and build the front end\n\n```\ncd webui\nnpm install\nnpm run build\n```\n\n1. Install python 3.7  \n    \n2. Install the dependencies  \n\n``` shell script\npip install -r requirements.txt\n```  \n\n3. 
Run the server; the project listens on port 8080 by default:  \n\n``` shell script\nuvicorn paddlewebocr.main:app --host 0.0.0.0 --port 8080\n\n\n# Or\n\nPYTHONPATH=\"${PYTHONPATH}:.\" python paddlewebocr/main.py [--port=8080]\n\n```\n\n### Docker deployment  \n\n\nPulling and running the image from DockerHub is recommended\n\n```shell script\ndocker run -d -p 8080:8080 -v ${PWD}/logs:/app/logs --name paddlewebocr lewangdev/paddlewebocr:latest\n```  \n\nAlternatively, build a local image with the script (GCC has to be compiled, so the whole build takes a very long time)\n\n```shell script\n# Build from the Dockerfile\n./build-docker-image.sh\n\n# Run the image\ndocker run -d -p 8080:8080 -v ${PWD}/logs:/app/logs --name paddlewebocr paddlewebocr:latest \n```  \n  \n\n## API usage examples  \n\n* Python: upload a file  \n\n``` python\nimport requests\n\nurl = 'http://127.0.0.1:8080/api/ocr'\n# Open the image in binary mode; the context manager closes it after the request\nwith open('img1.png', 'rb') as f:\n    res = requests.post(url=url, data={'compress': 0}, files={'img_upload': f})\n```  \n\n* Python: send Base64  \n\n``` python\nimport requests\nimport base64\n\n\ndef img_to_base64(img_path):\n    # Read the image in binary mode and return its Base64-encoded bytes\n    with open(img_path, 'rb') as f:\n        return base64.b64encode(f.read())\n\n\nurl = 'http://127.0.0.1:8080/api/ocr'\nimg_b64 = img_to_base64('./img1.png')\nres = requests.post(url=url, data={'img_b64': img_b64})\n\n```\n\n## Results  \n\n![English document recognition](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/doc-1.png?raw=true)  \n\n![Chinese document recognition](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/doc-2.png?raw=true)  
\n\n![CAPTCHA recognition](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/verifycode-1.png?raw=true)\n\n![CAPTCHA recognition](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/verifycode-2.png?raw=true)\n\n![Train ticket](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/train-ticket-1.png?raw=true)\n\n![Train ticket](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/train-ticket-2.png?raw=true)\n\n![Invoice](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/fapiao-1.png?raw=true)\n\n![ID card](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/idcard-1.png?raw=true)\n\n![Poster](https://github.com/lewangdev/PaddleWebOCR/blob/main/images/haibao-1.png?raw=true)\n\n## Changelog  \n\n[View the changelog](https://github.com/lewangdev/PaddleWebOCR/releases)\n\n\n## Acknowledgements\n\nThis project draws on [TrWebOCR](https://github.com/alisen39/TrWebOCR). Because TrWebOCR needs a network connection at startup and there is little documentation for the [Tr](https://github.com/myhub/tr) library it uses, this project was started as an attempt to replace Tr with [paddlepaddle](https://github.com/PaddlePaddle/Paddle) and [paddleocr](https://github.com/PaddlePaddle/PaddleOCR).\n\n\n## License  \n\nApache 2.0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flewangdev%2Fpaddlewebocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flewangdev%2Fpaddlewebocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flewangdev%2Fpaddlewebocr/lists"}