{"id":21803149,"url":"https://github.com/litongjava/ppocr-tool","last_synced_at":"2025-10-28T07:15:24.983Z","repository":{"id":195651832,"uuid":"693039341","full_name":"litongjava/ppocr-tool","owner":"litongjava","description":"ppocr-tool","archived":false,"fork":false,"pushed_at":"2023-10-20T13:39:32.000Z","size":4185,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-26T04:09:50.932Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/litongjava.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-18T08:17:28.000Z","updated_at":"2023-09-18T08:17:41.000Z","dependencies_parsed_at":"2023-10-20T14:23:04.868Z","dependency_job_id":null,"html_url":"https://github.com/litongjava/ppocr-tool","commit_stats":null,"previous_names":["litongjava/ppocr-tool"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/litongjava%2Fppocr-tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/litongjava%2Fppocr-tool/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/litongjava%2Fppocr-tool/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/litongjava%2Fppocr-tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/litongjava","download_url":"https://codeload.github.com/litongjava/ppocr-tool/tar.gz/refs/head
s/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244752362,"owners_count":20504256,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-27T11:39:03.444Z","updated_at":"2025-10-28T07:15:24.878Z","avatar_url":"https://github.com/litongjava.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ppocr-tool\n## Introduction\nppocr-tool is a command-line tool based on PaddlePaddleOCR that makes the PaddlePaddleOCR commands easier to use.\n## Install with pip\npip install ppocr-tool\n\n## Install from source\n### Prepare the environment\n```\nconda create --name paddle_ocr_cpu_env python=3.8 -y\nactivate paddle_ocr_cpu_env \nor\nconda activate paddle_ocr_cpu_env \n```\n```\npip install paddlepaddle -i https://mirror.baidu.com/pypi/simple\npip install \"paddleocr\u003e=2.0.1\" -i https://mirror.baidu.com/pypi/simple\n```\n### Install locally\n```\npip install .\n```\n## Recognition\nRecognize an image\nWindows\n```\nppocrtool -ot=text -src doc\\imgs\\11.jpg --lang ch\n```\nUnix\n```\nppocrtool -ot=text -src doc/imgs/11.jpg --lang ch\n```\n\nRecognize a PDF\nWindows\n```\nppocrtool -ot=text -src doc\\pdfs\\lab_1.pdf --lang ch\n```\nUnix\n```\nppocrtool -ot=text -src doc/pdfs/lab_1.pdf --lang ch\n```\nValues for `--lang`:\n- ch, Chinese\n- en, English\n- fr, French\n- german\n- korean\n- japan\n\n## Recognize all files in a directory and write the results to new files\n```\nppocrtool --image_dir doc/pdfs --output output_dir\n```\n\n## Models\nModels are downloaded automatically on first run.\nChinese models\n- https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_infer.tar\n- https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_rec_infer.tar\n- 
https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar\n\n\nEnglish model\n- https://paddleocr.bj.bcebos.com/PP-OCRv4/english/en_PP-OCRv4_rec_infer.tar\n\nModel storage location\n- Windows: C:\\Users\\Administrator\\.paddleocr\\whl\n- macOS: /Users/ping/.paddleocr/whl\n\n## Build a binary\nBuilding and running the binary has only succeeded on Windows.\nBuild the binary\n```\npython build.yml\n```\nTest run (test 1)\n```\ndist\\ppocr -ot=text -src doc\\imgs\\11.jpg --lang ch\n```\n\n## View help\nSupported parameters\n```\nppocrtool --h\n```\n\n```\nusage: ppocrtool.exe [-h] [--use_gpu USE_GPU] [--use_xpu USE_XPU] [--use_npu USE_NPU] [--ir_optim IR_OPTIM] [--use_tensorrt USE_TENSORRT] [--min_subgraph_size MIN_SUBGRAPH_SIZE] [--precision PRECISION]\n                 [--gpu_mem GPU_MEM] [--gpu_id GPU_ID] [--image_dir IMAGE_DIR] [--page_num PAGE_NUM] [--det_algorithm DET_ALGORITHM] [--det_model_dir DET_MODEL_DIR] [--det_limit_side_len DET_LIMIT_SIDE_LEN]\n                 [--det_limit_type DET_LIMIT_TYPE] [--det_box_type DET_BOX_TYPE] [--det_db_thresh DET_DB_THRESH] [--det_db_box_thresh DET_DB_BOX_THRESH] [--det_db_unclip_ratio DET_DB_UNCLIP_RATIO]\n                 [--max_batch_size MAX_BATCH_SIZE] [--use_dilation USE_DILATION] [--det_db_score_mode DET_DB_SCORE_MODE] [--det_east_score_thresh DET_EAST_SCORE_THRESH]\n                 [--det_east_cover_thresh DET_EAST_COVER_THRESH] [--det_east_nms_thresh DET_EAST_NMS_THRESH] [--det_sast_score_thresh DET_SAST_SCORE_THRESH] [--det_sast_nms_thresh DET_SAST_NMS_THRESH]\n                 [--det_pse_thresh DET_PSE_THRESH] [--det_pse_box_thresh DET_PSE_BOX_THRESH] [--det_pse_min_area DET_PSE_MIN_AREA] [--det_pse_scale DET_PSE_SCALE] [--scales SCALES] [--alpha ALPHA]\n                 [--beta BETA] [--fourier_degree FOURIER_DEGREE] [--rec_algorithm REC_ALGORITHM] [--rec_model_dir REC_MODEL_DIR] [--rec_image_inverse REC_IMAGE_INVERSE] [--rec_image_shape REC_IMAGE_SHAPE]\n                 [--rec_batch_num REC_BATCH_NUM] [--max_text_length MAX_TEXT_LENGTH] [--rec_char_dict_path REC_CHAR_DICT_PATH] 
[--use_space_char USE_SPACE_CHAR] [--vis_font_path VIS_FONT_PATH]\n                 [--drop_score DROP_SCORE] [--e2e_algorithm E2E_ALGORITHM] [--e2e_model_dir E2E_MODEL_DIR] [--e2e_limit_side_len E2E_LIMIT_SIDE_LEN] [--e2e_limit_type E2E_LIMIT_TYPE]\n                 [--e2e_pgnet_score_thresh E2E_PGNET_SCORE_THRESH] [--e2e_char_dict_path E2E_CHAR_DICT_PATH] [--e2e_pgnet_valid_set E2E_PGNET_VALID_SET] [--e2e_pgnet_mode E2E_PGNET_MODE]\n                 [--use_angle_cls USE_ANGLE_CLS] [--cls_model_dir CLS_MODEL_DIR] [--cls_image_shape CLS_IMAGE_SHAPE] [--label_list LABEL_LIST] [--cls_batch_num CLS_BATCH_NUM] [--cls_thresh CLS_THRESH]\n                 [--enable_mkldnn ENABLE_MKLDNN] [--cpu_threads CPU_THREADS] [--use_pdserving USE_PDSERVING] [--warmup WARMUP] [--sr_model_dir SR_MODEL_DIR] [--sr_image_shape SR_IMAGE_SHAPE]\n                 [--sr_batch_num SR_BATCH_NUM] [--draw_img_save_dir DRAW_IMG_SAVE_DIR] [--save_crop_res SAVE_CROP_RES] [--crop_res_save_dir CROP_RES_SAVE_DIR] [--use_mp USE_MP]\n                 [--total_process_num TOTAL_PROCESS_NUM] [--process_id PROCESS_ID] [--benchmark BENCHMARK] [--save_log_path SAVE_LOG_PATH] [--show_log SHOW_LOG] [--use_onnx USE_ONNX] [--output OUTPUT]\n                 [--table_max_len TABLE_MAX_LEN] [--table_algorithm TABLE_ALGORITHM] [--table_model_dir TABLE_MODEL_DIR] [--merge_no_span_structure MERGE_NO_SPAN_STRUCTURE]\n                 [--table_char_dict_path TABLE_CHAR_DICT_PATH] [--layout_model_dir LAYOUT_MODEL_DIR] [--layout_dict_path LAYOUT_DICT_PATH] [--layout_score_threshold LAYOUT_SCORE_THRESHOLD]\n                 [--layout_nms_threshold LAYOUT_NMS_THRESHOLD] [--kie_algorithm KIE_ALGORITHM] [--ser_model_dir SER_MODEL_DIR] [--re_model_dir RE_MODEL_DIR] [--use_visual_backbone USE_VISUAL_BACKBONE]\n                 [--ser_dict_path SER_DICT_PATH] [--ocr_order_method OCR_ORDER_METHOD] [--mode {structure,kie}] [--image_orientation IMAGE_ORIENTATION] [--layout LAYOUT] [--table TABLE] [--ocr OCR]\n                 
[--recovery RECOVERY] [--use_pdf2docx_api USE_PDF2DOCX_API] [--invert INVERT] [--binarize BINARIZE] [--alphacolor ALPHACOLOR] [--lang LANG] [--det DET] [--rec REC] [--type TYPE]\n                 [--ocr_version {PP-OCR,PP-OCRv2,PP-OCRv3,PP-OCRv4}] [--structure_version {PP-Structure,PP-StructureV2}]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --use_gpu USE_GPU\n  --use_xpu USE_XPU\n  --use_npu USE_NPU\n  --ir_optim IR_OPTIM\n  --use_tensorrt USE_TENSORRT\n  --min_subgraph_size MIN_SUBGRAPH_SIZE\n  --precision PRECISION\n  --gpu_mem GPU_MEM\n  --gpu_id GPU_ID\n  --image_dir IMAGE_DIR\n  --page_num PAGE_NUM\n  --det_algorithm DET_ALGORITHM\n  --det_model_dir DET_MODEL_DIR\n  --det_limit_side_len DET_LIMIT_SIDE_LEN\n  --det_limit_type DET_LIMIT_TYPE\n  --det_box_type DET_BOX_TYPE\n  --det_db_thresh DET_DB_THRESH\n  --det_db_box_thresh DET_DB_BOX_THRESH\n  --det_db_unclip_ratio DET_DB_UNCLIP_RATIO\n  --max_batch_size MAX_BATCH_SIZE\n  --use_dilation USE_DILATION\n  --det_db_score_mode DET_DB_SCORE_MODE\n  --det_east_score_thresh DET_EAST_SCORE_THRESH\n  --det_east_cover_thresh DET_EAST_COVER_THRESH\n  --det_east_nms_thresh DET_EAST_NMS_THRESH\n  --det_sast_score_thresh DET_SAST_SCORE_THRESH\n  --det_sast_nms_thresh DET_SAST_NMS_THRESH\n  --det_pse_thresh DET_PSE_THRESH\n  --det_pse_box_thresh DET_PSE_BOX_THRESH\n  --det_pse_min_area DET_PSE_MIN_AREA\n  --det_pse_scale DET_PSE_SCALE\n  --scales SCALES\n  --alpha ALPHA\n  --beta BETA\n  --fourier_degree FOURIER_DEGREE\n  --rec_algorithm REC_ALGORITHM\n  --rec_model_dir REC_MODEL_DIR\n  --rec_image_inverse REC_IMAGE_INVERSE\n  --rec_image_shape REC_IMAGE_SHAPE\n  --rec_batch_num REC_BATCH_NUM\n  --max_text_length MAX_TEXT_LENGTH\n  --rec_char_dict_path REC_CHAR_DICT_PATH\n  --use_space_char USE_SPACE_CHAR\n  --vis_font_path VIS_FONT_PATH\n  --drop_score DROP_SCORE\n  --e2e_algorithm E2E_ALGORITHM\n  --e2e_model_dir E2E_MODEL_DIR\n  --e2e_limit_side_len E2E_LIMIT_SIDE_LEN\n  
--e2e_limit_type E2E_LIMIT_TYPE\n  --e2e_pgnet_score_thresh E2E_PGNET_SCORE_THRESH\n  --e2e_char_dict_path E2E_CHAR_DICT_PATH\n  --e2e_pgnet_valid_set E2E_PGNET_VALID_SET\n  --e2e_pgnet_mode E2E_PGNET_MODE\n  --use_angle_cls USE_ANGLE_CLS\n  --cls_model_dir CLS_MODEL_DIR\n  --cls_image_shape CLS_IMAGE_SHAPE\n  --label_list LABEL_LIST\n  --cls_batch_num CLS_BATCH_NUM\n  --cls_thresh CLS_THRESH\n  --enable_mkldnn ENABLE_MKLDNN\n  --cpu_threads CPU_THREADS\n  --use_pdserving USE_PDSERVING\n  --warmup WARMUP\n  --sr_model_dir SR_MODEL_DIR\n  --sr_image_shape SR_IMAGE_SHAPE\n  --sr_batch_num SR_BATCH_NUM\n  --draw_img_save_dir DRAW_IMG_SAVE_DIR\n  --save_crop_res SAVE_CROP_RES\n  --crop_res_save_dir CROP_RES_SAVE_DIR\n  --use_mp USE_MP\n  --total_process_num TOTAL_PROCESS_NUM\n  --process_id PROCESS_ID\n  --benchmark BENCHMARK\n  --save_log_path SAVE_LOG_PATH\n  --show_log SHOW_LOG\n  --use_onnx USE_ONNX\n  --output OUTPUT\n  --table_max_len TABLE_MAX_LEN\n  --table_algorithm TABLE_ALGORITHM\n  --table_model_dir TABLE_MODEL_DIR\n  --merge_no_span_structure MERGE_NO_SPAN_STRUCTURE\n  --table_char_dict_path TABLE_CHAR_DICT_PATH\n  --layout_model_dir LAYOUT_MODEL_DIR\n  --layout_dict_path LAYOUT_DICT_PATH\n  --layout_score_threshold LAYOUT_SCORE_THRESHOLD\n                        Threshold of score.\n  --layout_nms_threshold LAYOUT_NMS_THRESHOLD\n                        Threshold of nms.\n  --kie_algorithm KIE_ALGORITHM\n  --ser_model_dir SER_MODEL_DIR\n  --re_model_dir RE_MODEL_DIR\n  --use_visual_backbone USE_VISUAL_BACKBONE\n  --ser_dict_path SER_DICT_PATH\n  --ocr_order_method OCR_ORDER_METHOD\n  --mode {structure,kie}\n                        structure and kie is supported\n  --image_orientation IMAGE_ORIENTATION\n                        Whether to enable image orientation recognition\n  --layout LAYOUT       Whether to enable layout analysis\n  --table TABLE         In the forward, whether the table area uses table recognition\n  --ocr OCR             In the forward, 
whether the non-table area is recognition by ocr\n  --recovery RECOVERY   Whether to enable layout of recovery\n  --use_pdf2docx_api USE_PDF2DOCX_API\n                        Whether to use pdf2docx api\n  --invert INVERT       Whether to invert image before processing\n  --binarize BINARIZE   Whether to threshold binarize image before processing\n  --alphacolor ALPHACOLOR\n                        Replacement color for the alpha channel, if the latter is present; R,G,B integers\n  --lang LANG\n  --det DET\n  --rec REC\n  --type TYPE\n  --ocr_version {PP-OCR,PP-OCRv2,PP-OCRv3,PP-OCRv4}\n                        OCR Model version, the current model support list is as follows: 1. PP-OCRv4/v3 Support Chinese and English detection and recognition model, and direction classifier model2. PP-OCRv2\n                        Support Chinese detection and recognition model. 3. PP-OCR support Chinese detection, recognition and direction classifier and multilingual recognition model.\n  --structure_version {PP-Structure,PP-StructureV2}\n                        Model version, the current model support list is as follows: 1. PP-Structure Support en table structure model. 2. PP-StructureV2 Support ch and en table structure model. \n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flitongjava%2Fppocr-tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flitongjava%2Fppocr-tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flitongjava%2Fppocr-tool/lists"}