{"id":23406935,"url":"https://github.com/docsaidlab/capybara","last_synced_at":"2026-01-16T19:40:52.209Z","repository":{"id":268889218,"uuid":"905147331","full_name":"DocsaidLab/Capybara","owner":"DocsaidLab","description":"OpenCV and ONNX Runtime Inference Toolkit","archived":false,"fork":false,"pushed_at":"2025-02-11T06:15:41.000Z","size":20527,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-02-11T06:34:45.586Z","etag":null,"topics":["onnxruntime","opencv","python","toolbox"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DocsaidLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-18T08:57:30.000Z","updated_at":"2025-02-11T06:15:45.000Z","dependencies_parsed_at":"2024-12-19T14:53:27.046Z","dependency_job_id":null,"html_url":"https://github.com/DocsaidLab/Capybara","commit_stats":null,"previous_names":["docsaidlab/capybara"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DocsaidLab%2FCapybara","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DocsaidLab%2FCapybara/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DocsaidLab%2FCapybara/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DocsaidLab%2FCapybara/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DocsaidLab
","download_url":"https://codeload.github.com/DocsaidLab/Capybara/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238895192,"owners_count":19548550,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["onnxruntime","opencv","python","toolbox"],"created_at":"2024-12-22T14:16:18.956Z","updated_at":"2026-01-16T19:40:52.195Z","avatar_url":"https://github.com/DocsaidLab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"**[English](./README.md)** | [Chinese](./README_tw.md)\n\n# Capybara\n\n\u003cp align=\"left\"\u003e\n    \u003ca href=\"./LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-Apache%202-dfd.svg\"\u003e\u003c/a\u003e\n    \u003ca href=\"\"\u003e\u003cimg src=\"https://img.shields.io/badge/python-3.10+-aff.svg\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://github.com/DocsaidLab/Capybara/releases\"\u003e\u003cimg src=\"https://img.shields.io/github/v/release/DocsaidLab/Capybara?color=ffa\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://pypi.org/project/capybara-docsaid/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/capybara-docsaid.svg\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://pypi.org/project/capybara-docsaid/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/dm/capybara-docsaid?color=9cf\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n![title](https://raw.githubusercontent.com/DocsaidLab/Capybara/refs/heads/main/docs/title.webp)\n\n---\n\n## Introduction\n\nCapybara is designed with three goals:\n\n1. 
**Lightweight default install**: `pip install capybara-docsaid` installs only the core `utils/structures/vision` modules, without forcing heavy inference dependencies.\n2. **Inference backends as opt-in extras**: install ONNX Runtime / OpenVINO / TorchScript only when you need them via extras.\n3. **Lower risk**: enforce quality gates with ruff/pyright/pytest and target **90%** line coverage for the core codebase.\n\nWhat you get:\n\n- **Image tools** (`capybara.vision`): I/O, color conversion, resize/rotate/pad/crop, and video frame extraction.\n- **Geometry structures** (`capybara.structures`): `Box/Boxes`, `Polygon/Polygons`, `Keypoints`, plus helper functions like IoU.\n- **Inference wrappers (optional)**: `capybara.onnxengine` / `capybara.openvinoengine` / `capybara.torchengine`.\n- **Feature extras (optional)**: `visualization` (drawing tools), `ipcam` (simple web demo), `system` (system info tools).\n- **Utilities** (`capybara.utils`): `PowerDict`, `Timer`, `make_batch`, `download_from_google`, and other common helpers.\n\n## Quick Start\n\n### Install and verify\n\n```bash\npip install capybara-docsaid\npython -c \"import capybara; print(capybara.__version__)\"\n```\n\n## Documentation\n\nTo learn more about installation and usage, see [**Capybara Documents**](https://docsaid.org/docs/capybara).\n\nThe documentation includes detailed guides and common FAQs for this project.\n\n## Installation\n\n### Core install (lightweight)\n\n```bash\npip install capybara-docsaid\n```\n\n### Enable inference backends (optional)\n\n```bash\n# ONNX Runtime (CPU)\npip install \"capybara-docsaid[onnxruntime]\"\n\n# ONNX Runtime (GPU)\npip install \"capybara-docsaid[onnxruntime-gpu]\"\n\n# OpenVINO runtime\npip install \"capybara-docsaid[openvino]\"\n\n# TorchScript runtime\npip install \"capybara-docsaid[torchscript]\"\n\n# Install everything\npip install \"capybara-docsaid[all]\"\n```\n\n### Feature extras (optional)\n\n```bash\n# Visualization (matplotlib/pillow)\npip 
install \"capybara-docsaid[visualization]\"\n\n# IPCam app (flask)\npip install \"capybara-docsaid[ipcam]\"\n\n# System info (psutil)\npip install \"capybara-docsaid[system]\"\n```\n\n### Combine multiple extras\n\nIf you want OpenVINO inference and the IPCam features, install:\n\n```bash\n# OpenVINO + IPCam\npip install \"capybara-docsaid[openvino,ipcam]\"\n```\n\n### Install from Git\n\n```bash\npip install git+https://github.com/DocsaidLab/Capybara.git\n```\n\n## System Dependencies (Install as needed)\n\nSome features require OS-level codecs / image I/O / PDF tools (install as needed):\n\n- `PyTurboJPEG` (faster JPEG I/O): requires the TurboJPEG library.\n- `pillow-heif` (HEIC/HEIF support): requires libheif.\n- `pdf2image` (PDF to images): requires Poppler.\n- Video frame extraction: installing `ffmpeg` is recommended (more stable OpenCV video decoding).\n\n### Ubuntu\n\n```bash\nsudo apt install ffmpeg libturbojpeg libheif-dev poppler-utils\n```\n\n### macOS\n\n```bash\nbrew install jpeg-turbo ffmpeg libheif poppler\n```\n\n### GPU Notes (ONNX Runtime CUDA)\n\nIf you're using `onnxruntime-gpu`, install the compatible CUDA/cuDNN version for your ORT version:\n\n- See [**the ONNX Runtime documentation**](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements)\n\n## Usage\n\n### Image data conventions\n\n- Capybara images are represented as `numpy.ndarray`. 
By default, they follow OpenCV conventions: **BGR**, and shape is typically `(H, W, 3)`.\n- If you prefer working in RGB, use `imread(..., color_base=\"RGB\")` or convert with `imcvtcolor(img, \"BGR2RGB\")`.\n\n### Image I/O\n\n```python\nfrom capybara import imread, imwrite\n\nimg = imread(\"your_image.jpg\")\nif img is None:\n    raise RuntimeError(\"Failed to read image.\")\n\nimwrite(img, \"out.jpg\")\n```\n\nNotes:\n\n- `imread` returns `None` when it fails to decode an image (if the path doesn't exist, it raises `FileExistsError`).\n- `imread` also supports `.heic` (requires `pillow-heif` + OS-level libheif).\n\n### Resize / pad\n\nWith `imresize`, you can pass `None` in `size` to keep the aspect ratio and have the other dimension inferred automatically.\n\n```python\nimport numpy as np\nfrom capybara import BORDER, imresize, pad\n\nimg = np.zeros((480, 640, 3), dtype=np.uint8)\nimg = imresize(img, (320, None))  # (height, width)\nimg = pad(img, pad_size=(8, 8), pad_mode=BORDER.REPLICATE)\n```\n\n### Color conversion\n\n```python\nimport numpy as np\nfrom capybara import imcvtcolor\n\nimg = np.zeros((240, 320, 3), dtype=np.uint8)  # BGR\ngray = imcvtcolor(img, \"BGR2GRAY\")             # grayscale\nrgb = imcvtcolor(img, \"BGR2RGB\")               # RGB\n```\n\n### Rotation / perspective correction\n\n```python\nimport numpy as np\nfrom capybara import Polygon, imrotate, imwarp_quadrangle\n\nimg = np.zeros((240, 320, 3), dtype=np.uint8)\nrot = imrotate(img, angle=15, expand=True)  # Angle definition matches OpenCV: positive values rotate counterclockwise\n\npoly = Polygon([[10, 10], [200, 20], [190, 120], [20, 110]])\npatch = imwarp_quadrangle(img, poly)        # 4-point perspective warp\n```\n\n### Cropping (Box / Boxes)\n\n```python\nimport numpy as np\nfrom capybara import Box, Boxes, imcropbox, imcropboxes\n\nimg = np.zeros((240, 320, 3), dtype=np.uint8)\ncrop1 = imcropbox(img, Box([10, 20, 110, 120]), use_pad=True)\ncrop_list = imcropboxes(\n    img,\n    
Boxes([[0, 0, 10, 10], [100, 100, 400, 300]]),\n    use_pad=True,\n)\n```\n\n### Binarization + morphology\n\nMorphology operators live in `capybara.vision.morphology` (not in the top-level `capybara` namespace).\n\n```python\nimport numpy as np\nfrom capybara import imbinarize\nfrom capybara.vision.morphology import imopen\n\nimg = np.zeros((240, 320, 3), dtype=np.uint8)\nmask = imbinarize(img)        # OTSU + binary\nmask = imopen(mask, ksize=3)  # Opening to remove small noise\n```\n\n### Boxes / IoU\n\n```python\nimport numpy as np\nfrom capybara import Box, Boxes, pairwise_iou\n\nboxes_a = Boxes([[10, 10, 20, 20], [30, 30, 60, 60]])\nboxes_b = Boxes(np.array([[12, 12, 18, 18]], dtype=np.float32))\nprint(pairwise_iou(boxes_a, boxes_b))\n\nbox = Box([0.1, 0.2, 0.9, 0.8], is_normalized=True).convert(\"XYWH\")\nprint(box.numpy())\n```\n\n### Polygons / IoU\n\n```python\nfrom capybara import Polygon, polygon_iou\n\np1 = Polygon([[0, 0], [10, 0], [10, 10], [0, 10]])\np2 = Polygon([[5, 5], [15, 5], [15, 15], [5, 15]])\nprint(polygon_iou(p1, p2))\n```\n\n### Base64 (image / ndarray)\n\n```python\nimport numpy as np\nfrom capybara import img_to_b64str, npy_to_b64str\nfrom capybara.vision.improc import b64str_to_img, b64str_to_npy\n\nimg = np.zeros((32, 32, 3), dtype=np.uint8)\nb64_img = img_to_b64str(img)          # JPEG bytes -\u003e base64 string\nif b64_img is None:\n    raise RuntimeError(\"Failed to encode image into base64.\")\nimg2 = b64str_to_img(b64_img)         # base64 string -\u003e numpy image\n\nvec = np.arange(8, dtype=np.float32)\nb64_vec = npy_to_b64str(vec)\nvec2 = b64str_to_npy(b64_vec, dtype=\"float32\")\n```\n\n### PDF to images\n\n```python\nfrom capybara.vision.improc import pdf2imgs\n\npages = pdf2imgs(\"file.pdf\")  # list[np.ndarray], each page is BGR image\nif pages is None:\n    raise RuntimeError(\"Failed to decode PDF.\")\nprint(len(pages))\n```\n\n### Visualization (optional)\n\nInstall first: `pip install 
\"capybara-docsaid[visualization]\"`.\n\n```python\nimport numpy as np\nfrom capybara import Box\nfrom capybara.vision.visualization.draw import draw_box\n\nimg = np.zeros((240, 320, 3), dtype=np.uint8)\nimg = draw_box(img, Box([10, 20, 100, 120]))\n```\n\n### IPCam (optional)\n\n`IpcamCapture` itself does not depend on Flask; you only need the `ipcam` extra to use `WebDemo`.\n\n```python\nfrom capybara.vision.ipcam.camera import IpcamCapture\n\ncap = IpcamCapture(url=0, color_base=\"BGR\")  # or provide an RTSP/HTTP URL\nframe = next(cap)\n```\n\nWeb demo (install first: `pip install \"capybara-docsaid[ipcam]\"`):\n\n```python\nfrom capybara.vision.ipcam.app import WebDemo\n\nWebDemo(\"rtsp://\u003cipcam-url\u003e\").run(port=5001)\n```\n\n### System info (optional)\n\nInstall first: `pip install \"capybara-docsaid[system]\"`.\n\n```python\nfrom capybara.utils.system_info import get_system_info\n\nprint(get_system_info())\n```\n\n### Video frame extraction\n\n```python\nfrom capybara import video2frames_v2\n\nframes = video2frames_v2(\"demo.mp4\", frame_per_sec=2, max_size=1280)\nprint(len(frames))\n```\n\n## Inference Backends\n\nInference backends are optional; install the corresponding extras before importing the relevant engine modules.\n\n### Runtime / backend matrix\n\nNote: TorchScript runtime is named `Runtime.pt` in code (corresponding extra: `torchscript`).\n\n| Runtime (`capybara.runtime.Runtime`) | Backend name    | Provider / device                                                                                           |\n| ------------------------------------ | --------------- | ----------------------------------------------------------------------------------------------------------- |\n| `onnx`                               | `cpu`           | `[\"CPUExecutionProvider\"]`                                                                                  |\n| `onnx`                               | `cuda`          | 
`[\"CUDAExecutionProvider\"(device_id), \"CPUExecutionProvider\"]`                                              |\n| `onnx`                               | `tensorrt`      | `[\"TensorrtExecutionProvider\"(device_id), \"CUDAExecutionProvider\"(device_id), \"CPUExecutionProvider\"]`      |\n| `onnx`                               | `tensorrt_rtx`  | `[\"NvTensorRTRTXExecutionProvider\"(device_id), \"CUDAExecutionProvider\"(device_id), \"CPUExecutionProvider\"]` |\n| `openvino`                           | `cpu`           | `device=\"CPU\"`                                                                                              |\n| `openvino`                           | `gpu`           | `device=\"GPU\"`                                                                                              |\n| `openvino`                           | `npu`           | `device=\"NPU\"`                                                                                              |\n| `pt`                                 | `cpu`           | `torch.device(\"cpu\")`                                                                                       |\n| `pt`                                 | `cuda`          | `torch.device(\"cuda\")`                                                                                      |\n\n### Runtime registry (auto backend selection)\n\n```python\nfrom capybara.runtime import Runtime\n\nprint(Runtime.onnx.auto_backend_name())      # Priority: cuda -\u003e tensorrt_rtx -\u003e tensorrt -\u003e cpu\nprint(Runtime.openvino.auto_backend_name())  # Priority: gpu -\u003e npu -\u003e cpu\nprint(Runtime.pt.auto_backend_name())        # Priority: cuda -\u003e cpu\n```\n\n### ONNX Runtime (`capybara.onnxengine`)\n\n```python\nimport numpy as np\nfrom capybara.onnxengine import EngineConfig, ONNXEngine\n\nengine = ONNXEngine(\n    \"model.onnx\",\n    backend=\"cpu\",\n    config=EngineConfig(enable_io_binding=False),\n)\noutputs = engine.run({\"input\": 
np.ones((1, 3, 224, 224), dtype=np.float32)})\nprint(outputs.keys())\nprint(engine.summary())\n```\n\n### OpenVINO (`capybara.openvinoengine`)\n\n```python\nimport numpy as np\nfrom capybara.openvinoengine import OpenVINOConfig, OpenVINODevice, OpenVINOEngine\n\nengine = OpenVINOEngine(\n    \"model.xml\",\n    device=OpenVINODevice.cpu,\n    config=OpenVINOConfig(num_requests=2),\n)\noutputs = engine.run({\"input\": np.ones((1, 3), dtype=np.float32)})\nprint(outputs.keys())\n```\n\n### TorchScript (`capybara.torchengine`)\n\n```python\nimport numpy as np\nfrom capybara.torchengine import TorchEngine\n\nengine = TorchEngine(\"model.pt\", device=\"cpu\")\noutputs = engine.run({\"image\": np.zeros((1, 3, 224, 224), dtype=np.float32)})\nprint(outputs.keys())\n```\n\n### Benchmark (depends on hardware)\n\nAll engines provide `benchmark(...)` for quick throughput/latency measurements.\n\n```python\nimport numpy as np\nfrom capybara.onnxengine import ONNXEngine\n\nengine = ONNXEngine(\"model.onnx\", backend=\"cpu\")\ndummy = np.zeros((1, 3, 224, 224), dtype=np.float32)\nprint(engine.benchmark({\"input\": dummy}, repeat=50, warmup=5))\n```\n\n### Advanced: Custom options (optional)\n\n`EngineConfig` / `OpenVINOConfig` / `TorchEngineConfig` are passed through to the underlying runtime as-is.\n\n```python\nfrom capybara.onnxengine import EngineConfig, ONNXEngine\n\nengine = ONNXEngine(\n    \"model.onnx\",\n    backend=\"cuda\",\n    config=EngineConfig(\n        provider_options={\n            \"CUDAExecutionProvider\": {\n                \"enable_cuda_graph\": True,\n            },\n        },\n    ),\n)\n```\n\n## Quality Gates (Contributors)\n\nBefore merging, this project requires:\n\n```bash\nruff check .\nruff format --check .\npyright\npython -m pytest --cov=capybara --cov-config=.coveragerc --cov-report=term\n```\n\nNotes:\n\n- Coverage gate is **90% line coverage** (rules defined in `.coveragerc`).\n- Heavy / environment-dependent modules are excluded from the 
default coverage gate to keep CI reproducible and maintainable.\n\n## Docker (optional)\n\n```bash\ngit clone https://github.com/DocsaidLab/Capybara.git\ncd Capybara\nbash docker/build.bash\n```\n\nRun:\n\n```bash\ndocker run --rm -it capybara_docsaid bash\n```\n\nIf you need GPU access inside the container, use the NVIDIA container runtime (e.g. `--gpus all`).\n\n## Testing (local)\n\n```bash\npython -m pytest -vv\n```\n\n## License\n\nApache-2.0, see `LICENSE`.\n\n## Citation\n\n```bibtex\n@misc{lin2025capybara,\n  author       = {Kun-Hsiang Lin* and Ze Yuan*},\n  title        = {Capybara: An Integrated Python Package for Image Processing and Deep Learning.},\n  year         = {2025},\n  publisher    = {GitHub},\n  howpublished = {\\url{https://github.com/DocsaidLab/Capybara}},\n  note         = {* equal contribution}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdocsaidlab%2Fcapybara","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdocsaidlab%2Fcapybara","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdocsaidlab%2Fcapybara/lists"}