{"id":36958381,"url":"https://github.com/6ixgodd/inferflow","last_synced_at":"2026-01-13T16:01:53.144Z","repository":{"id":327566200,"uuid":"1109691446","full_name":"6ixGODD/inferflow","owner":"6ixGODD","description":"Universal Inference Pipeline Framework","archived":false,"fork":false,"pushed_at":"2025-12-13T08:04:40.000Z","size":318,"stargazers_count":1,"open_issues_count":8,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-12-14T22:14:47.140Z","etag":null,"topics":["computer-vision","deployment","torch","torchvision"],"latest_commit_sha":null,"homepage":"https://6ixgodd.github.io/inferflow/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/6ixGODD.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-04T06:37:16.000Z","updated_at":"2025-12-13T08:04:38.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/6ixGODD/inferflow","commit_stats":null,"previous_names":["6ixgodd/inferflow"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/6ixGODD/inferflow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/6ixGODD%2Finferflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/6ixGODD%2Finferflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/6ixGODD%2Finferflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/6ixGODD%2Finferflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/6ixGODD","download_url":"https://codeload.github.com/6ixGODD/inferflow/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/6ixGODD%2Finferflow/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28391153,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-13T14:36:09.778Z","status":"ssl_error","status_checked_at":"2026-01-13T14:35:19.697Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","deployment","torch","torchvision"],"created_at":"2026-01-13T16:01:51.806Z","updated_at":"2026-01-13T16:01:53.129Z","avatar_url":"https://github.com/6ixGODD.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"inferflow.png\" alt=\"Inferflow\" width=\"400\"/\u003e\n\n**Universal Inference Pipeline Framework**\n\n[![PyPI 
version](https://badge.fury.io/py/inferflow.svg)](https://pypi.org/project/inferflow/)\n[![Python](https://img.shields.io/pypi/pyversions/inferflow.svg)](https://pypi.org/project/inferflow/)\n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)\n[![Documentation](https://img.shields.io/badge/docs-latest-brightgreen.svg)](https://6ixgodd.github.io/inferflow/)\n\n\u003c/div\u003e\n\n---\n\n## Overview\n\nInferFlow is a production-grade inference pipeline framework designed for computer vision models. It provides a clean\nabstraction layer that separates model runtime, preprocessing, postprocessing, and batching strategies, enabling\nseamless deployment across multiple inference backends.\n\n**Key Features:**\n\n- 🚀 **Multi-Backend Support**: TorchScript, ONNX Runtime, TensorRT\n- ⚡ **Dynamic Batching**: Automatic request batching with adaptive sizing\n- 🎯 **Type Safe**: Full type hints with generic pipeline definitions\n- 🔄 **Async \u0026 Sync**: Both synchronous and asynchronous APIs\n- 📊 **Production Ready**: Comprehensive logging, metrics, and error handling\n- 🧩 **Modular Design**:  Namespace-isolated pipelines for Torch and ONNX\n\n---\n\n## Installation\n\n### Quick Install (Pure Python)\n\n```bash\npip install inferflow\n```\n\n### Backend-Specific Installation\n\n```bash\n# PyTorch backend\npip install inferflow[torch]\n\n# ONNX Runtime backend\npip install inferflow[onnx]\n\n# TensorRT backend (Linux only)\npip install inferflow[tensorrt]\n\n# All backends\npip install inferflow[all]\n```\n\n### Development Installation (with C++ optimizations)\n\n```bash\ngit clone https://github.com/6ixGODD/inferflow.git\ncd inferflow\n\n# Install with C++ extensions for faster NMS\nINFERFLOW_BUILD_CPP=1 pip install -e \".[dev]\"\n\n# With CUDA support\nINFERFLOW_BUILD_CPP=1 INFERFLOW_CUDA=1 pip install -e \".[dev]\"\n```\n\n**Build Options:**\n\n| Variable              | Default | Description                           
|\n|-----------------------|---------|---------------------------------------|\n| `INFERFLOW_BUILD_CPP` | `0`     | Enable C++ extensions                 |\n| `INFERFLOW_CUDA`      | `0`     | Enable CUDA support in C++ extensions |\n\n---\n\n## Quick Start\n\n### Synchronous API\n\n#### Basic Classification (PyTorch)\n\n```python\nfrom inferflow.runtime.torch import TorchScriptRuntime\nfrom inferflow.pipeline.classification.torch import ClassificationPipeline\n\n# Setup runtime\nruntime = TorchScriptRuntime(\n    model_path=\"resnet50.pt\",\n    device=\"cuda:0\",\n)\n\n# Create pipeline\npipeline = ClassificationPipeline(\n    runtime=runtime,\n    class_names={0: \"cat\", 1: \"dog\", 2: \"bird\"},\n)\n\n# Run inference\nwith pipeline.serve():\n    with open(\"image.jpg\", \"rb\") as f:\n        result = pipeline(f.read())\n\n    print(f\"{result.class_name}: {result.confidence:.2%}\")\n```\n\n#### Object Detection (ONNX)\n\n```python\nfrom inferflow.runtime.onnx import ONNXRuntime\nfrom inferflow.pipeline.detection.onnx import YOLOv5DetectionPipeline\n\n# Setup ONNX runtime\nruntime = ONNXRuntime(\n    model_path=\"yolov5s.onnx\",\n    device=\"cpu\",\n    precision=Precision.FP32,\n)\n\n# Create detection pipeline\npipeline = YOLOv5DetectionPipeline(\n    runtime=runtime,\n    conf_threshold=0.5,\n    class_names={0: \"person\", 1: \"car\", 2: \"dog\"},\n)\n\nwith pipeline.serve():\n    detections = pipeline(image_bytes)\n    for det in detections:\n        print(f\"{det.class_name}: {det.confidence:.2%} at {det.box}\")\n```\n\n---\n\n### Asynchronous API\n\n#### Classification (Async + PyTorch)\n\n```python\nimport asyncio\nfrom inferflow.asyncio.runtime.torch import TorchScriptRuntime\nfrom inferflow.asyncio.pipeline.classification.torch import ClassificationPipeline\n\n\nasync def main():\n    runtime = TorchScriptRuntime(\n        model_path=\"resnet50.pt\",\n        device=\"cuda:0\",\n    )\n\n    pipeline = ClassificationPipeline(\n        
runtime=runtime,\n        class_names={0: \"cat\", 1: \"dog\"},\n    )\n\n    async with pipeline.serve():\n        with open(\"image.jpg\", \"rb\") as f:\n            result = await pipeline(f.read())\n\n        print(f\"{result.class_name}: {result.confidence:.2%}\")\n\n\nasyncio.run(main())\n```\n\n#### Instance Segmentation (Async + ONNX)\n\n```python\nimport asyncio\nfrom inferflow.asyncio.runtime.onnx import ONNXRuntime\nfrom inferflow.asyncio.pipeline.segmentation.onnx import YOLOv5SegmentationPipeline\n\n\nasync def main():\n    runtime = ONNXRuntime(\n        model_path=\"yolov5s-seg.onnx\",\n        device=\"cpu\",\n    )\n\n    pipeline = YOLOv5SegmentationPipeline(\n        runtime=runtime,\n        conf_threshold=0.5,\n        class_names={0: \"person\"},\n    )\n\n    async with pipeline.serve():\n        segments = await pipeline(image_bytes)\n        for seg in segments:\n            print(f\"Mask:  {seg.mask.shape}, Box: {seg.box}\")\n\n\nasyncio.run(main())\n```\n\n---\n\n## Dynamic Batching\n\nEnable automatic request batching for higher throughput (GPU recommended):\n\n```python\nimport asyncio\nfrom inferflow.asyncio.batch.dynamic import DynamicBatchStrategy\nfrom inferflow.asyncio.pipeline.classification.torch import ClassificationPipeline\n\n\nasync def main():\n    # Configure batching strategy\n    batch_strategy = DynamicBatchStrategy(\n        min_batch_size=1,\n        max_batch_size=32,\n        max_wait_ms=50,\n        queue_size=1000,\n    )\n\n    pipeline = ClassificationPipeline(\n        runtime=runtime,\n        batch_strategy=batch_strategy,\n    )\n\n    async with pipeline.serve():\n        # Submit concurrent requests - automatically batched\n        results = await asyncio.gather(\n            *[\n                pipeline(img) for img in images\n            ]\n            )\n\n    # View metrics\n    metrics = batch_strategy.get_metrics()\n    print(f\"Avg batch size:  {metrics.avg_batch_size:.2f}\")\n    print(f\"Total 
batches: {metrics.total_batches}\")\n    print(f\"Throughput: {metrics.total_requests / elapsed:.2f} req/s\")\n\n\nasyncio.run(main())\n```\n\n**Performance Tips:**\n\n- **GPU**: 3-5x speedup with batching\n- **CPU**: Limited benefit, focus on peak shaving\n- **`max_wait_ms`**: Balance latency vs. batch size\n- **`max_batch_size`**: GPU memory limit\n\n---\n\n## Custom Workflows\n\nBuild multi-stage pipelines with conditional logic and parallel execution:\n\n```python\nfrom inferflow.asyncio.workflow import task, parallel, sequence, Workflow\nfrom dataclasses import dataclass\n\n\n@dataclass\nclass QCContext:\n    image: bytes\n    is_valid: bool = True\n    defects: list = None\n    quality_grade: str = None\n\n\n@task(name=\"validate_image\")\nasync def validate(ctx: QCContext) -\u003e QCContext:\n    # Image validation logic\n    ctx.is_valid = check_image_quality(ctx.image)\n    return ctx\n\n\n@task(\n    name=\"detect_defects\",\n    condition=lambda ctx: ctx.is_valid,\n)\nasync def detect(ctx: QCContext) -\u003e QCContext:\n    # Defect detection\n    ctx.defects = await detection_pipeline(ctx.image)\n    return ctx\n\n\n@task(name=\"classify_grade\")\nasync def classify(ctx: QCContext) -\u003e QCContext:\n    # Quality grading\n    ctx.quality_grade = \"A\" if not ctx.defects else \"B\"\n    return ctx\n\n\n# Build workflow\nworkflow = Workflow[QCContext](\n    validate,\n    detect,\n    parallel(\n        classify,\n        generate_report,\n    ),\n)\n\n# Execute\ncontext = QCContext(image=image_bytes)\nresult = await workflow.run(context)\nprint(f\"Grade: {result.quality_grade}\")\n```\n\n---\n\n## Architecture\n\n### Core Abstractions\n\n```mermaid\ngraph TB\n    subgraph Pipeline[\"Pipeline\"]\n        direction LR\n        Pre[Preprocess]\n        Runtime[Runtime]\n        Post[Postprocess]\n        Pre --\u003e|Tensor| Runtime\n        Runtime --\u003e|Output| Post\n    end\n\n    BatchStrategy[BatchStrategy]\n    Pre -.-\u003e|Batching| 
BatchStrategy\n    Runtime -.-\u003e|Batching| BatchStrategy\n    Post -.-\u003e|Batching| BatchStrategy\n    style Pipeline fill: #1a1a1a, stroke: #00d9ff, stroke-width: 3px\n    style Pre fill: #0d47a1, stroke: #42a5f5, stroke-width: 2px, color: #fff\n    style Runtime fill: #e65100, stroke: #ff9800, stroke-width: 2px, color: #fff\n    style Post fill: #6a1b9a, stroke: #ba68c8, stroke-width: 2px, color: #fff\n    style BatchStrategy fill: #1b5e20, stroke: #66bb6a, stroke-width: 2px, color: #fff\n```\n\n**Codebase Structure:**\n\n```\ninferflow/\n├── runtime/\n│   ├── torch.py         # PyTorch runtime\n│   ├── onnx.py          # ONNX runtime\n│   └── tensorrt.py      # TensorRT runtime\n│\n├── pipeline/\n│   ├── classification/\n│   │   ├── torch.py     # Torch classification\n│   │   └── onnx.py      # ONNX classification\n│   ├── detection/\n│   │   ├── torch.py     # Torch YOLOv5 detection\n│   │   └── onnx.py      # ONNX YOLOv5 detection\n│   └── segmentation/\n│       ├── torch.py     # Torch YOLOv5 segmentation\n│       └── onnx.py      # ONNX YOLOv5 segmentation\n│\n└── asyncio/             # Async versions (same structure)\n```\n\n---\n\n## Examples\n\nCheck out the [examples/](examples/) directory for complete working examples:\n\n- **[01_classification](examples/01_classification/)** - Image classification with ResNet\n- **[02_detection](examples/02_detection/)** - YOLOv5 object detection\n- **[03_segmentation](examples/03_segmentation/)** - YOLOv5 instance segmentation\n- **[04_batch_processing](examples/04_batch_processing/)** - Dynamic batching benchmark\n- **[05_custom_workflow](examples/05_custom_workflow/)** - Multi-stage QC pipeline\n\n---\n\n## Requirements\n\n- Python ≥ 3.10\n- PyTorch ≥ 2.0 (for torch backend)\n- ONNX Runtime ≥ 1.15 (for onnx backend)\n- TensorRT ≥ 8.6 (for tensorrt backend)\n- OpenCV ≥ 4.5\n- NumPy ≥ 1.23\n\n---\n\n## Contributing\n\nContributions are not currently accepted. This project is maintained for internal 
use.\n\n---\n\n## License\n\nMIT License. See [LICENSE](LICENSE) for details.\n\n---\n\n## Citation\n\n```bibtex\n@software{inferflow2025,\n  title={InferFlow: Universal Inference Pipeline Framework},\n  author={6ixGODD},\n  year={2025},\n  url={https://github.com/6ixGODD/inferflow}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F6ixgodd%2Finferflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F6ixgodd%2Finferflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F6ixgodd%2Finferflow/lists"}