{"id":49405650,"url":"https://github.com/whoigit/ifcb-inference","last_synced_at":"2026-04-28T21:03:36.652Z","repository":{"id":297492877,"uuid":"995629888","full_name":"WHOIGit/ifcb-inference","owner":"WHOIGit","description":"ONNX-based inference for plankton classification on IFCB data","archived":false,"fork":false,"pushed_at":"2026-04-21T21:22:03.000Z","size":151,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-21T21:32:17.579Z","etag":null,"topics":["amplify-whoi"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WHOIGit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-06-03T19:26:09.000Z","updated_at":"2026-04-21T21:22:06.000Z","dependencies_parsed_at":"2025-08-06T15:10:44.329Z","dependency_job_id":"b12dcb10-6ee6-4027-ac62-158d0b5ab73d","html_url":"https://github.com/WHOIGit/ifcb-inference","commit_stats":null,"previous_names":["whoigit/amplify_onnx_inference","whoigit/ifcb-inference"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/WHOIGit/ifcb-inference","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WHOIGit%2Fifcb-inference","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WHOIGit%2Fifcb-inference/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WHOIGit%2Fifcb-inference/re
leases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WHOIGit%2Fifcb-inference/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WHOIGit","download_url":"https://codeload.github.com/WHOIGit/ifcb-inference/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WHOIGit%2Fifcb-inference/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32399027,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-28T19:38:08.556Z","status":"ssl_error","status_checked_at":"2026-04-28T19:37:55.688Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amplify-whoi"],"created_at":"2026-04-28T21:03:35.867Z","updated_at":"2026-04-28T21:03:36.647Z","avatar_url":"https://github.com/WHOIGit.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ifcb-inference\n\n![Tests](https://github.com/WHOIGit/ifcb-inference/workflows/Tests/badge.svg)\n![Lint](https://github.com/WHOIGit/ifcb-inference/workflows/Lint/badge.svg)\n\nONNX-based inference system for IFCB (Imaging FlowCytobot) bin data. 
This tool performs automated plankton classification on IFCB bin files using pre-trained ONNX models.\n\n## Features\n\n- **Flexible model support**: Works with both static and dynamic batch size ONNX models\n- **Multiple data loading backends**: Supports both PyTorch and non-PyTorch data loading\n- **Configurable output organization**: Choose between run-date or model-name subfolder organization\n- **Directory structure preservation**: Maintains input directory hierarchies in output\n- **Containerized deployment**: Docker/Podman support for consistent environments\n- **GPU acceleration**: CUDA support for faster inference (automatic when available)\n\n## Installation\n\n| Extra | Installs | Use when |\n|---|---|---|\n| `[cpu]` | `onnxruntime` (CPU) | Lightweight/constrained environments, no GPU |\n| `[cuda]` | `onnxruntime-gpu` | GPU inference via CUDA |\n| `[torch]` | PyTorch + torchvision | Faster/more flexible data loading, but more dependencies |\n| `[cuda,torch]` | Both of the above | Full-featured install |\n| `[dev]` | pytest, black, isort, flake8 | Development and testing |\n\n- Exactly one of `[cpu]` or `[cuda]` must be installed to provide the appropriate onnxruntime; they are mutually exclusive. If neither is included at install time, `ifcb-infer` will be unable to run. If in doubt, use `[cuda]`.\n- Use of `[torch]` is optional. Without it, a basic data loader is used, which suits constrained or lightweight environments where installing PyTorch is impractical (e.g. small containers, edge deployments). 
The `[torch]` data loader is recommended otherwise, as it supports more image formats and is generally faster.\n\n```bash\n# Full-featured install\npip install \"ifcb-infer[cuda,torch] @ git+https://github.com/WHOIGit/ifcb-inference.git\"\n\n# GPU enabled, but without PyTorch dependencies\npip install \"ifcb-infer[cuda] @ git+https://github.com/WHOIGit/ifcb-inference.git\"\nexport LD_LIBRARY_PATH=$(pip show nvidia-cudnn-cu12 | grep Location | awk '{print $2}')/nvidia/cudnn/lib:$LD_LIBRARY_PATH\n# see \"cuDNN requirement for `[cuda]` without `[torch]`\" LD_LIBRARY_PATH note below\n\n# Lightest install\npip install \"ifcb-infer[cpu] @ git+https://github.com/WHOIGit/ifcb-inference.git\"\n```\n\nIf cloning the repo and developing locally:\n```bash\n# Full-featured install (GPU/CUDA + PyTorch)\npip install -e \".[cuda,torch,dev]\"\n```\n\n### cuDNN requirement for `[cuda]` without `[torch]`\n\n`[cuda,torch]` works out of the box: PyTorch bundles its own cuDNN libraries and ONNX Runtime (ORT) finds them automatically.\n\n`[cuda]` alone installs `nvidia-cudnn-cu12` via pip, but ORT cannot find it without help because the libraries land in `site-packages`, not a standard system path. If you don't have libcudnn9-cuda-12 installed globally or in a standard location, the cuDNN library directory must be set explicitly with `LD_LIBRARY_PATH`. 
\n\nSet `LD_LIBRARY_PATH` to point to the pip-installed cuDNN:\n```bash\nexport LD_LIBRARY_PATH=$(pip show nvidia-cudnn-cu12 | grep Location | awk '{print $2}')/nvidia/cudnn/lib:$LD_LIBRARY_PATH\n```\nAdd this to your environment profile (`.bashrc`, `.bash_profile`, or the venv's `bin/activate` script) to make it persistent.\n\n## Usage\n\n```bash\nifcb-infer [OPTIONS] MODEL BINS [BINS ...]\n```\n- `MODEL` is the path to an ONNX model file.\n- `BINS` can be a directory, a bin path, or a `.txt`/`.list` file of bin paths.\n\n### Options\n\n```\n--classes FILE                         Class list file; adds column headers to output CSVs.\n                                       Accepts a line-delimited .txt or an index-keyed .json\n                                       (e.g. {\"0\": \"class_a\", \"1\": \"class_b\"})\n--batch N                              Required for models without a fixed input batch size\n--outdir DIRPATH                       Output directory. Default: ./outputs\n--outfile PATTERN                      Output filename pattern. Default: {MODEL_NAME}/{SUBPATH}/{BIN}.csv\n                                       Tokens: {MODEL_NAME}, {RUN_DATE}, {SUBPATH} (relative dir), {BIN} (bin name)\n--cpuonly                              Force CPU inference even if CUDA is available\n--notorch                              Use non-PyTorch data loader even if torch is installed\n```\n\n- By default, CUDA is used automatically when it is installed and available; otherwise inference falls back to the CPU.\n- By default, the PyTorch data loader is used automatically when torch is installed; otherwise a simpler implementation is used.\n- For the output CSVs to have column names that correspond to human-readable class names, use the `--classes` option.\n- If a model has a predefined input batch size, that batch size is automatically used and `--batch` is ignored. 
\n- If a model does NOT have a predefined input batch size, `--batch` must be specified.\n\n### Output Organization Examples\n\nThe output path for each bin is controlled by the `--outfile PATTERN` option (default: `{MODEL_NAME}/{SUBPATH}/{BIN}.csv`), resolved relative to `--outdir`. The available tokens are:\n\n| Token | Value |\n|---|---|\n| `{BIN}` | Bin name (e.g. `D20230108T145350_IFCB127`) |\n| `{SUBPATH}` | Directory of the bin relative to the input folder |\n| `{MODEL_NAME}` | Model filename without extension |\n| `{RUN_DATE}` | Date the command was run (`YYYY-MM-DD`) |\n\n`{SUBPATH}` mirrors the input directory hierarchy, so outputs reflect the same structure as the source data. Given:\n\n```\nexample-data/bins/\n├── MVCO/\n│   ├── 2006/\n│   │   └── IFCB1_2006_157/\n│   │       ├── IFCB1_2006_157_181359   ← bin\n│   │       ├── IFCB1_2006_157_183432   ← bin\n│   │       └── IFCB1_2006_157_185616   ← bin\n│   └── 2023/\n│       └── D20230108/\n│           ├── D20230108T145350_IFCB127   ← bin\n│           ├── D20230108T151529_IFCB127   ← bin\n│           └── D20230108T153615_IFCB127   ← bin\n└── OTZ/\n    └── 2019/\n        ├── D20190722/\n        │   └── D20190722T155753_IFCB127   ← bin\n        └── D20190723/\n            ├── D20190723T161602_IFCB127   ← bin\n            └── D20190723T171832_IFCB127   ← bin\n```\n\n**Default (`{MODEL_NAME}/{SUBPATH}/{BIN}.csv`):**\n```bash\nifcb-infer my_classifier.onnx example-data/bins/\n```\n```\noutputs/\n└── my_classifier/\n    ├── MVCO/2006/IFCB1_2006_157/IFCB1_2006_157_181359.csv\n    ├── MVCO/2006/IFCB1_2006_157/IFCB1_2006_157_183432.csv\n    ├── MVCO/2006/IFCB1_2006_157/IFCB1_2006_157_185616.csv\n    ├── MVCO/2023/D20230108/D20230108T145350_IFCB127.csv\n    ├── MVCO/2023/D20230108/D20230108T151529_IFCB127.csv\n    ├── MVCO/2023/D20230108/D20230108T153615_IFCB127.csv\n    ├── OTZ/2019/D20190722/D20190722T155753_IFCB127.csv\n    ├── OTZ/2019/D20190723/D20190723T161602_IFCB127.csv\n    └── 
OTZ/2019/D20190723/D20190723T171832_IFCB127.csv\n```\n\n**Flat output, one folder for all bins (`--outfile \"{BIN}.csv\"`):**\n```bash\nifcb-infer --outdir \"my/custom/output\" --outfile \"{BIN}.csv\" my_classifier.onnx example-data/bins/\n```\n```\nmy/custom/output/\n├── IFCB1_2006_157_181359.csv\n├── IFCB1_2006_157_183432.csv\n├── IFCB1_2006_157_185616.csv\n├── D20230108T145350_IFCB127.csv\n├── D20230108T151529_IFCB127.csv\n├── D20230108T153615_IFCB127.csv\n├── D20190722T155753_IFCB127.csv\n├── D20190723T161602_IFCB127.csv\n└── D20190723T171832_IFCB127.csv\n```\n\n**Run-date prefix (`--outfile \"{RUN_DATE}/{SUBPATH}/{BIN}.csv\"`):**\n```bash\nifcb-infer --outfile \"{RUN_DATE}/{SUBPATH}/{BIN}.csv\" my_classifier.onnx example-data/bins/\n```\n```\noutputs/\n└── 2025-01-15/\n    ├── MVCO/2006/IFCB1_2006_157/IFCB1_2006_157_181359.csv\n    ├── MVCO/2006/IFCB1_2006_157/IFCB1_2006_157_183432.csv\n    ├── MVCO/2006/IFCB1_2006_157/IFCB1_2006_157_185616.csv\n    ├── MVCO/2023/D20230108/D20230108T145350_IFCB127.csv\n    ├── MVCO/2023/D20230108/D20230108T151529_IFCB127.csv\n    ├── MVCO/2023/D20230108/D20230108T153615_IFCB127.csv\n    ├── OTZ/2019/D20190722/D20190722T155753_IFCB127.csv\n    ├── OTZ/2019/D20190723/D20190723T161602_IFCB127.csv\n    └── OTZ/2019/D20190723/D20190723T171832_IFCB127.csv\n```\n\n## Container Use\n\nThe Dockerfile installs with `[cuda,torch]` for full GPU support.\n\nBuild and run:\n```bash\n# Podman\npodman build . -t ifcb-infer:latest\npodman run -it --rm -e CUDA_VISIBLE_DEVICES=1 \\\n       --device nvidia.com/gpu=all \\\n       -v $(pwd)/models:/app/models \\\n       -v $(pwd)/inputs:/app/inputs \\\n       -v $(pwd)/outputs:/app/outputs \\\n       ifcb-infer:latest models/classifier.onnx inputs/\n```\n\nTo select a specific GPU, set `CUDA_VISIBLE_DEVICES`, as `-e CUDA_VISIBLE_DEVICES=1` does in the run command above.\n\nAll `ifcb-infer` options can be appended after the image name. 
\n`MODEL` and `BINS` paths must refer to paths _inside_ the container as mapped by `-v`.\n\n\n## Development\n\n### Running Tests\n\nFirst install with the `[dev]` extra:\n\n```bash\npip install -e \".[dev]\"\n```\n\nThen run:\n\n```bash\n# Run all tests\npytest\n\n# Run with coverage\npytest --cov=src --cov-report=term-missing\n```\n\n### Continuous Integration\n\nThe project includes GitHub Actions workflows that automatically:\n\n- **Run tests** on Python 3.10, 3.11, and 3.12 when code is pushed or PRs are opened\n- **Check code quality** with linting tools (flake8, black, isort)\n\nTests run automatically on pushes to the `main` branch and on all pull requests.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwhoigit%2Fifcb-inference","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwhoigit%2Fifcb-inference","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwhoigit%2Fifcb-inference/lists"}