{"id":14977851,"url":"https://github.com/fabricioarendtorres/streamauc","last_synced_at":"2026-01-06T22:35:20.292Z","repository":{"id":247968646,"uuid":"827371984","full_name":"FabricioArendTorres/streamAUC","owner":"FabricioArendTorres","description":"Light-weight package for classification metrics computed on streams or minibatches of data. Mainly for area under the curve (AUC) of precision-recall (PR) or receiver operating characteristic (ROC) curves. Supports multi-class setting with either macro- or micro aggregation..","archived":false,"fork":false,"pushed_at":"2024-07-22T13:35:32.000Z","size":156,"stargazers_count":0,"open_issues_count":5,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-13T06:40:00.991Z","etag":null,"topics":["classification-model","machine-learning","metrics","numpy","precision-recall-curve","receiver-operating-characteristic"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FabricioArendTorres.png","metadata":{"files":{"readme":"README.md","changelog":"HISTORY.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["rochacbruno"],"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"custom":null}},"created_at":"2024-07-11T14:23:01.000Z","updated_at":"2024-07-22T13:35:36.000Z","dependencies_parsed_at":"2025-02-13T06:43:29.085Z","dependency_job_id":null,"html_url":"https://github.com/FabricioArendTorres/streamAUC","commit_stats":null,"previous_names":
["fabricioarendtorres/streamauc"],"tags_count":6,"template":false,"template_full_name":"rochacbruno/python-project-template","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FabricioArendTorres%2FstreamAUC","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FabricioArendTorres%2FstreamAUC/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FabricioArendTorres%2FstreamAUC/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FabricioArendTorres%2FstreamAUC/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FabricioArendTorres","download_url":"https://codeload.github.com/FabricioArendTorres/streamAUC/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245805964,"owners_count":20675291,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification-model","machine-learning","metrics","numpy","precision-recall-curve","receiver-operating-characteristic"],"created_at":"2024-09-24T13:56:26.410Z","updated_at":"2026-01-06T22:35:20.239Z","avatar_url":"https://github.com/FabricioArendTorres.png","language":"Python","funding_links":["https://github.com/sponsors/rochacbruno"],"categories":[],"sub_categories":[],"readme":"# 
streamauc\n\n[![codecov](https://codecov.io/gh/FabricioArendTorres/streamAUC/branch/main/graph/badge.svg?token=streamAUC_token_here)](https://codecov.io/gh/FabricioArendTorres/streamAUC)\n[![CI](https://github.com/FabricioArendTorres/streamAUC/actions/workflows/main.yml/badge.svg)](https://github.com/FabricioArendTorres/streamAUC/actions/workflows/main.yml)\n\n## [Documentation](https://fabricioarendtorres.github.io/streamAUC/)\n\n## Multi-Class Classification Metrics from data streams and minibatches\n\nA low-dependency Python package for keeping track of classification metrics \nsuch as AUC given probabilistic outputs.\n\nIn essence, the package keeps track of one-vs-all confusion matrices for each \nclass for a range of thresholds. \nThis allows minibatch-based updating of quantities such as ROC or \nPrecision-Recall curves, without having to store all the predictions.\nMetrics can then be computed either in a one-vs-all fashion, or by micro- \nor macro-averaging.\n\nMy main usage is for multiclass semantic segmentation, where the train and \ntest data become rather large for pixel-wise performance metrics.\n\nThis package supports a range of classical performance metrics, such as:\n- TPR, FNR, FPR, TNR, Accuracy, F1-Score, Jaccard Index, ...\n- Corresponding curves, such as Precision-Recall (PR) curves or ROC curves.\n- AUC of ROC and PR curves, or any combination of two metrics you want.\n- One-vs-all, micro, or macro averaging of metrics for a set of predefined \n  thresholds.\n\n## Lightweight, tested, and permissive License\n\n- Only NumPy, Numba, and Matplotlib are requirements.\n- High Test Coverage: Metrics are unit tested against sklearn metrics.\n- Permissive License: Licensed under Apache 2.0.\n\n## Installation\n\n### PyPI Current Release\n```bash\npip install streamauc\n```\n\n### Latest Version from GitHub\n```bash\npip install git+https://github.com/FabricioArendTorres/streamAUC.git\n```\n\n\n## Usage\nBelow you can find pseudocode for the 
usage of this package.\nFor a more comprehensive and self-contained example, see `examples/example.py`.\n\n### Keep track of confusion matrices at many thresholds\n\n```py\nimport numpy as np\n\nfrom streamauc import StreamingMetrics, AggregationMethod\n\n# Select the thresholds for which we want to keep track of results.\nstream_metrics = StreamingMetrics(\n  thresholds=np.linspace(0, 1, 200),\n  num_classes=10,\n)\n\nwhile youhavedata:\n  y_true = ...  # true classes, shape (-1,) or one-hot-encoded (-1, num_classes)\n  pred_prob_y = ...  # predicted class probabilities, shape (-1, num_classes)\n  stream_metrics.update(y_true=y_true, y_score=pred_prob_y)\n\n## get 1-vs-all confusion matrix at all thresholds\nconfm = stream_metrics.confusion_matrix \n# confm is of shape (num_thresholds, num_classes, 2, 2)\n\n## get metrics at all thresholds\ntp = stream_metrics.true_positives()  # is of shape (num_thresholds, num_classes)\n\nfpr, tpr, thresholds = stream_metrics.roc_curve(\n  AggregationMethod.ONE_VS_ALL) # fpr and tpr are of shape (num_thresholds, num_classes)\n\n\nprecision, recall, thresholds = stream_metrics.precision_recall_curve(\n  AggregationMethod.MACRO) # both curve arrays are of shape (num_thresholds, )\n\n# reset before updating with new data\nstream_metrics.reset()\n\n```\n\n### Track metrics in a minibatch-based training loop\n```py\nimport matplotlib.pyplot as plt\n\nfrom streamauc import StreamingMetrics, AggregationMethod, auc\nfrom streamauc import metrics\n\n# Select the number of thresholds for which we want to keep track of results.\nstream_metrics = StreamingMetrics(\n  num_thresholds=100,\n  num_classes=3,\n)\n\n# Whatever your model may be, you need probabilities for the \n# defined number of classes.\nmodel = ...\nyourdataiterator = ...\n\nfor epoch in range(100):\n  ... 
# do your training step\n  \n  for mb_x, mb_y in yourdataiterator:\n    pred_prob_y = model.predict_proba(mb_x) # of shape (-1, num_classes)\n    # mb_y can be one-hot encoded (-1, num_classes) or a flat integer array (-1,)\n    stream_metrics.update(y_true=mb_y, y_score=pred_prob_y)\n  \n  # compute the metrics you want\n  _auc_macro = stream_metrics.auc(metrics.recall,\n                                  metrics.precision,\n                                  method=AggregationMethod.MACRO)\n  f1_for_all_thresholds = stream_metrics.calc_metric(metric=metrics.f1_score)\n  \n  # Plot all 1-vs-all/micro-averaged/macro-averaged Precision Recall Curves\n  fig = stream_metrics.precision_recall_curve(method=AggregationMethod.\n                                              ONE_VS_ALL)\n  fig.savefig(f\"PR_one_vs_all_{epoch}.png\")\n  plt.close(fig)\n  \n  # reset the tracker for the next epoch \n  stream_metrics.reset()\n```\n\n\n\n\n## Things to note\n\n### Curves and AUC are only approximate\nStreamAUC works by keeping track of confusion matrices at a fixed set of \nthresholds, which are defined at the beginning. That is, the resulting \ncurves and AUC are by construction always approximations.\n\nThis should not be too limiting for applications with large data \nsets, since tracking every unique threshold becomes infeasible there \nanyway.\n\n### Precision-Recall Curve: Definition of precision when recall is zero\nThere are different conventions regarding the precision when there are no \npositive predictions, which occurs at the left-most point of the \nprecision-recall curve corresponding to a threshold of 1. \nTechnically, it is undefined, since we have TP/(TP+FP)=0/0. 
\nScikit-learn then defines it as 1, for the sake of nicer PR curves.\nThis package defines it as 0, as a value of 1 seems misleading in my opinion.\n\n## Custom Metrics\nIt's straightforward to add custom metrics to this package: define a \nfunction with the following interface, which can then be passed as a Callable to \n`StreamingMetrics.calc_metric` or `StreamingMetrics.auc`.\nThe basic metrics (TP, FN, FP, TN) always have the shape \n`(num_thresholds, num_classes)`, with e.g. `TP[:,2]` corresponding to the \nnumber of true positives at each threshold in a one-vs-all setting for the \nclass with index 2.\n\nSee for example the F1 metric implementation for the required interface:\n```python\nfrom typing import Optional\nimport numpy as np\n\nfrom streamauc.utils import AggregationMethod, check_confusion_matrix_entries\n\ndef custom_f1_score(\n    tp: np.ndarray,\n    fn: np.ndarray,\n    fp: np.ndarray,\n    tn: np.ndarray,\n    method: AggregationMethod = AggregationMethod.MACRO,\n    class_index: Optional[int] = None,\n    check_inputs: bool = True,\n):  \n    \n    if check_inputs:\n        # do some optional checks for valid shapes etc.\n        check_confusion_matrix_entries(tp, fn, fp, tn)\n        \n    if method == AggregationMethod.MICRO:\n        tp_sum = np.sum(tp, axis=-1)\n        fn_sum = np.sum(fn, axis=-1)\n        fp_sum = np.sum(fp, axis=-1)\n        _f1 = ((2 * tp_sum) / (2 * tp_sum + fp_sum + fn_sum + 1e-12))\n        _f1 = _f1[..., class_index]\n    elif method == AggregationMethod.MACRO:\n        _f1 = ((2 * tp) / (2 * tp + fp + fn + 1e-12)).mean(-1)\n    elif method == AggregationMethod.ONE_VS_ALL:\n        _f1 = ((2 * tp) / (2 * tp + fp + fn + 1e-12))[..., class_index]\n    else:\n        raise ValueError(\n            f\"Method must be one of {[e.value for e in AggregationMethod]}. 
\"\n            f\"Got {method}.\"\n        )\n    return _f1\n```\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffabricioarendtorres%2Fstreamauc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffabricioarendtorres%2Fstreamauc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffabricioarendtorres%2Fstreamauc/lists"}