{"id":19260479,"url":"https://github.com/mitre/menelaus","last_synced_at":"2026-03-16T11:32:22.431Z","repository":{"id":37560960,"uuid":"497004869","full_name":"mitre/menelaus","owner":"mitre","description":"Online and batch-based concept and data drift detection algorithms to monitor and maintain ML performance. ","archived":false,"fork":false,"pushed_at":"2023-12-27T22:00:55.000Z","size":63136,"stargazers_count":67,"open_issues_count":41,"forks_count":7,"subscribers_count":9,"default_branch":"dev","last_synced_at":"2025-05-27T14:06:34.652Z","etag":null,"topics":["concept-drift","data-drift","data-science","drift-detection","machine-learning","statistics"],"latest_commit_sha":null,"homepage":"https://menelaus.readthedocs.io/en/latest/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mitre.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-05-27T13:20:52.000Z","updated_at":"2025-02-08T06:11:02.000Z","dependencies_parsed_at":"2023-12-27T21:42:25.162Z","dependency_job_id":"537679c0-4dfe-442c-87e6-2a7073b4e1ab","html_url":"https://github.com/mitre/menelaus","commit_stats":{"total_commits":176,"total_committers":10,"mean_commits":17.6,"dds":0.5568181818181819,"last_synced_commit":"ddbfda2e29b4c724ebb395633f42063ac0f30ed4"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/mitre/menelaus","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitre%2Fmenelaus","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/mitre%2Fmenelaus/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitre%2Fmenelaus/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitre%2Fmenelaus/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mitre","download_url":"https://codeload.github.com/mitre/menelaus/tar.gz/refs/heads/dev","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitre%2Fmenelaus/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260994045,"owners_count":23094283,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["concept-drift","data-drift","data-science","drift-detection","machine-learning","statistics"],"created_at":"2024-11-09T19:21:14.931Z","updated_at":"2026-03-16T11:32:17.387Z","avatar_url":"https://github.com/mitre.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![tests](https://github.com/mitre/menelaus/actions/workflows/tests.yml/badge.svg)](https://github.com/mitre/menelaus/actions/workflows/tests.yml)\n[![Documentation Status](https://readthedocs.org/projects/menelaus/badge/?version=latest)](https://menelaus.readthedocs.io/en/latest/?badge=latest)\n[![examples](https://github.com/mitre/menelaus/actions/workflows/examples.yml/badge.svg?branch=main)](https://github.com/mitre/menelaus/actions/workflows/examples.yml)\n[![lint](https://github.com/mitre/menelaus/actions/workflows/format.yml/badge.svg)](https://github.com/mitre/menelaus/actions/workflows/format.yml)\n\n# Background\n\nMenelaus 
implements algorithms for drift detection in machine learning. Drift\ndetection is a branch of machine learning focused on the detection of unforeseen\nshifts in data. The relationships between variables in a dataset are rarely\nstatic and can be affected by changes in both internal and external factors,\ne.g. changes in data collection techniques, external protocols, and/or\npopulation demographics. Both undetected changes in data and undetected model\nunderperformance pose risks to the users thereof. The aim of this package is to\nenable monitoring of data and of model performance.\n\nThe algorithms contained within this package were identified through a\ncomprehensive literature survey. Menelaus\\' aim was to implement drift detection\nalgorithms that cover a range of statistical methodology. Of the algorithms\nidentified, all are able to identify when drift is occurring; some can highlight\nsuspicious regions of the data in which drift is more significant; and others\ncan also provide model retraining recommendations.\n\nMenelaus implements drift detectors for both streaming and batch data. In a\nstreaming setting, data is arriving continuously and is processed one\nobservation at a time. Streaming detectors process the data with each new\nobservation that arrives and are intended for use cases in which instant\nanalytical results are desired. In a batch setting, information is collected\nover a period of time. Once the predetermined set is \\\"filled\\\", data is fed\ninto and processed by the drift detection algorithm as a single batch. Within a\nbatch, there is no meaningful ordering of the data with respect to time. 
Batch\nalgorithms are typically used when it is more important to process large volumes\nof information simultaneously, where the speed of results after receiving data\nis of less concern.\n\nMenelaus is named for the Odyssean hero that defeated the shapeshifting Proteus.\n\n# Detector List\n\nMenelaus implements the following drift detectors.\n\n| Type             | Detector                                                      | Abbreviation | Streaming | Batch |\n|------------------|---------------------------------------------------------------|--------------|-----------|-------|\n| Change detection | Cumulative Sum Test                                           | CUSUM        | x         |       |\n| Change detection | Page-Hinkley                                                  | PH           | x         |       |\n| Change detection    | ADaptive WINdowing                                            | ADWIN        | x         |       |\n| Concept drift    | Drift Detection Method                                        | DDM          | x         |       |\n| Concept drift    | Early Drift Detection Method                                  | EDDM         | x         |       |\n| Concept drift    | Linear Four Rates                                             | LFR          | x         |       |\n| Concept drift    | Statistical Test of Equal Proportions to Detect concept drift | STEPD        | x         |       |\n| Concept drift    | Margin Density Drift Detection Method                         | MD3          | x         |       |\n| Data drift       | Confidence Distribution Batch Detection                       | CDBD         |           | x     |\n| Data drift       | Hellinger Distance Drift Detection Method                     | HDDDM        |           | x     |\n| Data drift       | kdq-Tree Detection Method                                     | kdq-Tree     | x         | x     |\n| Data drift       | PCA-Based Change Detection                            
        | PCA-CD       | x         |       |\n| Data drift       | Nearest Neighbor Density Variation Identification             | NN-DVI       |           | x     |\n| Ensemble         | Streaming Ensemble      | - | x |   |\n| Ensemble         | Batch Ensemble          | - |   | x |\n\n\nThe four types of detector are described below. More details, including\nreferences to the original papers, can be found in the respective module\ndocumentation on [ReadTheDocs](https://menelaus.readthedocs.io/en/latest/).\n\n-   Change detectors monitor single variables in the streaming context,\n    and raise an alarm when that variable starts taking on values outside of a\n    pre-defined range.\n-   Concept drift detectors monitor the performance characteristics of a\n    given model, trying to identify shifts in the joint distribution of\n    the data\\'s feature values and their labels. Note that change detectors \n    can also be applied in this context.\n-   Data drift detectors monitor the distribution of the features; in\n    that sense, they are model-agnostic. Such changes in distribution\n    might be to single variables or to the joint distribution of all the\n    features.\n-   Ensembles are groups of detectors, where each watches the same data, and \n    drift is determined by combining their output. Menelaus implements a \n    framework for wrapping detectors this way.\n\nThe detectors may be applied in two settings, as described in the Background\nsection:\n\n-   Streaming, in which each new observation is processed separately,\n    as it arrives.\n-   Batch, in which the data has no meaningful ordering with respect to time,\n    and the goal is comparing two datasets as a whole.\n\nAdditionally, the library implements a kdq-Tree partitioner, for support of the\nkdq-Tree Detection Method. 
This data structure partitions a given feature space,\nthen maintains a count of the number of samples from the given dataset that fall\ninto each section of that partition. More details are given in the respective\nmodule.\n\n# Installation\n\nCreate a virtual environment as desired, then:\n\n```python\n# for read-only, install from pypi:\npip install menelaus\n\n# to allow editing, running tests, generating docs, etc.\n# first, clone the git repo, then:\ncd ./menelaus_clone_folder/\npip install -e .[dev] \n\n# to run examples which use datasets from the wilds library,\n# another install option is:\npip install menelaus[wilds]\n```\n\nMenelaus should work with Python 3.8 or higher. \n\n# Getting Started\n\nEach detector implements the API defined by `menelaus.detector`:\nnotably, they have an `update` method which allows new data to be passed, and a `drift_state` attribute which tells the user whether drift has been\ndetected, along with (usually) other attributes specific to the detector class.\n\nGenerally, the workflow for using a detector, given some data, is as\nfollows:\n\n```python\nfrom menelaus.concept_drift import ADWINAccuracy\nfrom menelaus.data_drift import KdqTreeStreaming\nfrom menelaus.datasets import fetch_rainfall_data\nfrom menelaus.ensemble import StreamingEnsemble, SimpleMajorityElection\n\n\n# has feature columns, and a binary response 'rain'\ndf = fetch_rainfall_data()\n\n\n# use a concept drift detector (response-only)\ndetector = ADWINAccuracy()\nfor i, row in df.iterrows():\n    detector.update(X=None, y_true=row['rain'], y_pred=0)\n    assert detector.drift_state != \"drift\", f\"Drift detected in row {i}\"\n\n\n# use data drift detector (features-only)\ndetector = KdqTreeStreaming(window_size=5)\nfor i, row in df.iterrows():\n    detector.update(X=df.loc[[i], df.columns != 'rain'], y_true=None, y_pred=None)\n    assert detector.drift_state != \"drift\", f\"Drift detected in row {i}\"\n\n\n# use ensemble detector (detectors + voting 
function)\nensemble = StreamingEnsemble(\n  {\n    'a': ADWINAccuracy(),\n    'k': KdqTreeStreaming(window_size=5)\n  },\n  SimpleMajorityElection()\n)\n\nfor i, row in df.iterrows():\n    ensemble.update(X=df.loc[[i], df.columns != 'rain'], y_true=row['rain'], y_pred=0)\n    assert ensemble.drift_state != \"drift\", f\"Drift detected in row {i}\"\n```\n\nAs a concept drift detector, ADWINAccuracy requires both a true value (`y_true`) and a\npredicted value (`y_pred`) at each update step. The data drift detector\nKdqTreeStreaming only requires the feature values at each step (`X`). More\ndetailed examples, including code for visualizing drift locations, may be\nfound in the ``examples`` directory, as stand-alone Python scripts. The examples,\nalong with their output, can also be viewed on the ReadTheDocs website.\n\n# Contributing\nInstall the library using the `[dev]` option, as above.\n\n- **Testing**\n\n  Unit tests can be run with the command `pytest`. By default, a\n  coverage report with highlighting will be generated in `htmlcov/index.html`.\n  These default settings are specified in `setup.cfg` under `[tool:pytest]`.\n\n- **Documentation**\n\n  HTML documentation can be generated at\n  `menelaus/docs/build/html/index.html` with:\n  ```python\n  cd docs/source\n  sphinx-build . ../build\n  ```\n\n  If the example notebooks for the docs need to be updated, the corresponding \n  Python scripts in the `examples` directory should also be regenerated via:\n  ```python\n  cd docs/source/examples\n  python convert_notebooks.py\n  ```\n  Note that this requires `jupyter` and `nbconvert`,\n  which can be installed via `pip install -e \".[dev, test]\"`.\n\n- **Formatting**:\n\n  This project uses `black` for code formatting, `flake8` for linting, and\n  `bandit` for security checks. 
To satisfy these requirements when contributing, you\n  may use them as the linter/formatter in your IDE, or manually run the\n  following from the root directory:\n  ```python\n  flake8 ./menelaus           # linting\n  bandit -r ./menelaus        # security checks\n  black ./menelaus            # formatting\n  ```  \n\n# Copyright\n\nAuthors: Leigh Nicholl, Thomas Schill, India Lindsay, Anmol Srivastava, Kodie P McNamara, Shashank Jarmale.\\\n©2022 The MITRE Corporation. ALL RIGHTS RESERVED\\\nApproved for Public Release; Distribution Unlimited. Public Release\\\nCase Number 22-0244.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmitre%2Fmenelaus","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmitre%2Fmenelaus","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmitre%2Fmenelaus/lists"}