{"id":16891775,"url":"https://github.com/yzhao062/metaod","last_synced_at":"2025-04-09T21:17:19.323Z","repository":{"id":43744421,"uuid":"297700240","full_name":"yzhao062/MetaOD","owner":"yzhao062","description":"Automating Outlier Detection via Meta-Learning (Code, API, and Contribution Instructions)","archived":false,"fork":false,"pushed_at":"2022-02-09T04:58:39.000Z","size":37192,"stargazers_count":174,"open_issues_count":7,"forks_count":28,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-04-09T21:17:11.263Z","etag":null,"topics":["python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yzhao062.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-09-22T15:59:50.000Z","updated_at":"2025-03-24T05:42:14.000Z","dependencies_parsed_at":"2022-07-14T00:20:40.475Z","dependency_job_id":null,"html_url":"https://github.com/yzhao062/MetaOD","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yzhao062%2FMetaOD","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yzhao062%2FMetaOD/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yzhao062%2FMetaOD/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yzhao062%2FMetaOD/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yzhao062","download_url":"https://codeload.github.com/yzhao062/MetaOD/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","ki
nd":"github","repositories_count":248111973,"owners_count":21049578,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["python"],"created_at":"2024-10-13T17:08:37.353Z","updated_at":"2025-04-09T21:17:19.301Z","avatar_url":"https://github.com/yzhao062.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Automating Outlier Detection via Meta-Learning (MetaOD)\n=====================================================================\n\n\n.. image:: https://img.shields.io/pypi/v/metaod.svg?color=brightgreen\n   :target: https://pypi.org/project/metaod/\n   :alt: PyPI version\n\n.. image:: https://img.shields.io/github/stars/yzhao062/metaod.svg\n   :target: https://github.com/yzhao062/metaod/stargazers\n   :alt: GitHub stars\n\n.. image:: https://img.shields.io/github/forks/yzhao062/metaod.svg?color=blue\n   :target: https://github.com/yzhao062/metaod/network\n   :alt: GitHub forks\n\n.. image:: https://circleci.com/gh/yzhao062/MetaOD.svg?style=svg\n   :target: https://circleci.com/gh/yzhao062/MetaOD\n   :alt: Circle CI\n\n.. image:: https://travis-ci.org/yzhao062/MetaOD.svg?branch=master\n    :target: https://travis-ci.org/yzhao062/MetaOD\n\n----\n\n**Development Status**: **As of 09/26/2020, MetaOD is under active development and in its alpha stage. 
Please follow, star, and fork to get the latest updates**!\nFor paper reproducibility, please see the paper_reproducibility folder for instructions.\n\n**Given an unsupervised outlier detection (OD) task on a new dataset, how can we automatically select a good outlier detection method and its hyperparameter(s) (collectively called a model)?**\nThus far, model selection for OD has been a \"black art\", as model evaluation is infeasible due to the lack of (i) hold-out data with labels, and (ii) a universal objective function.\nIn this work, we develop the first principled data-driven approach to model selection for OD, called MetaOD, based on meta-learning.\nIn short, MetaOD is trained on extensive OD benchmark datasets to capitalize on prior experience so that **it can select the potentially best-performing model for unseen datasets**.\n\nUsing MetaOD is easy.\n**You pass in a dataset, and MetaOD returns the top-performing outlier detection models for it**, which both boosts detection quality and reduces the cost of running multiple models.\n\n\n**API Demo for selecting outlier detection models on a new dataset (within 3 lines)**\\ :\n\n\n.. code-block:: python\n\n   from metaod.models.utility import prepare_trained_model\n   from metaod.models.predict_metaod import select_model\n\n   # load pretrained MetaOD model\n   prepare_trained_model()\n\n   # use MetaOD to recommend models. 
It returns the top n models for new data X_train\n   selected_models = select_model(X_train, n_selection=100)\n\n\n\n`Preprint paper \u003chttps://arxiv.org/abs/2009.10606\u003e`_ | `Reproducibility instruction \u003chttps://github.com/yzhao062/MetaOD/tree/master/paper_reproducibility\u003e`_\n\n**Citing MetaOD**\\ :\n\nIf you use MetaOD in a scientific publication, we would appreciate\ncitations to the following paper::\n\n    @article{zhao2020automating,\n      author  = {Zhao, Yue and Rossi, Ryan and Akoglu, Leman},\n      title   = {Automating Outlier Detection via Meta-Learning},\n      journal = {arXiv preprint arXiv:2009.10606},\n      year    = {2020},\n    }\n\nor::\n\n    Zhao, Y., Rossi, R., and Akoglu, L., 2020. Automating Outlier Detection via Meta-Learning. arXiv preprint arXiv:2009.10606.\n    \n    \n**Table of Contents**\\ :\n\n\n* `Installation \u003c#installation\u003e`_\n* `API Cheatsheet \u0026 Reference \u003c#api-cheatsheet--reference\u003e`_\n* `Quick Start for Model Selection \u003c#quick-start-for-model-selection\u003e`_\n* `Quick Start for Meta Feature Generation \u003c#quick-start-for-meta-feature-generation\u003e`_\n\n\n------------\n\nSystem Introduction\n^^^^^^^^^^^^^^^^^^^\n\nAs shown in the figure below, MetaOD contains offline meta-learner training and online model selection.\nFor selecting an outlier detection model for a new dataset, one only needs the online model selection. Specifically, to be finished.\n\n\n.. image:: https://raw.githubusercontent.com/yzhao062/MetaOD/master/docs/images/MetaOD_Flowchart.jpg\n   :target: https://raw.githubusercontent.com/yzhao062/MetaOD/master/docs/images/MetaOD_Flowchart.jpg\n   :alt: metaod_flow\n   :align: center\n\n-----\n\n\nInstallation\n^^^^^^^^^^^^\n\nIt is recommended to use **pip** for installation. Please make sure\n**the latest version** is installed, as MetaOD is updated frequently:\n\n.. 
code-block:: bash\n\n   pip install metaod            # normal install\n   pip install --upgrade metaod  # or update if needed\n   pip install --pre metaod      # or include pre-release version for new features\n\nAlternatively, you can clone the repository and install it from source:\n\n.. code-block:: bash\n\n   git clone https://github.com/yzhao062/metaod.git\n   cd metaod\n   pip install .\n  \n  \n**Required Dependencies**\\ :\n\n\n* Python 3.5, 3.6, or 3.7\n* joblib\u003e=0.14.1\n* liac-arff\n* numpy\u003e=1.18.1\n* scipy\u003e=0.20\n* **scikit_learn==0.22.1**\n* pandas\u003e=0.20\n* pyod\u003e=0.8\n\n**Note**: Since we need to load trained models, we fix the scikit-learn version\nto 0.22.1. We recommend using MetaOD in a fresh environment so that the right dependencies are installed.\n\n\nQuick Start for Model Selection\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n`\"examples/model_selection_example.py\" \u003chttps://github.com/yzhao062/MetaOD/blob/master/examples/model_selection_example.py\u003e`_\nprovides an example of using MetaOD to select the top models for a new dataset, which is fully unsupervised.\n\nThe key steps are below:\n\n#. Generate a synthetic dataset\n\n   .. code-block:: python\n\n    from pyod.utils.data import generate_data\n\n    # Generate sample data\n    X_train, y_train, X_test, y_test = \\\n        generate_data(n_train=1000,\n                      n_test=100,\n                      n_features=3,\n                      contamination=0.5,\n                      random_state=42)\n\n#. Use MetaOD to select the top 100 models\n\n   .. code-block:: python\n\n    from metaod.models.utility import prepare_trained_model\n    from metaod.models.predict_metaod import select_model\n\n    # load pretrained models\n    prepare_trained_model()\n\n    # recommended models. this returns the top models for X_train\n    selected_models = select_model(X_train, n_selection=100)\n\n\n#. Show the selected models' performance evaluation (results may vary slightly due to built-in randomness).\n\n   .. 
code-block:: python\n\n\n    1st model Average Precision 0.9780551579734139\n    10th model Average Precision 0.959749602397687\n    50th model Average Precision 0.6211392467111937\n\n\nQuick Start for Meta Feature Generation\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nGetting the embedding of an arbitrary dataset is the first step of MetaOD, which\ncan be done with our specialized meta-feature generation function.\n\nIt may be used for other purposes as well, e.g., measuring the similarity of\ntwo datasets.\n\n.. code-block:: python\n\n    # import meta-feature generator\n    from metaod.models.gen_meta_features import generate_meta_features\n\n    # X is the input dataset\n    meta_features, _ = generate_meta_features(X)\n\nA simple example of visualizing two different environments using TSNE with\nour meta-features is shown below. The environment on the left is composed of\n100 datasets with known similarity, where the same color denotes the same group of datasets.\nThe environment on the right is composed of\n62 datasets without known similarity. Our meta-features successfully capture\nthe underlying similarity in the left figure.\n\n.. image:: https://raw.githubusercontent.com/yzhao062/MetaOD/master/docs/images/meta_vis.jpg\n   :target: https://raw.githubusercontent.com/yzhao062/MetaOD/master/docs/images/meta_vis.jpg\n   :alt: meta_viz\n   :align: center\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyzhao062%2Fmetaod","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyzhao062%2Fmetaod","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyzhao062%2Fmetaod/lists"}