{"id":13415315,"url":"https://github.com/Stream-AD/MIDAS","last_synced_at":"2025-03-14T22:33:18.980Z","repository":{"id":40680855,"uuid":"216963360","full_name":"Stream-AD/MIDAS","owner":"Stream-AD","description":"Anomaly Detection on Dynamic (time-evolving) Graphs in Real-time and Streaming manner. Detecting intrusions (DoS and DDoS attacks), frauds, fake rating anomalies.","archived":false,"fork":false,"pushed_at":"2024-01-10T05:43:58.000Z","size":31533,"stargazers_count":749,"open_issues_count":2,"forks_count":92,"subscribers_count":29,"default_branch":"master","last_synced_at":"2024-04-14T20:45:23.569Z","etag":null,"topics":["aaai2020","anomaly-detection","denial-of-service","fraud-detection","intrusion-detection"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Stream-AD.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-10-23T03:50:50.000Z","updated_at":"2024-04-13T14:05:06.000Z","dependencies_parsed_at":"2024-01-12T22:18:26.402Z","dependency_job_id":"07a0173f-a63a-47cd-a48e-8e2e73e5274f","html_url":"https://github.com/Stream-AD/MIDAS","commit_stats":null,"previous_names":["bhatiasiddharth/midas"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Stream-AD%2FMIDAS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Stream-AD%2FMIDAS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Stream-AD%2FMIDAS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/Stream-AD%2FMIDAS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Stream-AD","download_url":"https://codeload.github.com/Stream-AD/MIDAS/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243658057,"owners_count":20326459,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aaai2020","anomaly-detection","denial-of-service","fraud-detection","intrusion-detection"],"created_at":"2024-07-30T21:00:47.022Z","updated_at":"2025-03-14T22:33:18.972Z","avatar_url":"https://github.com/Stream-AD.png","language":"C++","readme":"# MIDAS\n\n\u003cp\u003e\n  \u003ca href=\"https://aaai.org/Conferences/AAAI-20/\"\u003e\n    \u003cimg src=\"http://img.shields.io/badge/AAAI-2020-red.svg\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://arxiv.org/pdf/2009.08452.pdf\"\u003e\u003cimg src=\"http://img.shields.io/badge/Paper-PDF-brightgreen.svg\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://www.comp.nus.edu.sg/~sbhatia/assets/pdf/MIDAS_slides.pdf\"\u003e\n      \u003cimg src=\"http://img.shields.io/badge/Slides-PDF-ff9e18.svg\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://youtu.be/Bd4PyLCHrto\"\u003e\n    \u003cimg src=\"http://img.shields.io/badge/Talk-Youtube-ff69b4.svg\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://www.youtube.com/watch?v=DPmN-uPW8qU\"\u003e \n    \u003cimg src=\"https://img.shields.io/badge/Overview-Youtube-orange.svg\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/Stream-AD/MIDAS/blob/master/LICENSE\"\u003e\n    \u003cimg 
src=\"https://img.shields.io/badge/License-Apache%202.0-blue.svg\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\nC++ implementation of\n\n- [Real-time Streaming Anomaly Detection in Dynamic Graphs](https://arxiv.org/pdf/2009.08452.pdf). *Siddharth Bhatia, Rui Liu, Bryan Hooi, Minji Yoon, Kijung Shin, Christos Faloutsos*. TKDD, 2022.\n- [MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams](https://arxiv.org/pdf/1911.04464.pdf). *Siddharth Bhatia, Bryan Hooi, Minji Yoon, Kijung Shin, Christos Faloutsos*. AAAI, 2020.\n\nThe old implementation is in the `OldImplementation` branch; it should be considered archived and is unlikely to receive feature updates.\n\n![](asset/Intro.png)\n\n## Table of Contents\n\n\u003c!-- START doctoc generated TOC please keep comment here to allow auto update --\u003e\n\u003c!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --\u003e\n\n\n- [Features](#features)\n- [Demo](#demo)\n- [Customization](#customization)\n- [Other Files](#other-files)\n- [In Other Languages](#in-other-languages)\n- [Online Coverage](#online-coverage)\n- [Citation](#citation)\n\n\u003c!-- END doctoc generated TOC please keep comment here to allow auto update --\u003e\n\n## Features\n\n- Finds Anomalies in Dynamic/Time-Evolving Graphs (Intrusion Detection, Fake Ratings, Financial Fraud)\n- Detects Microcluster Anomalies (suddenly arriving groups of suspiciously similar edges e.g. 
DoS attack)\n- Theoretical Guarantees on False Positive Probability\n- Constant Memory (independent of graph size)\n- Constant Update Time (real-time anomaly detection to minimize harm)\n- Up to 55% more accurate and 929 times faster than state-of-the-art approaches\n- Experiments are performed using the following datasets: \n  - [DARPA](https://www.ll.mit.edu/r-d/datasets/1998-darpa-intrusion-detection-evaluation-dataset)\n  - [TwitterWorldCup2014](http://odds.cs.stonybrook.edu/twitterworldcup2014-dataset)\n  - [TwitterSecurity](http://odds.cs.stonybrook.edu/twittersecurity-dataset)\n\n## Demo\n\nIf you use Windows:\n\n1. Open a Visual Studio developer command prompt (the build uses its toolchain)\n1. `cd` to the project root `MIDAS/`\n1. `cmake -DCMAKE_BUILD_TYPE=Release -GNinja -S . -B build/release`\n1. `cmake --build build/release --target Demo`\n1. `cd` to `MIDAS/build/release/`\n1. `.\\Demo.exe`\n\nIf you use Linux/macOS:\n\n1. Open a terminal\n1. `cd` to the project root `MIDAS/`\n1. `cmake -DCMAKE_BUILD_TYPE=Release -S . -B build/release`\n1. `cmake --build build/release --target Demo`\n1. `cd` to `MIDAS/build/release/`\n1. 
`./Demo`\n\nThe demo runs on `MIDAS/data/DARPA/darpa_processed.csv`, which has 4.5M records, with the filtering core (MIDAS-F).\n\nThe scores will be exported to `MIDAS/temp/Score.txt`, higher means more anomalous.\n\nAll file paths are absolute and \"hardcoded\" by CMake, but it's suggested NOT to run by double clicking on the executable file.\n\n### Requirements\n\nCore\n- C++11\n- C++ standard libraries\n\nDemo (if experimental ROC-AUC impl)\n- C++ standard libraries\n\nDemo (if `sklearn` ROC-AUC impl)\n- Python 3 (`MIDAS/util/EvaluateScore.py`)\n    - `pandas`: I/O \n    - `scikit-learn`: Compute ROC-AUC\n\nExperiment\n- (Optional) Intel TBB: Parallelization\n- (Optional) OpenMP: Parallelization\n\nOther python utility scripts\n- Python 3\n    - `pandas`\n    - `scikit-learn`\n\n## Customization\n\n### Switch to `sklearn` ROC-AUC Implementation\n\nIn `MIDAS/example/Demo.cpp`.  \nComment out section \"Evaluate scores (experimental)\"  \nUncomment section \"Write output scores\" and \"Evaluate scores\".\n\n### Different CMS Size / Decay Factor / Threshold\n\nThose are arguments of cores' constructors, which are at `MIDAS/example/Demo.cpp:67-69`.\n\n### Switch Cores\n\nCores are instantiated at `MIDAS/example/Demo.cpp:67-69`, uncomment the chosen one.\n\n### Custom Dataset + `Demo.cpp`\n\nYou need to prepare three files:\n\n- Meta file\n  - Only includes an integer `N`, the number of records in the dataset\n  - Use its path for `pathMeta`\n  - E.g. `MIDAS/data/DARPA/darpa_shape.txt`\n- Data file\n  - A header-less csv format file of shape `[N,3]`\n  - Columns are sources, destinations, timestamps\n  - Use its path for `pathData`\n  - E.g. `MIDAS/data/DARPA/darpa_processed.csv`\n- Label file\n  - A header-less csv format file of shape `[N,1]`\n  - The corresponding label for data records\n    - 0 means normal record\n    - 1 means anomalous record\n  - Use its path for `pathGroundTruth`\n  - E.g. 
`MIDAS/data/DARPA/darpa_ground_truth.csv`\n\n### Custom Dataset + Custom Runner\n\n1. Include the header `MIDAS/src/NormalCore.hpp`, `MIDAS/src/RelationalCore.hpp` or `MIDAS/src/FilteringCore.hpp`\n1. Instantiate cores with required parameters\n1. Call `operator()` on each data record; it returns the anomaly score for that record\n\n## Other Files\n\n### `example/`\n\n#### `Experiment.cpp`\n\nThe code we used for experiments.   \nIt will try to use Intel TBB or OpenMP for parallelization.  \nComment out all but one runner function call in `main()`, since most results are exported to `MIDAS/temp/Experiment.csv` together with many intermediate files.\n\n#### `Reproducible.cpp`\n\nSimilar to `Demo.cpp`, but with all random parameters hardcoded, so it always produces the same result.  \nIt's for other developers and us to test if the implementation in other languages can produce acceptable results.  \n\n### `util/`\n\n`DeleteTempFile.py`, `EvaluateScore.py` and `ReproduceROC.py` will show their usage and a short description when executed without any argument.\n\n#### `AUROC.hpp`\n\nExperimental ROC-AUC implementation in C++11. More info at [this repo](https://github.com/liurui39660/AUROC).\n\n#### `PreprocessData.py`\n\nThe code to process the raw dataset into an easy-to-read format.  \nDatasets are always assumed to be in a folder in `MIDAS/data/`.  \nIt can process the following dataset(s)\n\n- `DARPA/darpa_original.csv` -\u003e `DARPA/darpa_processed.csv`, `DARPA/darpa_ground_truth.csv`, `DARPA/darpa_shape.txt`\n\n## In Other Languages\n\n1. Python: [Rui Liu's MIDAS.Python](https://github.com/liurui39660/MIDAS.Python), [Ritesh Kumar's pyMIDAS](https://github.com/ritesh99rakesh/pyMIDAS)\n1. Python (pybind): [Wong Mun Hou's MIDAS](https://github.com/munhouiani/MIDAS)\n1. Golang: [Steve Tan's midas](https://github.com/steve0hh/midas)\n1. Ruby: [Andrew Kane's midas](https://github.com/ankane/midas)\n1. 
Rust: [Scott Steele's midas_rs](https://github.com/scooter-dangle/midas_rs)\n1. R: [Tobias Heidler's MIDASwrappeR](https://github.com/pteridin/MIDASwrappeR)\n1. Java: [Joshua Tokle's MIDAS-Java](https://github.com/jotok/MIDAS-Java)\n1. Julia: [Ashrya Agrawal's MIDAS.jl](https://github.com/ashryaagr/MIDAS.jl)\n\n## Online Coverage\n\n1. [ACM TechNews](https://technews.acm.org/archives.cfm?fo=2020-05-may/may-06-2020.html)\n1. [AIhub](https://aihub.org/2020/05/01/interview-with-siddharth-bhatia-a-new-approach-for-anomaly-detection/)\n1. [Hacker News](https://news.ycombinator.com/item?id=22802604)\n1. [KDnuggets](https://www.kdnuggets.com/2020/04/midas-new-baseline-anomaly-detection-graphs.html)\n1. [Microsoft](https://techcommunity.microsoft.com/t5/azure-sentinel/announcing-the-azure-sentinel-hackathon-winners/ba-p/1548240)\n1. [Towards Data Science](https://towardsdatascience.com/controlling-fake-news-using-graphs-and-statistics-31ed116a986f)\n\n## Citation\n\nIf you use this code for your research, please consider citing our TKDD and AAAI papers.\n\n```bibtex\n@article{bhatia2022realtime,\nauthor = {Bhatia, Siddharth and Liu, Rui and Hooi, Bryan and Yoon, Minji and Shin, Kijung and Faloutsos, Christos},\ntitle = {Real-Time Anomaly Detection in Edge Streams},\nyear = {2022},\nissue_date = {August 2022},\npublisher = {Association for Computing Machinery},\naddress = {New York, NY, USA},\nvolume = {16},\nnumber = {4},\nissn = {1556-4681},\nurl = {https://doi.org/10.1145/3494564},\ndoi = {10.1145/3494564},\njournal = {ACM Trans. Knowl. Discov. 
Data},\nmonth = {jan},\narticleno = {75},\nnumpages = {22}\n}\n\n@inproceedings{bhatia2020midas,\n    title={MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams},\n    author={Siddharth Bhatia and Bryan Hooi and Minji Yoon and Kijung Shin and Christos Faloutsos},\n    booktitle={AAAI Conference on Artificial Intelligence (AAAI)},\n    year={2020}\n}\n```\n","funding_links":[],"categories":["异常检测包","C++","others","anomaly-detection","\u003ca name=\"cpp\"\u003e\u003c/a\u003eC++"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FStream-AD%2FMIDAS","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FStream-AD%2FMIDAS","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FStream-AD%2FMIDAS/lists"}