{"id":33117676,"url":"https://github.com/phamquiluan/RCAEval","last_synced_at":"2025-11-19T20:02:03.615Z","repository":{"id":252529114,"uuid":"840137303","full_name":"phamquiluan/RCAEval","owner":"phamquiluan","description":"[ASE'24][WWW'25] RCAEval: A Benchmark for Root Cause Analysis. https://doi.org/10.1145/3691620.3695065","archived":false,"fork":false,"pushed_at":"2025-10-28T11:50:44.000Z","size":3222,"stargazers_count":73,"open_issues_count":9,"forks_count":15,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-10-28T13:08:45.071Z","etag":null,"topics":["aiops","benchmark","itbench","microservices","root-cause-analysis","site-reliability-engineering","software-engineering","telemetry-data"],"latest_commit_sha":null,"homepage":"https://dl.acm.org/doi/10.1145/3701716.3715290","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/phamquiluan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-08-09T03:49:25.000Z","updated_at":"2025-10-25T11:41:26.000Z","dependencies_parsed_at":"2024-12-12T19:31:44.075Z","dependency_job_id":"ed25fb09-af56-4242-a225-d3c007c684e5","html_url":"https://github.com/phamquiluan/RCAEval","commit_stats":null,"previous_names":["phamquiluan/rcaeval"],"tags_count":36,"template":false,"template_full_name":null,"purl":"pkg:github/phamquiluan/RCAEval","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phamquiluan%2FRCAEval","tags_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phamquiluan%2FRCAEval/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phamquiluan%2FRCAEval/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phamquiluan%2FRCAEval/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/phamquiluan","download_url":"https://codeload.github.com/phamquiluan/RCAEval/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phamquiluan%2FRCAEval/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":285319005,"owners_count":27151474,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-19T02:00:05.673Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aiops","benchmark","itbench","microservices","root-cause-analysis","site-reliability-engineering","software-engineering","telemetry-data"],"created_at":"2025-11-15T03:00:27.887Z","updated_at":"2025-11-19T20:02:03.610Z","avatar_url":"https://github.com/phamquiluan.png","language":"Jupyter Notebook","funding_links":[],"categories":["Misc","AI for *Ops"],"sub_categories":["Observability \u0026 Monitoring with AI"],"readme":"# 🕵️ RCAEval: A Benchmark for Root Cause Analysis of Microservice 
Systems\n\n[![DOI](https://zenodo.org/badge/840137303.svg)](https://doi.org/10.5281/zenodo.13294048)\n[![pypi package](https://img.shields.io/pypi/v/RCAEval.svg)](https://pypi.org/project/RCAEval)\n[![Downloads](https://static.pepy.tech/personalized-badge/rcaeval?period=total\u0026units=international_system\u0026left_color=black\u0026right_color=orange\u0026left_text=Downloads)](https://pepy.tech/project/rcaeval)\n[![CircleCI](https://dl.circleci.com/status-badge/img/gh/phamquiluan/RCAEval/tree/main.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/phamquiluan/RCAEval/tree/main)\n[![Build and test](https://github.com/phamquiluan/RCAEval/actions/workflows/build-and-test.yml/badge.svg)](https://github.com/phamquiluan/RCAEval/actions/workflows/build-and-test.yml)\n[![Upload Python Package](https://github.com/phamquiluan/RCAEval/actions/workflows/python-publish.yml/badge.svg)](https://github.com/phamquiluan/RCAEval/actions/workflows/python-publish.yml)\n\n\nRCAEval is an open-source benchmark that offers three datasets (RE1, RE2, RE3) with 735 real failure cases, and an evaluation framework for root cause analysis (RCA) in microservice systems. It includes 15 reproducible baselines covering metric-based, trace-based, and multi-source RCA methods.\n\n\n\n**_NOTE:_** The [main branch](https://github.com/phamquiluan/RCAEval/tree/main) is now under extensive development. 
Please refer to [CHANGE LOGS](https://github.com/phamquiluan/RCAEval/blob/main/README.md#change-logs) to find the code for our previous publications.\n\n\u003cp align=\"center\"\u003e\n\u003cimg width=1000 src= \"./docs/readme.jpg\"/\u003e\n\u003c/p\u003e\n\n[IEEE/ACM ASE 2024](https://dl.acm.org/doi/abs/10.1145/3691620.3695065)\n[ACM WWW 2025](https://dl.acm.org/doi/10.1145/3701716.3715290)\n\n**Table of Contents** \n  * [Prerequisites](#prerequisites)\n  * [Installation](#installation)\n  * [How-to-use](#how-to-use)\n    + [Data format](#data-format)\n    + [Basic usage example](#basic-usage-example)\n  * [Available Datasets](#available-datasets)\n  * [Available Baselines](#available-baselines)\n  * [Reproducibility](#reproducibility)\n    + [RCAEval Benchmark Paper](#rcaeval-benchmark-paper)\n    + [For ASE Paper](#for-ase-paper)\n  * [Creating New RCA Datasets or Methods](#creating-new-rca-datasets-or-methods)\n  * [Licensing](#licensing)\n  * [Acknowledgments](#acknowledgments)\n  * [Change Logs](#change-logs)\n  * [Citation](#citation)\n  * [Contact](#contact)\n\n## Prerequisites\n\nWe recommend using machines equipped with at least 8 cores, 16GB RAM, and ~50GB available disk space with Ubuntu 22.04 or Ubuntu 20.04, and **Python3.12**.\n\n## Installation\n\nThe `default` environment, which is used for most methods, can be easily installed as follows. Detailed installation instructions for all methods are in [SETUP.md](docs/SETUP.md).\n\n\nOpen your terminal and run the following commands\n\n```bash\nsudo apt update -y\nsudo apt install -y build-essential \\\n  libxml2 libxml2-dev zlib1g-dev \\\n  python3-tk graphviz\n```\n\nClone RCAEval from GitHub\n\n```bash\ngit clone https://github.com/phamquiluan/RCAEval.git \u0026\u0026 cd RCAEval\n```\n\nCreate virtual environment with Python 3.12 (refer [SETUP.md](docs/SETUP.md) to see how to install Python3.12 on Linux)\n\n```bash\npython3.12 -m venv env\n. 
env/bin/activate\n```\n\nInstall RCAEval using pip\n\n```bash\npip install -e .[default]\n```\n\nOr, install RCAEval from PyPI\n\n```bash\n# Install RCAEval from PyPI\npip install RCAEval[default]\n```\n\nTest the installation\n\n```bash\npython -m pytest tests/test.py::test_basic\n```\n\nExpected output after running the above command (it takes less than 1 minute)\n\n```bash \n$ pytest tests/test.py::test_basic\n============================== test session starts ===============================\nplatform linux -- Python 3.12.12, pytest-7.3.1, pluggy-1.0.0\nrootdir: /home/ubuntu/RCAEval\nplugins: dvc-2.57.3, hydra-core-1.3.2\ncollected 1 item                                                                 \n\ntests/test.py .                                                            [100%]\n\n=============================== 1 passed in 3.16s ================================\n```\n\n## How-to-use\n\n### Data format\n\nThe telemetry data must be presented as `pandas.DataFrame`. We require the data to have a column named `time` that stores the timestep. 
A sample of valid data can be downloaded using the `download_data()` or `download_multi_source_data()` function, as demonstrated below.\n\n### Basic usage example\n\nA basic example of using BARO, a metric-based RCA baseline, to perform RCA is presented as follows:\n\n```python\n# You can put the code here to a file named test.py\nfrom RCAEval.e2e import baro\nfrom RCAEval.utility import download_data, read_data\n\n# download sample data to data.csv\ndownload_data()\n\n# read data from data.csv\ndata = read_data(\"data.csv\")\nanomaly_detected_timestamp = 1692569339\n\n# perform root cause analysis\nroot_causes = baro(data, anomaly_detected_timestamp)[\"ranks\"]\n\n# print the top 5 root causes\nprint(\"Top 5 root causes:\", root_causes[:5])\n```\n\nExpected output after running the above code (it takes around 1 minute)\n\n```\n$ python test.py\nDownloading data.csv..: 100%|████████████████████| 570k/570k [00:00\u003c00:00, 19.8MiB/s]\nTop 5 root causes: ['emailservice_mem', 'recommendationservice_mem', 'cartservice_mem', 'checkoutservice_latency', 'cartservice_latency']\n```\n\nA tutorial on using Multi-source BARO to diagnose failures with multi-source telemetry data (metrics, logs, and traces) is presented in [docs/multi-source-rca-demo.ipynb](docs/multi-source-rca-demo.ipynb). \n\nA tutorial on using BARO to diagnose code-level faults is presented in [docs/code-level-rca.ipynb](docs/code-level-rca.ipynb).\n\n\n## Available Datasets\n\nThe RCAEval benchmark includes three datasets: RE1, RE2, and RE3, designed to comprehensively support benchmarking RCA in microservice systems. Together, our three datasets feature 735 failure cases collected from three microservice systems (Online Boutique, Sock Shop, and Train Ticket) and cover 11 fault types. Each failure case also includes an annotated root cause service and root cause indicator (e.g., the specific metric or log indicating the root cause). 
The statistics of the datasets are presented in the Table below.\n\n|   Dataset   |   Systems  |   Fault Types            |   Cases  |   Metrics  |   Logs (millions)  |   Traces (millions)  |\n|-------------|------------|--------------------------|----------|------------|--------------------|----------------------|\n|   RE1       |   3        |   3 Resource, 2 Network  |   375    |   49-212   |   N/A              |   N/A                |\n|   RE2       |   3        |   4 Resource, 2 Network  |   270    |   77-376   |   8.6-26.9         |   39.6-76.7          |\n|   RE3       |   3        |   5 Code-level           |   90     |   68-322   |   1.7-2.7          |   4.5-4.7            |\n\nOur datasets and their description are publicly available in Zenodo repository with the following information:\n- Dataset DOI: https://doi.org/10.5281/zenodo.14590730\n- Dataset URL: [https://zenodo.org/records/14590730](https://zenodo.org/records/14590730)\n\nWe also provide utility functions to download our datasets using Python. The downloaded datasets will be available at directory `data`.\n\n```python\nfrom RCAEval.utility import (\n    download_re1_dataset,\n    download_re2_dataset,\n    download_re3_dataset,\n)\n\ndownload_re1_dataset()\ndownload_re2_dataset()\ndownload_re3_dataset()\n```\n\u003cdetails\u003e\n\u003csummary\u003eExpected output after running the above code (it takes half an hour to download and extract the datasets. )\u003c/summary\u003e\n\n```\n$ python test.py\nDownloading RE1.zip..: 100%|█████████████████████| 390M/390M [01:02\u003c00:00, 6.22MiB/s]\nDownloading RE2.zip..: 100%|███████████████████| 4.21G/4.21G [11:23\u003c00:00, 6.17MiB/s]\nDownloading RE3.zip..: 100%|█████████████████████| 534M/534M [01:29\u003c00:00, 5.97MiB/s]\n```\n\u003c/details\u003e\n\n\n## Available Baselines \n\nRCAEval stores all the RCA methods in the `e2e` module (implemented in `RCAEval.e2e`). 
There are 15 RCA baselines available: RUN, CausalRCA, CIRCA, RCD, MicroCause, EasyRCA, MSCRED, BARO, 𝜖-Diagnosis, TraceRCA, MicroRank, PDiagnose, Multi-source BARO, Multi-source RCD, Multi-source CIRCA.\n\n## Reproducibility\n\n### RCAEval Benchmark Paper\n\nWe provide a script named `main.py` to assist in reproducing the results from [our RCAEval paper](https://arxiv.org/pdf/2412.17015). This script can be executed using Python with the following syntax: \n\n```\npython main.py [-h] [--dataset DATASET] [--method METHOD]\n```\n\nThe available options and their descriptions are as follows:\n\n```\noptions:\n  -h, --help            Show this help message and exit\n  --dataset DATASET     Choose a dataset. Valid options:\n                        [re2-ob, re2-ss, re2-tt, etc.]\n  --method METHOD       Choose a method (`causalrca`, `microcause`, `e_diagnosis`, `baro`, `rcd`, `circa`, etc.)\n```\n\nFor example, in Table 6, BARO achieves Avg@5 of 0.72, 0.99, 1, 0.83, 0.63, 0.64, and 0.8 for CPU, MEM, DISK, SOCKET, DELAY, LOSS, and AVERAGE on the Train Ticket dataset. To reproduce these results, you can run the following command:\n\n```bash\npython  main.py --method baro --dataset re2-tt\n```\n\nThe expected output should be exactly as presented in the paper (it takes less than 1 minute to run the code)\n\n```\n$ python  main.py --method baro --dataset re2-tt --length 20\n100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90/90 [00:45\u003c00:00,  1.98it/s]\n--- Evaluation results ---\nAvg@5-CPU:   0.72\nAvg@5-MEM:   0.99\nAvg@5-DISK:  1.0\nAvg@5-SOCKET: 0.83\nAvg@5-DELAY: 0.63\nAvg@5-LOSS:  0.64\n---\nAvg speed: 0.51\n```\n\nYou can replace the `baro` method with other methods (e.g., `circa`) and substitute `re2-tt` with other datasets to replicate the corresponding results shown in Table 6. This reproduction process is also integrated into our Continuous Integration (CI) setup. 
For more details, refer to the [.circleci/config.yml](.circleci/config.yml) file.\n\n\n\n### For ASE Paper\nWe provide a script named `main-ase.py` to assist in reproducing the results from [our ASE paper](https://dl.acm.org/doi/abs/10.1145/3691620.3695065). This script can be executed using Python with the following syntax: \n\n```\npython main-ase.py [-h] [--dataset DATASET] [--method METHOD] [--tdelta TDELTA] [--length LENGTH] [--test] \n```\n\nThe available options and their descriptions are as follows:\n\n```\noptions:\n  -h, --help            Show this help message and exit\n  --dataset DATASET     Choose a dataset. Valid options:\n                        [online-boutique, sock-shop-1, sock-shop-2, train-ticket,\n                         circa10, circa50, rcd10, rcd50, causil10, causil50]\n  --method METHOD       Choose a method (`pc_pagerank`, `pc_randomwalk`, `fci_pagerank`, `fci_randomwalk`, `granger_pagerank`, `granger_randomwalk`, `lingam_pagerank`, `lingam_randomwalk`, `ntlr_pagerank`, `ntlr_randomwalk`, `causalrca`, `causalai`, `run`, `microcause`, `e_diagnosis`, `baro`, `rcd`, `nsigma`, and `circa`)\n  --tdelta TDELTA       Specify $t_delta$ to simulate delay in anomaly detection (e.g.`--tdelta 60`)\n  --length LENGTH       Specify the length of the time series (used for RQ4)\n  --test                Perform smoke test on certain methods without fully run\n```\n\nFor example, in Table 5, BARO [ $t_\\Delta = 0$ ] achieves Avg@5 of 0.97, 1, 0.91, 0.98, and 0.67 for CPU, MEM, DISK, DELAY, and LOSS fault types on the Online Boutique dataset. 
To reproduce these results, you can run the following commands:\n\n```bash\npython main-ase.py --dataset online-boutique --method baro \n```\n\nThe expected output should be exactly as presented in the paper (it takes less than 1 minute to run the code)\n\n```\n--- Evaluation results ---\nAvg@5-CPU:   0.97\nAvg@5-MEM:   1.0\nAvg@5-DISK:  0.91\nAvg@5-DELAY: 0.98\nAvg@5-LOSS:  0.67\n---\nAvg speed: 0.07\n```\n\nAs presented in Table 5, BARO [ $t_\\Delta = 60$ ] achieves Avg@5 of 0.94, 0.99, 0.87, 0.99, and 0.6 for CPU, MEM, DISK, DELAY, and LOSS fault types on the Online Boutique dataset. To reproduce these results, you can run the following commands:\n\n```bash\npython main-ase.py --dataset online-boutique --method baro --tdelta 60\n```\n\nThe expected output should be exactly as presented in the paper (it takes less than 1 minute to run the code)\n\n```\n--- Evaluation results ---\nAvg@5-CPU:   0.94\nAvg@5-MEM:   0.99\nAvg@5-DISK:  0.87\nAvg@5-DELAY: 0.99\nAvg@5-LOSS:  0.6\n---\nAvg speed: 0.07\n```\n\nWe can replace the baro method with other methods (e.g., nsigma, fci_randomwalk) and substitute online-boutique with other datasets to replicate the corresponding results shown in Table 5. This reproduction process is also integrated into our Continuous Integration (CI) setup. For more details, refer to the [.github/workflows/reproducibility.yml](.github/workflows/reproducibility.yml) file.\n\n## Creating New RCA Datasets or Methods\n\nFor detailed guidance, refer to [EXTENDING.md](docs/EXTENDING.md).\n\n## Licensing\n\nThis repository includes code from various sources with different licenses. We have included their corresponding LICENSE into the [LICENSES](LICENSES) directory:\n\n- **BARO**: Licensed under the [MIT License](LICENSES/LICENSE-BARO). Original source: [BARO GitHub Repository](https://github.com/phamquiluan/baro/blob/main/LICENSE).\n- **CausalRCA**: No License. 
Original source: [CausalRCA GitHub Repository](https://github.com/AXinx/CausalRCA_code).\n- **CIRCA**: Licensed under the [BSD 3-Clause License](LICENSES/LICENSE-CIRCA). Original source: [CIRCA GitHub Repository](https://github.com/NetManAIOps/CIRCA/blob/master/LICENSE).\n- **E-Diagnosis**: Licensed under the [BSD 3-Clause License](LICENSES/LICENSE-E-Diagnosis). Original source: [PyRCA GitHub Repository](https://github.com/salesforce/PyRCA/blob/main/LICENSE).\n- **MicroCause**: Licensed under the [Apache License 2.0](LICENSES/LICENSE-MicroCause). Original source: [MicroCause GitHub Repository](https://github.com/PanYicheng/dycause_rca/blob/main/LICENSE).\n- **RCD**: Licensed under the [MIT License](LICENSES/LICENSE-RCD). Original source: [RCD GitHub Repository](https://github.com/azamikram/rcd).\n- **RUN**: No License. Original source: [RUN GitHub Repository](https://github.com/zmlin1998/RUN).\n\n**For the code implemented by us and for our datasets, we distribute them under the [MIT LICENSE](LICENSE)**.\n\n## Acknowledgments\n\nWe would like to express our sincere gratitude to the researchers and developers who created the baselines used in our study. Their work has been instrumental in making this project possible. We deeply appreciate the time, effort, and expertise that have gone into developing and maintaining these resources. 
This project would not have been feasible without their contributions.\n\n## Change Logs\n- [Mar 2025] The version of RCAEval used in our WWW'25 paper is available in the [www25 branch](https://github.com/phamquiluan/RCAEval/tree/www25).\n- [Dec 2024] The prior version of RCAEval used in our ASE'24 paper is available in the [ase24 branch](https://github.com/phamquiluan/RCAEval/tree/ase24).\n\n## Citation\n\n\n```bibtex\n@inproceedings{pham2025rcaeval,\n  title={RCAEval: A Benchmark for Root Cause Analysis of Microservice Systems with Telemetry Data},\n  author={Pham, Luan and Zhang, Hongyu and Ha, Huong and Salim, Flora and Zhang, Xiuzhen},\n  booktitle={Companion Proceedings of the ACM on Web Conference 2025},\n  pages={777--780},\n  year={2025}\n}\n```\n\n```bibtex\n@inproceedings{pham2024root,\n  title={Root Cause Analysis for Microservice System based on Causal Inference: How Far Are We?},\n  author={Pham, Luan and Ha, Huong and Zhang, Hongyu},\n  booktitle={Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering},\n  pages={706--715},\n  year={2024}\n}\n```\n\n```bibtex\n@article{pham2024baro,\n  title={BARO: Robust root cause analysis for microservices via multivariate bayesian online change point detection},\n  author={Pham, Luan and Ha, Huong and Zhang, Hongyu},\n  journal={Proceedings of the ACM on Software Engineering},\n  volume={1},\n  number={FSE},\n  pages={2214--2237},\n  year={2024},\n}\n```\n\n## Contact\n\n[phamquiluan\\@gmail.com](mailto:phamquiluan@gmail.com?subject=RCAEval)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphamquiluan%2FRCAEval","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fphamquiluan%2FRCAEval","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphamquiluan%2FRCAEval/lists"}