{"id":18800553,"url":"https://github.com/xtra-computing/vertibench","last_synced_at":"2025-04-13T17:31:22.845Z","repository":{"id":211697897,"uuid":"728170568","full_name":"Xtra-Computing/VertiBench","owner":"Xtra-Computing","description":"Feature partitioner by imbalance or correlation (ICLR 2024)","archived":false,"fork":false,"pushed_at":"2025-01-15T12:05:48.000Z","size":98522,"stargazers_count":16,"open_issues_count":0,"forks_count":1,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-04-10T15:23:13.765Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://vertibench.xtra.science","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Xtra-Computing.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-06T11:30:40.000Z","updated_at":"2025-04-09T10:27:52.000Z","dependencies_parsed_at":"2024-11-07T22:29:27.588Z","dependency_job_id":null,"html_url":"https://github.com/Xtra-Computing/VertiBench","commit_stats":null,"previous_names":["xtra-computing/vertibench"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xtra-Computing%2FVertiBench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xtra-Computing%2FVertiBench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xtra-Computing%2FVertiBench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xtra-Computing%2FVertiBench/manifests","owner_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Xtra-Computing","download_url":"https://codeload.github.com/Xtra-Computing/VertiBench/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248245244,"owners_count":21071441,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T22:19:01.627Z","updated_at":"2025-04-13T17:31:18.669Z","avatar_url":"https://github.com/Xtra-Computing.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# VertiBench: Vertical Federated Learning Benchmark\n\n## Introduction\n\nVertiBench is a benchmark for [federated learning](https://ieeexplore.ieee.org/abstract/document/9599369/), [split learning](https://arxiv.org/abs/1912.12115), and [assisted learning](https://proceedings.neurips.cc/paper_files/paper/2022/hash/4d6938f94ab47d32128c239a4bfedae0-Abstract-Conference.html) on vertically partitioned data. It provides tools to synthesize vertically partitioned data from a given global dataset. VertiBench supports partitioning under various levels of **imbalance** and **correlation**, simulating a wide range of real-world vertical federated learning scenarios. \n\n\n![data-dist-full.png](fig%2Fdata-dist-full.png)\n\n## Installation\n\nVertiBench is published on PyPI and requires `python\u003e=3.9`. To install VertiBench, run the following command:\n\n```bash\npip install vertibench\n```\n\n## Getting Started\n\nThis example covers the pipeline of splitting and evaluating. 
First, load your datasets or generate synthetic datasets. \n\n```python\nfrom sklearn.datasets import make_classification\n\n# Generate a synthetic classification dataset (10,000 samples, 10 features)\nX, y = make_classification(n_samples=10000, n_features=10)\n```\n\nTo split the dataset by importance:\n\n```python\nfrom vertibench.Splitter import ImportanceSplitter\n\nimp_splitter = ImportanceSplitter(num_parties=4, weights=[1, 1, 1, 3])\nXs = imp_splitter.split(X)\n```\n\nTo split the dataset by correlation:\n\n```python\nfrom vertibench.Splitter import CorrelationSplitter\n\ncorr_splitter = CorrelationSplitter(num_parties=4)\nXs = corr_splitter.fit_split(X)\n```\n\nTo evaluate a feature split `Xs` in terms of party importance:\n\n```python\nfrom vertibench.Evaluator import ImportanceEvaluator\nfrom sklearn.linear_model import LogisticRegression\nimport numpy as np\n\nmodel = LogisticRegression()\nX = np.concatenate(Xs, axis=1)\nmodel.fit(X, y)\nimp_evaluator = ImportanceEvaluator()\nimp_scores = imp_evaluator.evaluate(Xs, model.predict)\nalpha = imp_evaluator.evaluate_alpha(scores=imp_scores)\nprint(f\"Importance scores: {imp_scores}, alpha: {alpha}\")\n```\n\nTo evaluate a feature split in terms of correlation:\n\n```python\nfrom vertibench.Evaluator import CorrelationEvaluator\n\ncorr_evaluator = CorrelationEvaluator()\ncorr_scores = corr_evaluator.fit_evaluate(Xs)\nbeta = corr_evaluator.evaluate_beta()\nprint(f\"Correlation scores: {corr_scores}, beta: {beta}\")\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxtra-computing%2Fvertibench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxtra-computing%2Fvertibench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxtra-computing%2Fvertibench/lists"}