{"id":13641504,"url":"https://github.com/mithril-security/bastionlab","last_synced_at":"2025-09-11T18:43:37.155Z","repository":{"id":56718493,"uuid":"512694528","full_name":"mithril-security/bastionlab","owner":"mithril-security","description":"A simple framework for privacy-friendly data science collaboration","archived":false,"fork":false,"pushed_at":"2023-09-29T14:08:07.000Z","size":19137,"stargazers_count":172,"open_issues_count":14,"forks_count":11,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-09-08T14:28:31.122Z","etag":null,"topics":["deep-learning","eda","multi-party","privacy","pytorch","secure-enclave"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mithril-security.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-07-11T09:31:54.000Z","updated_at":"2025-08-26T00:06:01.000Z","dependencies_parsed_at":"2024-01-07T01:43:18.091Z","dependency_job_id":"0b722d7e-dc57-4878-bb23-3c16d1607af9","html_url":"https://github.com/mithril-security/bastionlab","commit_stats":null,"previous_names":["mithril-security/bastionai"],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/mithril-security/bastionlab","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mithril-security%2Fbastionlab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mithril-security%2Fbastionlab/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mithril-security%2Fbastionlab/releases","manifests_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/mithril-security%2Fbastionlab/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mithril-security","download_url":"https://codeload.github.com/mithril-security/bastionlab/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mithril-security%2Fbastionlab/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274219220,"owners_count":25243366,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-08T02:00:09.813Z","response_time":121,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","eda","multi-party","privacy","pytorch","secure-enclave"],"created_at":"2024-08-02T01:01:21.289Z","updated_at":"2025-09-11T18:43:37.079Z","avatar_url":"https://github.com/mithril-security.png","language":"Rust","funding_links":[],"categories":["Applications","Privacy and Security"],"sub_categories":["Library OSes and SDKs"],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/assets/logo.png\" alt=\"BastionLab\" width=\"200\" height=\"200\" /\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eMithril Security – BastionLab\u003c/h1\u003e\n\n\u003ch4 align=\"center\"\u003e\n  \u003ca href=\"https://www.mithrilsecurity.io\"\u003eWebsite\u003c/a\u003e |\n  \u003ca 
href=\"https://bastionlab.readthedocs.io/en/latest/\"\u003eDocumentation\u003c/a\u003e |\n  \u003ca href=\"https://discord.gg/TxEHagpWd4\"\u003eDiscord\u003c/a\u003e |\n  \u003ca href=\"https://blog.mithrilsecurity.io/\"\u003eBlog\u003c/a\u003e |\n  \u003ca href=\"https://www.linkedin.com/company/mithril-security-company\"\u003eLinkedIn\u003c/a\u003e | \n  \u003ca href=\"https://www.twitter.com/mithrilsecurity\"\u003eTwitter\u003c/a\u003e\n\u003c/h4\u003e\u003cbr\u003e\n\n# 👋 Welcome to BastionLab! \n\nWhere data owners and data scientists can securely collaborate without exposing data - opening the way to projects that were too risky to consider.\n\n## ⚙️ What is BastionLab?\n\n**BastionLab is a simple privacy framework for data science collaboration, covering data exploration and AI training.** \n\nIt acts as an **access control solution** that lets data owners protect the privacy of their datasets, and **stands as a guard** that ensures only privacy-friendly operations are allowed on the data and only anonymized outputs are shown to the data scientist. \n\n- Data owners can let external or internal data scientists **explore and extract value from their datasets, according to a strict privacy policy** they define in BastionLab.\n- Data scientists can **remotely run queries on data frames and train their models without seeing the original data or intermediate results**.\n\n**BastionLab is an open-source project.**\nOur solution is written in Rust 🦀 and uses Polars 🐻, a pandas-like library for data exploration, and Torch 🔥, a popular library for AI training.\nWe also have an option to set up confidential computing 🔒, a hardware-based technology that ensures no one but the processor of the machine can see the data or the model.\n\n## 🚀 Quick tour\n\nTry out our [Quick tour](https://bastionlab.readthedocs.io/en/latest/docs/quick-tour/quick-tour/) in the documentation to discover BastionLab with a hands-on example using the famous Titanic dataset. 
\n\nBut here’s a taste of what using BastionLab could look like 🍒\n\n### Data exploration\n\n#### Data owner's side\n```py\n# Load your dataset using polars.\n\u003e\u003e\u003e import polars as pl\n\u003e\u003e\u003e df = pl.read_csv(\"titanic.csv\")\n\n# Define a custom policy for your data.\n# In this example, requests that aggregate at least 10 rows are safe.\n# Other requests will be reviewed by the data owner.\n\u003e\u003e\u003e from bastionlab.polars.policy import Policy, Aggregation, Review\n\u003e\u003e\u003e policy = Policy(safe_zone=Aggregation(min_agg_size=10), unsafe_handling=Review())\n\n# Upload your dataset to the server.\n# Optionally anonymize sensitive columns.\n# The server returns a remote object that can be used to query the dataset.\n\u003e\u003e\u003e from bastionlab import Connection\n\u003e\u003e\u003e with Connection(\"bastionlab.example.com\") as client:\n...     rdf = client.polars.send_df(df, policy=policy, sanitized_columns=[\"Name\"])\n...     rdf\n...\nFetchableLazyFrame(identifier=3a2d15c5-9f9d-4ced-9234-d9465050edb1)\n```\n\n#### Data scientist's side\n```py\n# List the datasets made available by the data owner, select one and get a remote object.\n\u003e\u003e\u003e import polars as pl\n\u003e\u003e\u003e from bastionlab import Connection\n\u003e\u003e\u003e connection = Connection(\"localhost\")\n\u003e\u003e\u003e all_remote_dfs = connection.client.polars.list_dfs()\n\u003e\u003e\u003e remote_df = all_remote_dfs[0]\n\n# Run unsafe queries such as displaying the first five rows.\n# According to the policy, unsafe queries require the data owner's approval.\n\u003e\u003e\u003e remote_df.head(5).collect().fetch()\nWarning: non privacy-preserving queries necessitate data owner's approval.\nReason: Only 1 subrules matched but at least 2 are required.\nFailed sub rules are:\nRule #1: Cannot fetch a DataFrame that does not aggregate at least 10 rows of the initial dataframe uploaded by the data owner.\n\nA notification has been sent to the data owner. 
The request will be pending until the data owner accepts or denies it or until timeout seconds elapse.\nThe query has been accepted by the data owner.\nshape: (5, 12)\n┌─────────────┬──────────┬────────┬──────┬─────┬──────────────────┬─────────┬───────┬──────────┐\n│ PassengerId ┆ Survived ┆ Pclass ┆ Name ┆ ... ┆ Ticket           ┆ Fare    ┆ Cabin ┆ Embarked │\n│ ---         ┆ ---      ┆ ---    ┆ ---  ┆     ┆ ---              ┆ ---     ┆ ---   ┆ ---      │\n│ i64         ┆ i64      ┆ i64    ┆ str  ┆     ┆ str              ┆ f64     ┆ str   ┆ str      │\n╞═════════════╪══════════╪════════╪══════╪═════╪══════════════════╪═════════╪═══════╪══════════╡\n│ 1           ┆ 0        ┆ 3      ┆ null ┆ ... ┆ A/5 21171        ┆ 7.25    ┆ null  ┆ S        │\n├╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤\n│ 2           ┆ 1        ┆ 1      ┆ null ┆ ... ┆ PC 17599         ┆ 71.2833 ┆ C85   ┆ C        │\n├╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤\n│ 3           ┆ 1        ┆ 3      ┆ null ┆ ... ┆ STON/O2. 3101282 ┆ 7.925   ┆ null  ┆ S        │\n├╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤\n│ 4           ┆ 1        ┆ 1      ┆ null ┆ ... ┆ 113803           ┆ 53.1    ┆ C123  ┆ S        │\n├╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤\n│ 5           ┆ 0        ┆ 3      ┆ null ┆ ... ┆ 373450           ┆ 8.05    ┆ null  ┆ S        │\n└─────────────┴──────────┴────────┴──────┴─────┴──────────────────┴─────────┴───────┴──────────┘\n\n# Run safe queries and get the result right away.\n\u003e\u003e\u003e (\n... remote_df\n... .select([pl.col(\"Pclass\"), pl.col(\"Survived\")])\n... .groupby(pl.col(\"Pclass\"))\n... .agg(pl.col(\"Survived\").mean())\n... .sort(\"Survived\", reverse=True)\n... .collect()\n... .fetch()\n... 
)\nshape: (3, 2)\n┌────────┬──────────┐\n│ Pclass ┆ Survived │\n│ ---    ┆ ---      │\n│ i64    ┆ f64      │\n╞════════╪══════════╡\n│ 1      ┆ 0.62963  │\n├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤\n│ 2      ┆ 0.472826 │\n├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤\n│ 3      ┆ 0.242363 │\n└────────┴──────────┘\n```\n\n### AI training\n\n### Data owner's side\n\n```py\n\u003e\u003e\u003e from torchvision.datasets import CIFAR100\n\u003e\u003e\u003e from torchvision.transforms import ToTensor, Normalize, Compose\n\u003e\u003e\u003e from bastionlab.client import Connection\n\n# Define a transformation pipeline for the CIFAR dataset.\n# The last step is there for shape compatibility reasons.\n\u003e\u003e\u003e transform = Compose([\n...     ToTensor(),\n...     Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),\n...     lambda x: [x.squeeze(0)],\n... ])\n\n# Define train and test datasets\n\u003e\u003e\u003e train_dataset = CIFAR100(\"data\", train=True, transform=transform, download=True)\nFiles already downloaded and verified\n\u003e\u003e\u003e test_dataset = CIFAR100(\"data\", train=False, transform=transform, download=True)\nFiles already downloaded and verified\n\n# Send them to the server by instantiating a RemoteDataset.\n\u003e\u003e\u003e with Connection(\"localhost\") as client:\n...     
client.torch.RemoteDataset(train_dataset, test_dataset, name=\"CIFAR100\")\n...\nSending CIFAR100: 100%|████████████████████| 615M/615M [00:04\u003c00:00, 150MB/s]  \nSending CIFAR100 (test): 100%|████████████████████| 123M/123M [00:00\u003c00:00, 150MB/s]\n\u003cbastionlab.torch.learner.RemoteDataset object at 0x7f1220063ac0\u003e\n```\n\n### Data scientist's side\n\n```py\n\u003e\u003e\u003e from torchvision.models import efficientnet_b0\n\u003e\u003e\u003e from bastionlab.client import Connection\n\n# Define the model.\n\u003e\u003e\u003e model = efficientnet_b0()\n\n# List the datasets made available by the data owner, select one and get a remote object.\n\u003e\u003e\u003e connection = Connection(\"localhost\")\n\u003e\u003e\u003e remote_datasets = connection.client.torch.list_remote_datasets()\n\u003e\u003e\u003e remote_dataset = remote_datasets[0]\n\n# Send the model to the server by instantiating a RemoteLearner.\n# The RemoteLearner object references the RemoteDataset.\n\u003e\u003e\u003e remote_learner = connection.client.torch.RemoteLearner(\n...     model,\n...     remote_dataset,\n...     max_batch_size=64,\n...     loss=\"cross_entropy\",\n...     model_name=\"EfficientNet-B0\",\n...     device=\"cpu\",\n... )\nSending EfficientNet-B0: 100%|████████████████████| 21.7M/21.7M [00:00\u003c00:00, 531MB/s]\n\n# Train the remote model for a given number of epochs.\n\u003e\u003e\u003e remote_learner.fit(nb_epochs=1)\nEpoch 1/1 - train: 100%|████████████████████| 781/781 [04:06\u003c00:00,  3.17batch/s, cross_entropy=4.1798 (+/- 0.0000)]\n\n# Test the remote model.\n\u003e\u003e\u003e remote_learner.test(metric=\"accuracy\")\nEpoch 1/1 - test: 100%|████████████████████| 156/156 [00:14\u003c00:00, 10.62batch/s, accuracy=0.1123 (+/- 0.0000)]\n```\n\n## 🗝️ Key features\n\n- **Access control**: data owners can define an interactive privacy policy that filters the data scientists' queries. They no longer have to open unrestricted access to their datasets. 
\n- **Limited expressivity**: BastionLab limits the types of operations that data scientists can execute, to avoid arbitrary code execution.\n- **Transparent remote access**: the data scientists never access the dataset directly. They only manipulate a local object that contains metadata to interact with a remotely hosted dataset. Calls can always be seen by data owners.\n\n## 🙋 Getting help\n\n- Go to our [Discord](https://discord.com/invite/TxEHagpWd4) #support channel\n- Report bugs by [opening an issue on our BastionLab GitHub](https://github.com/mithril-security/bastionlab/issues)\n- [Book a meeting](https://calendly.com/contact-mithril-security/15mins?month=2022-11) with us\n\n## 🚨 Disclaimer\n\nBastionLab is still in development. **Do not use it in production workloads yet.** We will audit our solution in the future to attest that it meets the security standards of the market. \n\n## 📝 License\n\nBastionLab is licensed under the Apache License, Version 2.0.\n\n*Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.* \n\n*[See the License](http://www.apache.org/licenses/LICENSE-2.0) for the specific language governing permissions and limitations under the License.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmithril-security%2Fbastionlab","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmithril-security%2Fbastionlab","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmithril-security%2Fbastionlab/lists"}