{"id":21905900,"url":"https://github.com/cosmian/mpc_join","last_synced_at":"2025-04-15T23:49:16.417Z","repository":{"id":51387588,"uuid":"354004915","full_name":"Cosmian/mpc_join","owner":"Cosmian","description":"CipherCompute: Blind Join for Confidential Data Science and Federated Learning using MPC","archived":false,"fork":false,"pushed_at":"2022-11-14T09:05:58.000Z","size":1425,"stargazers_count":3,"open_issues_count":0,"forks_count":2,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-29T02:51:30.136Z","etag":null,"topics":["confidential-computing","cryptography","distributed-computing","multiparty-computation"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Cosmian.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-04-02T11:56:17.000Z","updated_at":"2024-05-29T08:31:39.000Z","dependencies_parsed_at":"2022-09-09T22:50:35.747Z","dependency_job_id":null,"html_url":"https://github.com/Cosmian/mpc_join","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cosmian%2Fmpc_join","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cosmian%2Fmpc_join/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cosmian%2Fmpc_join/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cosmian%2Fmpc_join/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Cosmian","download_url":"https://codeload.github.com/Cosmian/mpc_join/tar.gz/refs/
heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249173061,"owners_count":21224481,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["confidential-computing","cryptography","distributed-computing","multiparty-computation"],"created_at":"2024-11-28T16:39:10.195Z","updated_at":"2025-04-15T23:49:16.403Z","avatar_url":"https://github.com/Cosmian.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MPC Join (Confidential Data Science)\n\nBob and Charlie have customers in common, and Alice would like to study the relationships between the amounts spent at Bob's, the category of purchases at Charlie's and the Customer Satisfaction Index recorded by both.\n\nBob and Charlie are interested in Alice's analysis results; however, neither Bob nor Charlie wants to:\n - reveal their full list of customers to the other two\n - reveal their common customer IDs to Alice\n - reveal their satisfaction index to Alice\n\nAlice is going to use [CipherCompute EAP](https://github.com/Cosmian/CipherCompute) to run a confidential computation that secretly performs an inner join (an intersection) between Bob's and Charlie's customer datasets and, for every matching customer row:\n\n - drop the customer ID before outputting the intersection to Alice\n - output the `amount spent` (from Bob's) and the `category` (from Charlie's)\n - output a composite satisfaction index calculated as: `2 x Charlie's + 3 x Bob's`\n\nFor example:\n\n    Bob Secret Input:\n        id  |  amount | sat. 
index\n     -------|---------|------------ \n      17600 |   82721 |         83 \n      21000 |   45234 |         65       \n\n    Charlie Secret Input:\n        id  |  categ. | sat. index\n     -------|---------|------------ \n      13200 |       1 |         36 \n      17600 |       3 |         97 \n\n    ==\u003e Output to Alice (match on id=17600):\n    \n     amount |  categ. | composite\n     -------|---------|---------- \n      82721 |       3 |     457 \n\n\n\nThe code in `src/main.rs` implements this confidential collaborative computation and shows how to:\n\n - perform a secret inner join in O(n) using a merge join, which assumes the datasets are sorted by ascending ID. It is very easy to change this code to perform other kinds of secret joins: left, right or full outer join.\n - perform secret arithmetic combining secret data from two different datasets with scalar values.\n\n\n## Data Science: confidential datasets preparation\n\nThis example illustrates how to solve one of the biggest pains in performing data science on confidential data from multiple sources: generating the dataset to be analyzed in the first place!\n\nIt also illustrates how CipherCompute makes it possible to secretly manipulate IDs (to perform a join here) and then anonymize the dataset by simply dropping those IDs before output.\n\nA nice differential privacy touch here would be to add some noise drawn from a Laplace distribution to Bob's amounts before running the MPC computation.\n\n\n## Hack it !\n\nThe code is heavily documented and released under the MIT license, as it is meant to be hacked for your own purposes.\n\nIt is actually very easy to generalize this code to many confidential dataset generation problems.\n\nDo not hesitate to open issues and PRs to improve the quality of this code \nand its documentation.\n\n## Editing and testing\n\nOnce you have cloned this repository locally, edit the code; \nwe recommend the free VSCode editor with the rust-analyzer extension.\n\nTo check the validity 
of your code, simply run `cargo build`. \nThe build process outputs [WASM](https://fr.wikipedia.org/wiki/WebAssembly), which\nis what is actually fed as an intermediate representation to the CipherCompute engine.\n\nTo facilitate testing without having to run [CipherCompute EAP](https://github.com/Cosmian/CipherCompute), two facilities are provided via two scripts:\n\n - `emulate.sh` will pick up the input data in the `data/inputs` directory \n  and output its results in the `data/outputs` directory. These directories contain one \n  file per player. This script performs the same emulation as the one provided in the CipherCompute UI. \n\n - `test.sh` will run the unit tests of the `main.rs` file. For a test written as \n    ```rust\n    #[test]\n    fn test_example() {\n        // An example of a successful test\n        // whose input and expected output data are located\n        // in the `fixtures/first_test` folder\n        cosmian_std::test!(\"first_test\");\n        // If you change any data in the input or output files,\n        // the test will fail\n    }\n    ```\n    The input data will be picked up from the `fixtures/first_test/inputs` directory and\n    the outputs will be **compared** to those of the `fixtures/first_test/outputs` directory.\n\n## Testing inside the CipherCompute MPC engine\n\n1. Make a change and test it using `./simulate.sh`\n2. Commit the change to the local git repository and note the commit hash\n\n3. Use the `git-daemon.sh` script to launch a git daemon, which exposes this project at\na git URL displayed by the script\n\nThen, from the CipherCompute EAP UI:\n\n4. Create/update a computation using the git URL above and the git commit you want to test\n5. 
Run the computation from the UI\n\nSee the [CipherCompute EAP](https://github.com/Cosmian/CipherCompute) Quick Start Guide\non how to use its UI to configure a computation.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcosmian%2Fmpc_join","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcosmian%2Fmpc_join","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcosmian%2Fmpc_join/lists"}