{"id":24991160,"url":"https://github.com/tteofili/certa","last_synced_at":"2026-02-28T21:32:34.421Z","repository":{"id":38679019,"uuid":"333675405","full_name":"tteofili/certa","owner":"tteofili","description":"CERTA - Computing Entity Resolution explanations with TriAngles","archived":false,"fork":false,"pushed_at":"2025-01-03T06:54:18.000Z","size":28068,"stargazers_count":5,"open_issues_count":1,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-19T10:34:06.815Z","etag":null,"topics":["data-integration","entity-matching","entity-resolution","explainable-ai","machine-learning","python","record-linkage","xai"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tteofili.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-01-28T07:07:48.000Z","updated_at":"2024-12-30T22:25:46.000Z","dependencies_parsed_at":"2023-11-28T09:29:12.548Z","dependency_job_id":"2eefc8e2-def9-4a71-95bc-3851f3b87a63","html_url":"https://github.com/tteofili/certa","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tteofili/certa","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tteofili%2Fcerta","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tteofili%2Fcerta/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tteofili%2Fcerta/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/tteofili%2Fcerta/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tteofili","download_url":"https://codeload.github.com/tteofili/certa/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tteofili%2Fcerta/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29952272,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-28T18:42:55.706Z","status":"ssl_error","status_checked_at":"2026-02-28T18:42:48.811Z","response_time":90,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-integration","entity-matching","entity-resolution","explainable-ai","machine-learning","python","record-linkage","xai"],"created_at":"2025-02-04T13:47:12.017Z","updated_at":"2026-02-28T21:32:34.402Z","avatar_url":"https://github.com/tteofili.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"CERTA\n=======\n\nCode for _CERTA_ (Computing ER explanations with TriAngles), an algorithm for computing saliency and counterfactual explanations for Entity Resolution models.\n\n# Installation\n\nTo install _CERTA_ locally run :\n```shell\npip install .\n```\n\n# Usage\n\nWrap the model whose predictions need to be explained using the [ERModel](models/ermodel.py) interface.\nThe _get_model_ utility method will load 
an existing model, if available, or train a new one using the data in the provided dataset.\nFor example, for a _DeepMatcher_ model, use:\n\n```python\nfrom certa.models.utils import get_model\n\nmodel = get_model('dm', '/path/where/to/save', '/path/to/dataset', 'modelname')\n```\n\nDefine a prediction function wrapping the _model.predict()_ method.\n\n```python\ndef predict_fn(x, **kwargs):\n    return model.predict(x, **kwargs)\n```\n\nCreate a [CertaExplainer](certa/explain.py). \n_CERTA_ needs access to the data sources _lsource_ and _rsource_. \n\n```python\nimport pandas as pd\nfrom certa.explain import CertaExplainer\n\nlsource = pd.read_csv('/path/to/dataset/tableA.csv')\nrsource = pd.read_csv('/path/to/dataset/tableB.csv')\ncerta_explainer = CertaExplainer(lsource, rsource)\n```\n\nTo generate the prediction for the pair formed by the first record of each data source, do the following:\n\n```python\nimport numpy as np\nfrom certa.local_explain import get_original_prediction\n\nl_tuple = lsource.iloc[0]\nr_tuple = rsource.iloc[0]\nprediction = get_original_prediction(l_tuple, r_tuple, predict_fn)\nclass_to_explain = np.argmax(prediction)\n```\n\nTo explain the prediction using _CERTA_:\n\n```python\nsaliency, summary, cfs, triangles, lattices = certa_explainer.explain(l_tuple, r_tuple, predict_fn)\n```\n_CERTA_ returns:\n* the saliency explanation within the _saliency_ pd.DataFrame \n* a _summary_ containing the set of attributes with the highest probability of being sufficient to flip the original prediction\n* the generated counterfactual explanations within the _cfs_ pd.DataFrame \n* the list of open _triangles_ (in the form of tuples of record ids) used to generate the explanations\n\n# Examples\n\nExamples of using _CERTA_ can be found in the following notebooks:\n* [Explain DeepMatcher predictions](notebooks/sample.ipynb)\n* [Explain Ditto predictions](https://gist.github.com/tteofili/b4c81a3de6aef40e8dfa27eaf22f116d)\n\n# Citing CERTA\n\nIf you extend or use this work, 
please cite the [paper](https://arxiv.org/abs/2203.12978):\n\n```\n@article{teofili2022effective,\n  title={Effective Explanations for Entity Resolution Models},\n  author={Teofili, Tommaso and Firmani, Donatella and Koudas, Nick and Martello, Vincenzo and Merialdo, Paolo and Srivastava, Divesh},\n  journal={arXiv preprint arXiv:2203.12978},\n  year={2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftteofili%2Fcerta","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftteofili%2Fcerta","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftteofili%2Fcerta/lists"}