{"id":18483633,"url":"https://github.com/ethicalml/explainability-and-bias","last_synced_at":"2026-01-24T19:50:40.646Z","repository":{"id":40977506,"uuid":"170299690","full_name":"EthicalML/explainability-and-bias","owner":"EthicalML","description":null,"archived":false,"fork":false,"pushed_at":"2023-07-06T21:27:08.000Z","size":30268,"stargazers_count":101,"open_issues_count":4,"forks_count":22,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-02-16T21:33:05.399Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EthicalML.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-12T10:35:24.000Z","updated_at":"2024-10-02T04:41:12.000Z","dependencies_parsed_at":"2025-02-16T21:41:07.799Z","dependency_job_id":null,"html_url":"https://github.com/EthicalML/explainability-and-bias","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/EthicalML/explainability-and-bias","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EthicalML%2Fexplainability-and-bias","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EthicalML%2Fexplainability-and-bias/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EthicalML%2Fexplainability-and-bias/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EthicalML%2Fexplainability-and-bias/manifests","owner_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/owners/EthicalML","download_url":"https://codeload.github.com/EthicalML/explainability-and-bias/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EthicalML%2Fexplainability-and-bias/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28735371,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-24T19:23:36.361Z","status":"ssl_error","status_checked_at":"2026-01-24T19:23:28.966Z","response_time":89,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T12:36:41.666Z","updated_at":"2026-01-24T19:50:40.628Z","avatar_url":"https://github.com/EthicalML.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Awesome](images/awesome.svg)](https://github.com/sindresorhus/awesome)\n[![Maintenance](https://img.shields.io/badge/Maintained%3F-YES-green.svg)](https://GitHub.com/Naereen/StrapDown.js/graphs/commit-activity)\n![GitHub](https://img.shields.io/badge/Languages-MULTI-blue.svg)\n![GitHub](https://img.shields.io/badge/License-MIT-lightgrey.svg)\n[![GitHub](https://img.shields.io/twitter/follow/axsaucedo.svg?label=Follow)](https://twitter.com/AxSaucedo/)\n\t\n\n# A practical guide towards explainability and bias evaluation in machine learning\n\nThis repo contains the full Jupyter Notebook and code for the Python talk on machine learning 
explainability and algorithmic bias. \n\n## YouTube Video of Talk\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd width=\"30%\"\u003e\n        This \u003ca href=\"https://www.youtube.com/watch?v=vq8mDiDODhc\"\u003evideo of the talk presented at PyData London 2019\u003c/a\u003e provides an overview of the motivations for machine learning explainability as well as techniques to introduce explainability and mitigate undesired biases.\n    \u003c/td\u003e\n    \u003ctd width=\"70%\"\u003e\n        \u003ca href=\"https://www.youtube.com/watch?v=vq8mDiDODhc\"\u003e\u003cimg src=\"images/video-pydata.jpg\"\u003e\u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n## Live Slides (Reveal.JS)\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd width=\"30%\"\u003e\n         The presentation was delivered using the \u003ca href=\"https://github.com/damianavila/RISE\"\u003eRISE plugin\u003c/a\u003e to convert the Jupyter notebook into a reveal.js presentation. The reveal.js presentation is hosted live in this repo under the \u003ca href=\"https://ethicalml.github.io/explainability-and-bias/#/1\"\u003eindex.html\u003c/a\u003e page.\n    \u003c/td\u003e\n    \u003ctd width=\"70%\"\u003e\n        \u003ca href=\"https://ethicalml.github.io/explainability-and-bias/#/1\"\u003e\u003cimg src=\"images/cover.jpg\"\u003e\u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n## Examples to try it yourself\n\nCode examples to try it yourself:\n* [Data analysis for data imbalances with XAI](https://github.com/EthicalML/xai/blob/master/examples/XAI%20Tabular%20Data%20Example%20Usage.ipynb)\n* [Black box model evaluation for MNIST with Alibi](https://github.com/seldonio/seldon-core/tree/master/examples/explainers/alibi_anchor_tabular)\n* [Production monitoring with Seldon and Alibi](https://github.com/seldonio/seldon-core/tree/master/examples/explainers/alibi_anchor_tabular)\n\n## Open Source Tools used\n\nThis example uses the following 
open source libraries:\n* \u003ca href=\"https://github.com/EthicalML/XAI\"\u003eXAI\u003c/a\u003e - We use XAI to showcase data analysis techniques \n* \u003ca href=\"https://github.com/SeldonIO/Alibi\"\u003eAlibi\u003c/a\u003e - We use Alibi to dive into black box model evaluation techniques \n* \u003ca href=\"https://github.com/SeldonIO/Seldon-core\"\u003eSeldon Core\u003c/a\u003e - We use Seldon Core to deploy and serve ML models and ML explainers\n\n# Summarised version in markdown format\nThe section below contains a summarised version of the [Jupyter notebook]() / [presentation slides]() in Markdown format.\n\n## Contents\nThis section contains the code blocks that summarise the 3 steps proposed in the presentation for explainability: 1) Data analysis, 2) Model evaluation and 3) Production monitoring.\n\n# 1) Data Analysis\n\n#### Points to cover\n\n1.1) Data imbalances\n\n1.2) Upsampling / downsampling\n\n1.3) Correlations\n\n1.4) Train / test set\n\n1.5) Further techniques\n\n# XAI - eXplainable AI \n\nWe'll be using the XAI library, which is a set of tools to explain machine learning data.\n\n\u003cbr\u003e\n\n\u003cimg src=\"images/xai.jpg\" style=\"width=100vw\"\u003e\n\n\u003cbr\u003e\n\n## https://github.com/EthicalML/XAI\n\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\u003chr\u003e\n\n## Let's get the new training dataset\n\n\n```python\nX, y, X_train, X_valid, y_train, y_valid, X_display, y_display, df, df_display \\\n    = get_dataset_2()\ndf_display.head()\n```\n\n\n\n\n\u003cdiv\u003e\n\u003cstyle scoped\u003e\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n\u003c/style\u003e\n\u003ctable border=\"1\" class=\"dataframe\"\u003e\n  \u003cthead\u003e\n    \u003ctr style=\"text-align: right;\"\u003e\n      \u003cth\u003e\u003c/th\u003e\n      
\u003cth\u003eage\u003c/th\u003e\n      \u003cth\u003eworkclass\u003c/th\u003e\n      \u003cth\u003eeducation\u003c/th\u003e\n      \u003cth\u003eeducation-num\u003c/th\u003e\n      \u003cth\u003emarital-status\u003c/th\u003e\n      \u003cth\u003eoccupation\u003c/th\u003e\n      \u003cth\u003erelationship\u003c/th\u003e\n      \u003cth\u003eethnicity\u003c/th\u003e\n      \u003cth\u003egender\u003c/th\u003e\n      \u003cth\u003ecapital-gain\u003c/th\u003e\n      \u003cth\u003ecapital-loss\u003c/th\u003e\n      \u003cth\u003ehours-per-week\u003c/th\u003e\n      \u003cth\u003enative-country\u003c/th\u003e\n      \u003cth\u003eloan\u003c/th\u003e\n    \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003cth\u003e0\u003c/th\u003e\n      \u003ctd\u003e39\u003c/td\u003e\n      \u003ctd\u003eState-gov\u003c/td\u003e\n      \u003ctd\u003eBachelors\u003c/td\u003e\n      \u003ctd\u003e13\u003c/td\u003e\n      \u003ctd\u003eNever-married\u003c/td\u003e\n      \u003ctd\u003eAdm-clerical\u003c/td\u003e\n      \u003ctd\u003eNot-in-family\u003c/td\u003e\n      \u003ctd\u003eWhite\u003c/td\u003e\n      \u003ctd\u003eMale\u003c/td\u003e\n      \u003ctd\u003e2174\u003c/td\u003e\n      \u003ctd\u003e0\u003c/td\u003e\n      \u003ctd\u003e40\u003c/td\u003e\n      \u003ctd\u003eUnited-States\u003c/td\u003e\n      \u003ctd\u003eFalse\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e1\u003c/th\u003e\n      \u003ctd\u003e50\u003c/td\u003e\n      \u003ctd\u003eSelf-emp-not-inc\u003c/td\u003e\n      \u003ctd\u003eBachelors\u003c/td\u003e\n      \u003ctd\u003e13\u003c/td\u003e\n      \u003ctd\u003eMarried-civ-spouse\u003c/td\u003e\n      \u003ctd\u003eExec-managerial\u003c/td\u003e\n      \u003ctd\u003eHusband\u003c/td\u003e\n      \u003ctd\u003eWhite\u003c/td\u003e\n      \u003ctd\u003eMale\u003c/td\u003e\n      \u003ctd\u003e0\u003c/td\u003e\n      \u003ctd\u003e0\u003c/td\u003e\n      \u003ctd\u003e13\u003c/td\u003e\n      
\u003ctd\u003eUnited-States\u003c/td\u003e\n      \u003ctd\u003eFalse\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e2\u003c/th\u003e\n      \u003ctd\u003e38\u003c/td\u003e\n      \u003ctd\u003ePrivate\u003c/td\u003e\n      \u003ctd\u003eHS-grad\u003c/td\u003e\n      \u003ctd\u003e9\u003c/td\u003e\n      \u003ctd\u003eDivorced\u003c/td\u003e\n      \u003ctd\u003eHandlers-cleaners\u003c/td\u003e\n      \u003ctd\u003eNot-in-family\u003c/td\u003e\n      \u003ctd\u003eWhite\u003c/td\u003e\n      \u003ctd\u003eMale\u003c/td\u003e\n      \u003ctd\u003e0\u003c/td\u003e\n      \u003ctd\u003e0\u003c/td\u003e\n      \u003ctd\u003e40\u003c/td\u003e\n      \u003ctd\u003eUnited-States\u003c/td\u003e\n      \u003ctd\u003eFalse\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e3\u003c/th\u003e\n      \u003ctd\u003e53\u003c/td\u003e\n      \u003ctd\u003ePrivate\u003c/td\u003e\n      \u003ctd\u003e11th\u003c/td\u003e\n      \u003ctd\u003e7\u003c/td\u003e\n      \u003ctd\u003eMarried-civ-spouse\u003c/td\u003e\n      \u003ctd\u003eHandlers-cleaners\u003c/td\u003e\n      \u003ctd\u003eHusband\u003c/td\u003e\n      \u003ctd\u003eBlack\u003c/td\u003e\n      \u003ctd\u003eMale\u003c/td\u003e\n      \u003ctd\u003e0\u003c/td\u003e\n      \u003ctd\u003e0\u003c/td\u003e\n      \u003ctd\u003e40\u003c/td\u003e\n      \u003ctd\u003eUnited-States\u003c/td\u003e\n      \u003ctd\u003eFalse\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e4\u003c/th\u003e\n      \u003ctd\u003e28\u003c/td\u003e\n      \u003ctd\u003ePrivate\u003c/td\u003e\n      \u003ctd\u003eBachelors\u003c/td\u003e\n      \u003ctd\u003e13\u003c/td\u003e\n      \u003ctd\u003eMarried-civ-spouse\u003c/td\u003e\n      \u003ctd\u003eProf-specialty\u003c/td\u003e\n      \u003ctd\u003eWife\u003c/td\u003e\n      \u003ctd\u003eBlack\u003c/td\u003e\n      \u003ctd\u003eFemale\u003c/td\u003e\n      \u003ctd\u003e0\u003c/td\u003e\n      
\u003ctd\u003e0\u003c/td\u003e\n      \u003ctd\u003e40\u003c/td\u003e\n      \u003ctd\u003eCuba\u003c/td\u003e\n      \u003ctd\u003eFalse\u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\u003c/div\u003e\n\n\n\n## 1.1) Data imbalances\n#### We can visualise the imbalances by looking at the number of examples for each class\n\n\n```python\nim = xai.imbalance_plot(df_display, \"gender\", threshold=0.55, categorical_cols=[\"gender\"])\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_38_0.png)\n\n\n#### We can evaluate imbalances by the product of multiple categories\n\n\n```python\nim = xai.imbalance_plot(df_display, \"gender\", \"loan\", categorical_cols=[\"loan\", \"gender\"])\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_40_0.png)\n\n\n#### For numeric columns we can break the values down into bins\n\n\n```python\nim = xai.imbalance_plot(df_display, \"age\", bins=10)\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_42_1.png)\n\n\n## 1.2) Upsampling / Downsampling\n\n\n```python\nim = xai.balance(df_display, \"ethnicity\", \"loan\", categorical_cols=[\"ethnicity\", \"loan\"],\n                upsample=0.5, downsample=0.5, bins=5)\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_44_0.png)\n\n\n## 1.3) Correlations hidden in data\n#### We can identify potential correlations across variables through a dendrogram visualisation\n\n\n```python\ncorr = xai.correlations(df_display, include_categorical=True)\n```\n\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_46_1.png)\n\n\n## 1.4) Balanced train/testing sets\n\n\n```python\nX_train_balanced, y_train_balanced, X_valid_balanced, y_valid_balanced, train_idx, test_idx = \\\n    xai.balanced_train_test_split(\n            X, y, 
\"gender\", \n            min_per_group=300,\n            max_per_group=300,\n            categorical_cols=[\"gender\", \"loan\"])\n\nX_valid_balanced[\"loan\"] = y_valid_balanced\nim = xai.imbalance_plot(X_valid_balanced, \"gender\", \"loan\", categorical_cols=[\"gender\", \"loan\"])\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_48_0.png)\n\n\n## 1.5 Shoutout to other tools and techniques\nhttps://github.com/EthicalML/awesome-production-machine-learning#industrial-strength-visualisation-libraries\n![](images/dataviz.jpg)\n\n# 2) Model evaluation\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\u003chr\u003e\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n#### Points to cover\n\n2.1) Standard model evaluation metrics\n\n2.2) Global model explanation techniques\n\n2.3) Black box local model explanation techniques\n\n2.4) Other libraries available\n\n# Alibi - Black Box Model Explanations\n\n\u003cbr\u003e\n\n## A set of proven scientific techniques to explain ML models as black boxes\n\n\u003cbr\u003e\n\n\u003cimg src=\"images/alibi-repo-new.jpg\" style=\"width=100vw\"\u003e\n\n\u003cbr\u003e\n\n## https://github.com/SeldonIO/Alibi\n\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\u003chr\u003e\n\n# Model Evaluation Metrics: White / Black Box\n\n![](images/whiteblackbox.jpg)\n\n# Model Evaluation Metrics: Global vs Local\n\n![](images/globallocal.jpg)\n\n## 2.1) Standard model evaluation metrics\n\n\n```python\n# Let's start by building our model with our newly balanced dataset\nmodel = build_model(X)\nmodel.fit(f_in(X_train), y_train, epochs=20, batch_size=512, shuffle=True, validation_data=(f_in(X_valid), y_valid), callbacks=[PlotLossesKeras()], verbose=0, validation_split=0.05,)\nprobabilities = model.predict(f_in(X_valid))\npred = f_out(probabilities)\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_56_0.png)\n\n\n    Log-loss (cost function):\n 
   training   (min:    0.311, max:    0.581, cur:    0.311)\n    validation (min:    0.312, max:    0.464, cur:    0.312)\n    \n    Accuracy:\n    training   (min:    0.724, max:    0.856, cur:    0.856)\n    validation (min:    0.808, max:    0.857, cur:    0.857)\n\n\n\n```python\nxai.confusion_matrix_plot(y_valid, pred)\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_57_0.png)\n\n\n\n```python\nim = xai.roc_plot(y_valid, pred)\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_58_0.png)\n\n\n\n```python\nim = xai.roc_plot(y_valid, pred, df=X_valid, cross_cols=[\"gender\"], categorical_cols=[\"gender\"])\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_59_0.png)\n\n\n\n```python\nim = xai.metrics_plot(y_valid, pred)\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_60_0.png)\n\n\n\n```python\nim = xai.metrics_plot(y_valid, pred, df=X_valid, cross_cols=[\"gender\"], categorical_cols=[\"gender\"])\n```\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_61_0.png)\n\n\n# 2.2) Global black box model evaluation metrics\n\n\n```python\nimp = xai.feature_importance(X_valid, y_valid, lambda x, y: model.evaluate(f_in(x), y, verbose=0)[1], repeat=1)\n```\n\n\n\n\n![png](Bias%20Evaluation%20%26%20Explainability_files/Bias%20Evaluation%20%26%20Explainability_63_1.png)\n\n\n# 2.3) Local black box model evaluation metrics\n### Overview of methods\n\n![](images/alibi-table.jpg)\n\n# Anchors \n\n\u003cbr\u003e\n\n#### Anchor explanations consist of if-then rules, called anchors, which sufficiently guarantee the explanation locally and try to maximize the area for which the explanation holds. 
(arXiv: Anchors: High-Precision Model-Agnostic Explanations)\n\n\u003cbr\u003e\n\n\u003cdiv style=\"float: left; width: 50%\"\u003e\n\u003cimg src=\"images/textanchor.jpg\"\u003e\n\u003c/div\u003e\n\n\u003cdiv style=\"float: left; width: 50%\"\u003e\n\u003cimg src=\"images/anchorimage.jpg\"\u003e\n\u003c/div\u003e\n\n\n```python\nfrom alibi.explainers import AnchorTabular\n\nexplainer = AnchorTabular(\n    loan_model_alibi.predict, \n    feature_names_alibi, \n    categorical_names=category_map_alibi)\n\nexplainer.fit(\n    X_train_alibi, \n    disc_perc=[25, 50, 75])\n\nprint(\"Explainer built\")\n```\n\n    Explainer built\n\n\n\n```python\nX_test_alibi[:1]\n```\n\n\n\n\n    array([[52,  4,  0,  2,  8,  4,  2,  0,  0,  0, 60,  9]])\n\n\n\n\n```python\nexplanation = explainer.explain(X_test_alibi[:1], threshold=0.95)\n\nprint('Anchor: %s' % (' AND '.join(explanation['names'])))\nprint('Precision: %.2f' % explanation['precision'])\nprint('Coverage: %.2f' % explanation['coverage'])\n```\n\n    Anchor: Marital Status = Separated AND Sex = Female AND Capital Gain \u003c= 0.00\n    Precision: 0.97\n    Coverage: 0.10\n\n\n# Counterfactual Explanations\n\n### The counterfactual explanation of an outcome or a situation Y takes the form “If X had not occurred, Y would not have occurred” \n\n![](images/counterfactuals7.jpg)\n\n## 2.4 Shoutout to other tools and techniques\nhttps://github.com/EthicalML/awesome-production-machine-learning#explaining-black-box-models-and-datasets\n![](images/modevallibs.jpg)\n\n# 3) Production Monitoring\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\u003chr\u003e\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n#### Key points to cover\n\n\u003cbr\u003e\n\n1) Design patterns for explainers\n\n\u003cbr\u003e\n\n2) Live demo of explainers\n\n\u003cbr\u003e\n\n3) Leveraging humans for explainers\n\n# Seldon Core - Production ML in K8s\n\n\u003cbr\u003e\n\n## A language agnostic ML serving \u0026 monitoring framework in 
Kubernetes\n\n\u003cbr\u003e\n\n\u003cimg src=\"images/seldon-core-repo.jpg\" style=\"width=100vw\"\u003e\n\n\u003cbr\u003e\n\n## https://github.com/SeldonIO/seldon-core\n\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\u003chr\u003e\n\n# 3.1) Design patterns for explainers\n\n![](images/deployment-overview.jpg)\n\n#### Setup Seldon in your kubernetes cluster\n\n\n```bash\n%%bash\nkubectl create clusterrolebinding kube-system-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default\nhelm init\nkubectl rollout status deploy/tiller-deploy -n kube-system\nhelm install seldon-core-operator --name seldon-core-operator --repo https://storage.googleapis.com/seldon-charts\nhelm install seldon-core-analytics --name seldon-core-analytics --repo https://storage.googleapis.com/seldon-charts\nhelm install stable/ambassador --name ambassador\n```\n\n\n```python\nfrom sklearn.preprocessing import LabelEncoder, StandardScaler, OneHotEncoder\nfrom sklearn.impute import SimpleImputer\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.compose import ColumnTransformer\n\n# feature transformation pipeline\nordinal_features = [x for x in range(len(alibi_feature_names)) if x not in list(alibi_category_map.keys())]\nordinal_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),\n                                      ('scaler', StandardScaler())])\n\ncategorical_features = list(alibi_category_map.keys())\ncategorical_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),\n                                          ('onehot', OneHotEncoder(handle_unknown='ignore'))])\n\npreprocessor = ColumnTransformer(transformers=[('num', ordinal_transformer, ordinal_features),\n                                               ('cat', categorical_transformer, categorical_features)])\n```\n\n\n```python\npreprocessor.fit(alibi_data)\n```\n\n\n```python\nfrom sklearn.ensemble import RandomForestClassifier\n\nnp.random.seed(0)\nclf = 
RandomForestClassifier(n_estimators=50)\nclf.fit(preprocessor.transform(X_train_alibi), y_train_alibi)\n```\n\n\n```python\n!mkdir -p pipeline/pipeline_steps/loanclassifier/\n```\n\n#### Save the model artefacts so we can deploy them\n\n\n```python\nimport dill\n\nwith open(\"pipeline/pipeline_steps/loanclassifier/preprocessor.dill\", \"wb\") as prep_f:\n    dill.dump(preprocessor, prep_f)\n    \nwith open(\"pipeline/pipeline_steps/loanclassifier/model.dill\", \"wb\") as model_f:\n    dill.dump(clf, model_f)\n```\n\n#### Build a Model wrapper that uses the trained models through a predict function\n\n\n```python\n%%writefile pipeline/pipeline_steps/loanclassifier/Model.py\nimport dill\n\nclass Model:\n    def __init__(self, *args, **kwargs):\n        \n        with open(\"preprocessor.dill\", \"rb\") as prep_f:\n            self.preprocessor = dill.load(prep_f)\n        with open(\"model.dill\", \"rb\") as model_f:\n            self.clf = dill.load(model_f)\n        \n    def predict(self, X, feature_names=[]):\n        X_prep = self.preprocessor.transform(X)\n        proba = self.clf.predict_proba(X_prep)\n        return proba\n```\n\n#### Add the dependencies for the wrapper to work\n\n\n```python\n%%writefile pipeline/pipeline_steps/loanclassifier/requirements.txt\nscikit-learn==0.20.1\ndill==0.2.9\nscikit-image==0.15.0\nscipy==1.1.0\nnumpy==1.15.4\n```\n\n\n```python\n!mkdir pipeline/pipeline_steps/loanclassifier/.s2i\n```\n\n\n```python\n%%writefile pipeline/pipeline_steps/loanclassifier/.s2i/environment\nMODEL_NAME=Model\nAPI_TYPE=REST\nSERVICE_TYPE=MODEL\nPERSISTENCE=0\n```\n\n#### Use the source-to-image (s2i) command to containerize code\n\n\n```python\n!s2i build pipeline/pipeline_steps/loanclassifier seldonio/seldon-core-s2i-python3:0.8 loanclassifier:0.1\n```\n\n#### Define the graph of your pipeline with individual models\n\n\n```python\n%%writefile pipeline/pipeline_steps/loanclassifier/loanclassifiermodel.yaml\napiVersion: 
machinelearning.seldon.io/v1alpha2\nkind: SeldonDeployment\nmetadata:\n  labels:\n    app: seldon\n  name: loanclassifier\nspec:\n  name: loanclassifier\n  predictors:\n  - componentSpecs:\n    - spec:\n        containers:\n        - image: loanclassifier:0.1\n          name: model\n    graph:\n      children: []\n      name: model\n      type: MODEL\n      endpoint:\n        type: REST\n    name: loanclassifier\n    replicas: 1\n```\n\n#### Deploy your model!\n\n\n```python\n!kubectl apply -f pipeline/pipeline_steps/loanclassifier/loanclassifiermodel.yaml\n```\n\n#### Now we can send data through the REST API\n\n\n```python\nX_test_alibi[:1]\n```\n\n\n\n\n    array([[52,  4,  0,  2,  8,  4,  2,  0,  0,  0, 60,  9]])\n\n\n\n\n```bash\n%%bash\ncurl -X POST -H 'Content-Type: application/json' \\\n    -d \"{'data': {'names': ['text'], 'ndarray': [[52,  4,  0,  2,  8,  4,  2,  0,  0,  0, 60,  9]]}}\" \\\n    http://localhost:80/seldon/default/loanclassifier/api/v0.1/predictions\n```\n\n    {\n      \"meta\": {\n        \"puid\": \"96cmdkc4k1c6oassvpnpasqbgf\",\n        \"tags\": {\n        },\n        \"routing\": {\n        },\n        \"requestPath\": {\n          \"model\": \"loanclassifier:0.1\"\n        },\n        \"metrics\": []\n      },\n      \"data\": {\n        \"names\": [\"t:0\", \"t:1\"],\n        \"ndarray\": [[0.86, 0.14]]\n      }\n    }\n\n      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                     Dload  Upload   Total   Spent    Left  Speed\n    100   356  100   264  100    92  11000   3833 --:--:-- --:--:-- --:--:-- 15478\n\n\n#### We can also reach it with the Python Client\n\n\n```python\nfrom seldon_core.seldon_client import SeldonClient\n\nbatch = X_test_alibi[:1]\n\nsc = SeldonClient(\n    gateway=\"ambassador\", \n    gateway_endpoint=\"localhost:80\",\n    deployment_name=\"loanclassifier\",\n    payload_type=\"ndarray\",\n    namespace=\"default\",\n    
transport=\"rest\")\n\nclient_prediction = sc.predict(data=batch)\n\nprint(client_prediction.response)\n```\n\n    meta {\n      puid: \"hv4dnmr8m3ckgrhtnc48rs7mjg\"\n      requestPath {\n        key: \"model\"\n        value: \"loanclassifier:0.1\"\n      }\n    }\n    data {\n      names: \"t:0\"\n      names: \"t:1\"\n      ndarray {\n        values {\n          list_value {\n            values {\n              number_value: 0.86\n            }\n            values {\n              number_value: 0.14\n            }\n          }\n        }\n      }\n    }\n    \n\n\n#### Now we can create an explainer for our model\n\n\n```python\nfrom alibi.explainers import AnchorTabular\n\npredict_fn = lambda x: clf.predict(preprocessor.transform(x))\nexplainer = AnchorTabular(predict_fn, alibi_feature_names, categorical_names=alibi_category_map)\nexplainer.fit(X_train_alibi, disc_perc=[25, 50, 75])\n\nexplanation = explainer.explain(X_test_alibi[0], threshold=0.95)\n\nprint('Anchor: %s' % (' AND '.join(explanation['names'])))\nprint('Precision: %.2f' % explanation['precision'])\nprint('Coverage: %.2f' % explanation['coverage'])\n```\n\n    Anchor: Marital Status = Separated AND Sex = Female AND Capital Gain \u003c= 0.00\n    Precision: 0.97\n    Coverage: 0.10\n\n\n\n```python\n\ndef predict_remote_fn(X):\n    from seldon_core.seldon_client import SeldonClient\n    from seldon_core.utils import get_data_from_proto\n    \n    kwargs = {\n        \"gateway\": \"ambassador\", \n        \"deployment_name\": \"loanclassifier\",\n        \"payload_type\": \"ndarray\",\n        \"namespace\": \"default\",\n        \"transport\": \"rest\"\n    }\n    \n    try:\n        kwargs[\"gateway_endpoint\"] = \"localhost:80\"\n        sc = SeldonClient(**kwargs)\n        prediction = sc.predict(data=X)\n    except:\n        # If we are inside the container, we need to reach the ambassador service directly\n        kwargs[\"gateway_endpoint\"] = \"ambassador:80\"\n        sc = 
SeldonClient(**kwargs)\n        prediction = sc.predict(data=X)\n    \n    y = get_data_from_proto(prediction.response)\n    return y\n\n```\n\n#### But now we can use the remote model we have in production\n\n\n```python\n# Summary of the predict_remote_fn\ndef predict_remote_fn(X):\n    ....\n    sc = SeldonClient(...)\n    prediction = sc.predict(data=X)\n    y = get_data_from_proto(prediction.response)\n    return y\n```\n\n#### And train our explainer to use the remote function\n\n\n```python\nfrom seldon_core.utils import get_data_from_proto\n\nexplainer = AnchorTabular(predict_remote_fn, alibi_feature_names, categorical_names=alibi_category_map)\nexplainer.fit(X_train_alibi, disc_perc=[25, 50, 75])\n\nexplanation = explainer.explain(X_test_alibi[idx], threshold=0.95)\n\nprint('Anchor: %s' % (' AND '.join(explanation['names'])))\nprint('Precision: %.2f' % explanation['precision'])\nprint('Coverage: %.2f' % explanation['coverage'])\n```\n\n    Anchor: Marital Status = Separated AND Sex = Female\n    Precision: 0.97\n    Coverage: 0.11\n\n\n#### To containerise our explainer, save the trained binary\n\n\n\n```python\nimport dill\n\nwith open(\"pipeline/pipeline_steps/loanclassifier-explainer/explainer.dill\", \"wb\") as x_f:\n    dill.dump(explainer, x_f)\n```\n\n#### Expose it through a wrapper\n\n\n```python\n%%writefile pipeline/pipeline_steps/loanclassifier-explainer/Explainer.py\nimport dill\nimport json\nimport numpy as np\n\nclass Explainer:\n    def __init__(self, *args, **kwargs):\n        \n        with open(\"explainer.dill\", \"rb\") as x_f:\n            self.explainer = dill.load(x_f)\n        \n    def predict(self, X, feature_names=[]):\n        print(\"Received: \" + str(X))\n        explanation = self.explainer.explain(X)\n        print(\"Predicted: \" + str(explanation))\n        return json.dumps(explanation, cls=NumpyEncoder)\n\n    \n    \n    \nclass NumpyEncoder(json.JSONEncoder):\n    def default(self, obj):\n        if isinstance(obj, 
(\n        np.int_, np.intc, np.intp, np.int8, np.int16, np.int32, np.int64, np.uint8, np.uint16, np.uint32, np.uint64)):\n            return int(obj)\n        elif isinstance(obj, (np.float_, np.float16, np.float32, np.float64)):\n            return float(obj)\n        elif isinstance(obj, (np.ndarray,)):\n            return obj.tolist()\n        return json.JSONEncoder.default(self, obj)\n```\n\n#### Add config files to build image with script\n\n\n```python\n!s2i build pipeline/pipeline_steps/loanclassifier-explainer seldonio/seldon-core-s2i-python3:0.8 loanclassifier-explainer:0.1\n```\n\n\n```python\n!mkdir -p pipeline/pipeline_steps/loanclassifier-explainer\n```\n\n\n```python\n%%writefile pipeline/pipeline_steps/loanclassifier-explainer/loanclassifiermodel-explainer.yaml\napiVersion: machinelearning.seldon.io/v1alpha2\nkind: SeldonDeployment\nmetadata:\n  labels:\n    app: seldon\n  name: loanclassifier-explainer\nspec:\n  name: loanclassifier-explainer\n  predictors:\n  - componentSpecs:\n    - spec:\n        containers:\n        - image: loanclassifier-explainer:0.1\n          name: model-explainer\n    graph:\n      children: []\n      name: model-explainer\n      type: MODEL\n      endpoint:\n        type: REST\n    name: loanclassifier-explainer\n    replicas: 1\n```\n\n#### Deploy your remote explainer\n\n\n```python\n!kubectl apply -f pipeline/pipeline_steps/loanclassifier-explainer/loanclassifiermodel-explainer.yaml\n```\n\n#### Now we can request explanations through the REST API\n\n\n```bash\n%%bash\ncurl -X POST -H 'Content-Type: application/json' \\\n    -d \"{'data': {'names': ['text'], 'ndarray': [[52,  4,  0,  2,  8,  4,  2,  0,  0,  0, 60, 9]] }}\" \\\n    http://localhost:80/seldon/default/loanclassifier-explainer/api/v0.1/predictions\n```\n\n    {\n      \"meta\": {\n        \"puid\": \"ohbll5bcpu9gg7jjj1unll4155\",\n        \"tags\": {\n        },\n        \"routing\": {\n        },\n        \"requestPath\": {\n          
\"model-explainer\": \"loanclassifier-explainer:0.1\"\n        },\n        \"metrics\": []\n      },\n      \"strData\": \"{\\\"names\\\": [\\\"Marital Status = Separated\\\", \\\"Sex = Female\\\"], \\\"precision\\\": 0.9629629629629629, \\\"coverage\\\": 0.1078, \\\"raw\\\": {\\\"feature\\\": [3, 7], \\\"mean\\\": [0.9002808988764045, 0.9629629629629629], \\\"precision\\\": [0.9002808988764045, 0.9629629629629629], \\\"coverage\\\": [0.1821, 0.1078], \\\"examples\\\": [{\\\"covered\\\": [[46, 4, 4, 2, 2, 1, 4, 1, 0, 0, 45, 9], [24, 4, 1, 2, 6, 3, 2, 1, 0, 0, 40, 9], [39, 4, 4, 2, 4, 1, 4, 1, 4650, 0, 44, 9], [40, 4, 0, 2, 5, 4, 4, 0, 0, 0, 32, 9], [39, 4, 1, 2, 8, 0, 4, 1, 3103, 0, 50, 9], [45, 4, 1, 2, 6, 5, 4, 0, 0, 0, 42, 9], [41, 4, 1, 2, 5, 1, 4, 1, 0, 0, 40, 9], [40, 4, 4, 2, 2, 0, 4, 1, 0, 0, 40, 9], [58, 4, 3, 2, 2, 2, 4, 0, 0, 0, 45, 5], [23, 4, 1, 2, 5, 1, 4, 1, 0, 0, 50, 9]], \\\"covered_true\\\": [[33, 4, 4, 2, 2, 0, 4, 1, 0, 0, 40, 9], [70, 0, 4, 2, 0, 0, 4, 1, 0, 0, 10, 9], [66, 0, 4, 2, 0, 0, 4, 1, 0, 0, 30, 9], [37, 1, 1, 2, 8, 2, 4, 0, 0, 0, 50, 9], [32, 4, 5, 2, 6, 5, 4, 0, 0, 0, 45, 9], [24, 4, 4, 2, 7, 1, 4, 1, 0, 0, 40, 9], [46, 7, 6, 2, 5, 1, 4, 0, 0, 1564, 55, 9], [28, 4, 4, 2, 2, 3, 4, 0, 0, 0, 40, 9], [28, 4, 4, 2, 2, 0, 4, 1, 3411, 0, 40, 9], [45, 4, 0, 2, 2, 0, 4, 1, 0, 0, 40, 9]], \\\"covered_false\\\": [[51, 4, 6, 2, 5, 1, 4, 0, 0, 2559, 50, 9], [35, 4, 1, 2, 5, 0, 4, 1, 0, 0, 48, 9], [48, 4, 5, 2, 5, 0, 4, 1, 0, 0, 40, 9], [41, 4, 5, 2, 8, 0, 4, 1, 0, 1977, 65, 9], [51, 6, 5, 2, 8, 4, 4, 1, 25236, 0, 50, 9], [46, 4, 4, 2, 2, 0, 4, 1, 0, 0, 75, 9], [52, 6, 1, 2, 1, 5, 4, 0, 99999, 0, 30, 9], [55, 2, 5, 2, 8, 0, 4, 1, 0, 0, 55, 9], [46, 4, 3, 2, 5, 4, 0, 1, 0, 0, 40, 9], [39, 4, 6, 2, 8, 5, 4, 0, 15024, 0, 47, 9]], \\\"uncovered_true\\\": [], \\\"uncovered_false\\\": []}, {\\\"covered\\\": [[52, 4, 4, 2, 1, 4, 4, 0, 0, 1741, 38, 9], [38, 4, 4, 2, 1, 3, 4, 0, 0, 0, 40, 9], [53, 4, 5, 2, 5, 4, 4, 0, 0, 1876, 38, 9], [54, 4, 4, 2, 8, 1, 4, 
0, 0, 0, 43, 9], [43, 2, 1, 2, 5, 4, 4, 0, 0, 625, 40, 9], [27, 1, 4, 2, 8, 4, 2, 0, 0, 0, 40, 9], [47, 4, 4, 2, 1, 1, 4, 0, 0, 0, 35, 9], [54, 4, 4, 2, 8, 4, 4, 0, 0, 0, 40, 3], [43, 4, 4, 2, 8, 1, 4, 0, 0, 0, 50, 9], [53, 4, 4, 2, 5, 1, 4, 0, 0, 0, 40, 9]], \\\"covered_true\\\": [[54, 4, 4, 2, 8, 4, 4, 0, 0, 0, 40, 3], [41, 4, 4, 2, 1, 4, 4, 0, 0, 0, 40, 9], [58, 4, 4, 2, 1, 1, 4, 0, 0, 0, 40, 9], [36, 4, 4, 2, 6, 1, 4, 0, 3325, 0, 45, 9], [29, 4, 0, 2, 1, 1, 4, 0, 0, 0, 40, 9], [35, 4, 4, 2, 8, 4, 4, 0, 0, 0, 40, 9], [39, 4, 4, 2, 7, 1, 4, 0, 0, 0, 40, 8], [42, 4, 4, 2, 1, 4, 2, 0, 0, 0, 41, 9], [37, 7, 4, 2, 7, 3, 4, 0, 0, 0, 40, 9], [47, 4, 4, 2, 1, 1, 4, 0, 0, 0, 38, 9]], \\\"covered_false\\\": [[55, 5, 4, 2, 6, 4, 4, 0, 0, 0, 50, 9], [33, 7, 2, 2, 5, 5, 4, 0, 0, 0, 48, 9], [39, 4, 6, 2, 8, 5, 4, 0, 15024, 0, 47, 9], [48, 4, 5, 2, 8, 4, 4, 0, 0, 0, 40, 9], [41, 4, 1, 2, 5, 1, 4, 0, 0, 0, 50, 9], [42, 1, 5, 2, 8, 1, 4, 0, 14084, 0, 60, 9], [51, 4, 6, 2, 5, 1, 4, 0, 0, 2559, 50, 9], [52, 6, 1, 2, 1, 5, 4, 0, 99999, 0, 30, 9], [39, 7, 2, 2, 5, 1, 4, 0, 0, 0, 40, 9]], \\\"uncovered_true\\\": [], \\\"uncovered_false\\\": []}], \\\"all_precision\\\": 0, \\\"num_preds\\\": 1000101, \\\"names\\\": [\\\"Marital Status = Separated\\\", \\\"Sex = Female\\\"], \\\"instance\\\": [[52.0, 4.0, 0.0, 2.0, 8.0, 4.0, 2.0, 0.0, 0.0, 0.0, 60.0, 9.0]], \\\"prediction\\\": 0}}\"\n    }\n\n      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                     Dload  Upload   Total   Spent    Left  Speed\n    100  3464  100  3372  100    92   3318     90  0:00:01  0:00:01 --:--:--  3409\n\n\n### Now we have an explainer deployed!\n\n![](images/deployment-overview.jpg)\n\n# Visualise metrics and explanations \n\n![](images/deploy-expl.jpg)\n\n# Leveraging Humans for Explanations\n\n\n\n![](images/smile1.jpg)\n\n![](images/smile2.jpg)\n\n![](images/smile3.jpg)\n\n# Revisiting our workflow\n\n\u003cimg src=\"images/gml.png\" 
style=\"width: 100vw\"\u003e\n\n# Explainability and Bias Evaluation\n\n\u003cbr\u003e\n\u003cbr\u003e\n\u003cbr\u003e\n\u003cbr\u003e\n\u003cbr\u003e\n\u003cbr\u003e\n\n## Alejandro Saucedo\n\u003cbr\u003e\nChief Scientist, The Institute for Ethical AI \u0026 Machine Learning\nDirector of ML Engineering, Seldon Technologies\n\n\u003cbr\u003e\n\u003cbr\u003e\n\n[github.com/ethicalml/explainability-and-bias](https://github.com/ethicalml/explainability-and-bias)\n\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\u003chr\u003e\n\n\u003cbr\u003e\u003cbr\u003e\u003cbr\u003e\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fethicalml%2Fexplainability-and-bias","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fethicalml%2Fexplainability-and-bias","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fethicalml%2Fexplainability-and-bias/lists"}