{"id":13671183,"url":"https://github.com/suinleelab/path_explain","last_synced_at":"2025-12-26T01:42:49.273Z","repository":{"id":45245728,"uuid":"239634233","full_name":"suinleelab/path_explain","owner":"suinleelab","description":"A repository for explaining feature attributions and feature interactions in deep neural networks.","archived":false,"fork":false,"pushed_at":"2022-01-16T22:08:55.000Z","size":214257,"stargazers_count":187,"open_issues_count":7,"forks_count":29,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-04-20T13:56:11.630Z","etag":null,"topics":["explainable-ai","interpretable-deep-learning","machine-learning","pytorch","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/suinleelab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-02-10T23:21:15.000Z","updated_at":"2025-03-21T20:23:28.000Z","dependencies_parsed_at":"2022-09-26T17:31:25.876Z","dependency_job_id":null,"html_url":"https://github.com/suinleelab/path_explain","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/suinleelab%2Fpath_explain","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/suinleelab%2Fpath_explain/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/suinleelab%2Fpath_explain/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/suinleelab%2Fpath_explain/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners
/suinleelab","download_url":"https://codeload.github.com/suinleelab/path_explain/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251154374,"owners_count":21544490,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["explainable-ai","interpretable-deep-learning","machine-learning","pytorch","tensorflow"],"created_at":"2024-08-02T09:01:02.230Z","updated_at":"2025-12-26T01:42:49.268Z","avatar_url":"https://github.com/suinleelab.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook","Python Libraries(sort in alphabeta order)"],"sub_categories":["Evaluation methods"],"readme":"# Path Explain\n\nA repository for explaining feature importances and feature interactions in deep neural networks using path attribution methods.\n\nThis repository contains tools to interpret and explain machine learning models using [Integrated Gradients](https://arxiv.org/abs/1703.01365) and [Expected Gradients](https://arxiv.org/abs/1906.10670). In addition, it contains code to explain _interactions_ in deep networks using Integrated Hessians and Expected Hessians - methods that we introduced in our most recent paper: [\"Explaining Explanations: Axiomatic Feature Interactions for Deep Networks\"](https://www.jmlr.org/papers/v22/20-1223.html). If you use our work to explain your networks, please cite this paper.\n\n```\n@article{janizek2020explaining,\n  author  = {Joseph D. 
Janizek and Pascal Sturmfels and Su-In Lee},\n  title   = {Explaining Explanations: Axiomatic Feature Interactions for Deep Networks},\n  journal = {Journal of Machine Learning Research},\n  year    = {2021},\n  volume  = {22},\n  number  = {104},\n  pages   = {1-54},\n  url     = {http://jmlr.org/papers/v22/20-1223.html}\n}\n```\n\nThis repository contains two important directories: the `path_explain` directory, which contains the packages used to interpret and explain machine learning models, and the `examples` directory, which contains many examples using the `path_explain` module to explain different models on different data types.\n\n## Installation\n\nThe easiest way to install this package is by using pip:\n```\npip install path-explain\n```\nAlternatively, you can clone this repository to re-run and explore the examples provided.\n\n## Compatibility\nThis package was written to support TensorFlow 2.0 (in eager execution mode) with Python 3. We have no current plans to support earlier versions of TensorFlow or Python.\n\n## API\nAlthough we don't yet have formal API documentation, the underlying code does a pretty good job at explaining the API. See the code for generating [attributions](https://github.com/suinleelab/path_explain/blob/master/path_explain/explainers/path_explainer_tf.py#L302) and [interactions](https://github.com/suinleelab/path_explain/blob/master/path_explain/explainers/path_explainer_tf.py#L445) to better understand what the arguments to these functions mean.\n\n## Examples\n\nFor a simple, quick example to get started using this repository, see the `example_usage.ipynb` notebook in the top-level directory of this repository. It gives an overview of the functionality provided by this repository. 
For more advanced examples, keep reading.\n\n### Tabular Data using Expected Gradients and Expected Hessians\n\nOur repository can easily be adapted to explain attributions and interactions learned on tabular data.\n```python\n# other import statements...\nfrom path_explain import PathExplainerTF, scatter_plot, summary_plot\n\n### Code to train a model would go here\nx_train, y_train, x_test, y_test = dataset()\nmodel = ...\nmodel.fit(x_train, y_train, ...)\n###\n\n### Generating attributions using expected gradients\nexplainer = PathExplainerTF(model)\nattributions = explainer.attributions(inputs=x_test,\n                                      baseline=x_train,\n                                      batch_size=100,\n                                      num_samples=200,\n                                      use_expectation=True,\n                                      output_indices=0)\n###\n\n### Generating interactions using expected hessians\ninteractions = explainer.interactions(inputs=x_test,\n                                      baseline=x_train,\n                                      batch_size=100,\n                                      num_samples=200,\n                                      use_expectation=True,\n                                      output_indices=0)\n###\n```\n\nOnce we've generated attributions and interactions, we can use the provided plotting modules to help visualize them. 
First we plot a summary of the top features and their attribution values:\n```python\n### First we need a list of strings denoting the name of each feature\nfeature_names = ...\n###\n\nsummary_plot(attributions=attributions,\n             feature_values=x_test,\n             feature_names=feature_names,\n             plot_top_k=10)\n```\n![Heart Disease Summary Plot](/images/heart_disease.png)\n\nSecond, we plot an interaction our model has learned between maximum achieved heart rate and gender:\n```python\nscatter_plot(attributions=attributions,\n             feature_values=x_test,\n             feature_index='max. achieved heart rate',\n             interactions=interactions,\n             color_by='is male',\n             feature_names=feature_names,\n             scale_y_ind=True)\n```\n![Interaction: Heart Rate and Gender](/images/max_heart_rate.png)\n\nThe model used to generate the above interactions is a two-layer neural network trained on the [UCI Heart Disease Dataset](https://archive.ics.uci.edu/ml/datasets/Heart+Disease). Interactions learned by this model were featured in our paper. To learn more about this particular model and the experimental setup, see [the notebook used to train and explain the model](https://github.com/suinleelab/path_explain/blob/master/examples/tabular/heart_disease/attributions.ipynb).\n\n\n### Explaining an NLP model using Integrated Gradients and Integrated Hessians\nAs discussed in our paper, we can use Integrated Hessians to get interactions in language models. 
We explain a transformer from the [HuggingFace Transformers Repository](https://github.com/huggingface/transformers).\n```python\nimport numpy as np\nimport tensorflow as tf\nimport tensorflow_datasets\nfrom transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification, \\\n                         DistilBertConfig, glue_convert_examples_to_features, \\\n                         glue_processors\n\n# This is a custom explainer to explain huggingface models\nfrom path_explain import EmbeddingExplainerTF, text_plot, matrix_interaction_plot, bar_interaction_plot\n\nnum_labels = 2  # SST-2 is a binary sentiment classification task\ntokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')\nconfig = DistilBertConfig.from_pretrained('distilbert-base-uncased', num_labels=num_labels)\nmodel = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', config=config)\n\n### Some custom code to fine-tune the model on a sentiment analysis task...\nmax_length = 128\ndata, info = tensorflow_datasets.load('glue/sst-2', with_info=True)\ntrain_dataset = glue_convert_examples_to_features(data['train'],\n                                                  tokenizer,\n                                                  max_length,\n                                                  'sst-2')\nvalid_dataset = glue_convert_examples_to_features(data['validation'],\n                                                  tokenizer,\n                                                  max_length,\n                                                  'sst-2')\n...\n### we won't include the whole fine-tuning code. See the HuggingFace repository for more.\n\n### Here we define functions that represent two pieces of the model:\n### embedding and prediction\ndef embedding_model(batch_ids):\n    batch_embedding = model.distilbert.embeddings(batch_ids)\n    return batch_embedding\n\ndef prediction_model(batch_embedding):\n    # Note: this isn't exactly the right way to use the attention mask.\n    # It should actually indicate which words are real words. 
This\n    # makes the coding easier however, and the output is fairly similar,\n    # so it suffices for this tutorial.\n    attention_mask = tf.ones(batch_embedding.shape[:2])\n    attention_mask = tf.cast(attention_mask, dtype=tf.float32)\n    head_mask = [None] * model.distilbert.num_hidden_layers\n\n    transformer_output = model.distilbert.transformer([batch_embedding, attention_mask, head_mask], training=False)[0]\n    pooled_output = transformer_output[:, 0]\n    pooled_output = model.pre_classifier(pooled_output)\n    logits = model.classifier(pooled_output)\n    return logits\n###\n\n### We need some data to explain\nfor batch in valid_dataset.take(1):\n    batch_input = batch[0]\n\nbatch_ids = batch_input['input_ids']\nbatch_embedding = embedding_model(batch_ids)\n\nbaseline_ids = np.zeros((1, 128), dtype=np.int64)\nbaseline_embedding = embedding_model(baseline_ids)\n###\n\n### We are finally ready to explain our model\nexplainer = EmbeddingExplainerTF(prediction_model)\nattributions = explainer.attributions(inputs=batch_embedding,\n                                      baseline=baseline_embedding,\n                                      batch_size=32,\n                                      num_samples=256,\n                                      use_expectation=False,\n                                      output_indices=1)\n###\n\n### For interactions, the hessian is rather large so we use a very small batch size\ninteractions = explainer.interactions(inputs=batch_embedding,\n                                      baseline=baseline_embedding,\n                                      batch_size=1,\n                                      num_samples=256,\n                                      use_expectation=False,\n                                      output_indices=1)\n###\n```\nWe can plot the learned attributions and interactions as follows. 
First we plot the attributions:\n```python\n### First we need to decode the tokens from the batch ids.\nbatch_sentences = ...\n### Doing so will depend on how you tokenized your model!\n\ntext_plot(batch_sentences[0],\n          attributions[0],\n          include_legend=True)\n```\n![Showing feature attributions in text](/images/little_to_love_text.png)\n\nThen we plot the interactions:\n```python\nbar_interaction_plot(interactions[0],\n                     batch_sentences[0],\n                     top_k=5)\n```\n![Showing feature interactions in text](/images/little_to_love_bar.png)\n\nIf you would rather plot the full matrix of interactions than only the top interactions in a bar plot, our package also supports this. First we show the attributions:\n```python\ntext_plot(batch_sentences[1],\n          attributions[1],\n          include_legend=True)\n```\n![Showing additional attributions](/images/painfully_funny_text.png)\n\nAnd then we show the full interaction matrix. Here we've zeroed out the diagonals so you can better see the off-diagonal terms.\n```python\nmatrix_interaction_plot(interactions[1],\n                        batch_sentences[1])\n```\n![Showing the full matrix of feature interactions](/images/painfully_funny_matrix.png)\n\nThis example - interpreting [DistilBERT](https://arxiv.org/abs/1910.01108) - was also featured in our paper. You can examine the setup in more detail [here](https://github.com/suinleelab/path_explain/tree/master/examples/natural_language/transformers). For more examples, see the `examples` directory in this repository.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuinleelab%2Fpath_explain","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsuinleelab%2Fpath_explain","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuinleelab%2Fpath_explain/lists"}