{"id":13472371,"url":"https://github.com/inseq-team/inseq","last_synced_at":"2026-02-02T08:01:27.992Z","repository":{"id":64558561,"uuid":"406341006","full_name":"inseq-team/inseq","owner":"inseq-team","description":"Interpretability for sequence generation models 🐛 🔍","archived":false,"fork":false,"pushed_at":"2024-11-10T08:03:50.000Z","size":8029,"stargazers_count":397,"open_issues_count":25,"forks_count":36,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-02-09T20:01:56.929Z","etag":null,"topics":["attribution-methods","captum","deep-learning","explainable-ai","generative-ai","huggingface","interpretability","language-generation","language-model","large-language-models","natural-language-processing","sequence-to-sequence","transformers"],"latest_commit_sha":null,"homepage":"https://inseq.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/inseq-team.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-14T11:34:46.000Z","updated_at":"2025-02-07T14:35:20.000Z","dependencies_parsed_at":"2022-12-10T18:46:19.715Z","dependency_job_id":"476aa54a-0abb-4497-9b7f-9c5465797d97","html_url":"https://github.com/inseq-team/inseq","commit_stats":{"total_commits":223,"total_committers":6,"mean_commits":"37.166666666666664","dds":0.5515695067264574,"last_synced_commit":"22848bd8640e478a23e5bd40e4f3902fa2111016"},"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/inseq-team%2Finseq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inseq-team%2Finseq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inseq-team%2Finseq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inseq-team%2Finseq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/inseq-team","download_url":"https://codeload.github.com/inseq-team/inseq/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245681440,"owners_count":20655198,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attribution-methods","captum","deep-learning","explainable-ai","generative-ai","huggingface","interpretability","language-generation","language-model","large-language-models","natural-language-processing","sequence-to-sequence","transformers"],"created_at":"2024-07-31T16:00:54.138Z","updated_at":"2026-02-02T08:01:27.977Z","avatar_url":"https://github.com/inseq-team.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/inseq-team/inseq/main/docs/source/images/inseq_logo.png\" width=\"300\"/\u003e\n  \u003ch4\u003eInterpretability for Sequence Generation Models 🔍\u003c/h4\u003e\n\u003c/div\u003e\n\u003cbr/\u003e\n\u003cdiv align=\"center\"\u003e\n\n\n[![Build status](https://img.shields.io/github/actions/workflow/status/inseq-team/inseq/build.yml?branch=main)](https://github.com/inseq-team/inseq/actions?query=workflow%3Abuild)\n[![Docs 
status](https://img.shields.io/readthedocs/inseq)](https://inseq.readthedocs.io)\n[![Version](https://img.shields.io/pypi/v/inseq?color=blue)](https://pypi.org/project/inseq/)\n[![Python Version](https://img.shields.io/pypi/pyversions/inseq.svg?color=blue)](https://pypi.org/project/inseq/)\n[![Downloads](https://static.pepy.tech/badge/inseq)](https://pepy.tech/project/inseq)\n[![License](https://img.shields.io/github/license/inseq-team/inseq)](https://github.com/inseq-team/inseq/blob/main/LICENSE)\n[![Demo Paper](https://img.shields.io/badge/ACL%20Anthology%20-%20?logo=data%3Aimage%2Fx-icon%3Bbase64%2CAAABAAEAIBIAAAEAIABwCQAAFgAAACgAAAAgAAAAJAAAAAEAIAAAAAAAAAkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACEa7k0mH%2B%2F5JBzt%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FJB3t%2FyQd7f8kHe3%2FJBzt%2FyQc7f8kHO3%2FJBzt%2FyQd7f8kHO3%2FJB3t%2FyQd7f8kHO3%2FJB3t%2FyQd7f8kHe3%2FJB3t%2FyMc79EkGP8VAAAAAAAAAAAAAAAAIRruTSYf7%2FkkHO3%2FJBzt%2FyQd7f8kHe3%2FJB3t%2FyQd7f8kHO3%2FJB3t%2FyQc7f8kHe3%2FJBzt%2FyQc7f8kHO3%2FJB3t%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FJB3t%2FyQc7f8kHO3%2FJB3t%2FyQd7f8kHe3%2FIxzv0SQY%2FxUAAAAAAAAAAAAAAAAhIe5NJh%2Fv%2BSQd7f8kHe3%2FJB3t%2FyQd7f8kHO3%2FJBzt%2FyQc7f8kHe3%2FJBzt%2FyQd7f8kHe3%2FJBzt%2FyQc7f8kHe3%2FJB3t%2FyQd7f8kHe3%2FJB3t%2FyQc7f8kHe3%2FJB3t%2FyQd7f8kHe3%2FJBzt%2FyQd7f8jHO%2FRJBj%2FFQAAAAAAAAAAAAAAACEa7k0mH%2B%2F5JB3t%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHe3%2FJBzt%2FyQd7f8kHe3%2FJBzt%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHe3%2FJBzt%2FyQd7f8kHe3%2FJB3t%2FyQc7f8kHe3%2FJB3t%2FyQc7f8kHO3%2FJBzt%2FyMc79EkGP8VAAAAAAAAAAAAAAAAIRruTSYf7%2FkkHO3%2FJBzt%2FyQc7f8kHe3%2FJB3t%2FyQd7f8kHO3%2FJB3t%2FyQd7f8kHO3%2FJB3t%2FyQc7f8kHe3%2FJBzt%2FyQc7f8kHe3%2FJB3t%2FyQd7f8kHe3%2FJBzt%2FyQd7f8kHO3%2FJB3t%2FyQc7f8kHO3%2FIxzv0SQY%2FxUAAAAAAAAAAAAAAAAhGu5NJh%2Fv%2BSQd7f8kHO3%2FJBzt%2FyQd7f8jIOzYIxvtgiQc8X8kHPF%2FJBzxfyQc8X8kHPF%2FJBzxfyQc8X8kHPF%2FIx%2FuiiMf7OgkHe3%2FJBzt%2FyQc7f8kHO3%2FJhzs9CUg7JYkHPF%2FJBzxfyQc8X8iHfBoMzP%2FCgAAAAAAAAAAAAAAACEa7k0mHu%2F5JBzt%2FyQc7f8kHO3%2FJBzt%2FyQb7LEAAP8FAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAkGP8VIxzv0SQc7f8kHe3%2FJB3t%2FyQc7f8jHOzpIhzuLQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIRruTSYf7%2FkkHe3%2FJB3t%2FyQd7f8kHO3%2FJBvssQAA%2FwUAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACQY%2FxUjHO%2FRJB3t%2FyQc7f8kHO3%2FJBzt%2FyMb7OkiHO4tAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAhGu5NJh%2Fv%2BSQd7f8kHe3%2FJBzt%2FyQc7f8kHuyxAAD%2FBQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAJBj%2FFSMc79EkHe3%2FJBzt%2FyQc7f8kHO3%2FIxvs6SIc7i0AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACEa7k0mH%2B%2F5JBzt%2FyQc7f8kHO3%2FJBzt%2FyQb7LEAAP8FAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAkGP8VIx3v0SQd7f8kHe3%2FJBzt%2FyQc7f8jHOzpIhzuLQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIRruTSYf7%2FkkHO3%2FJBzt%2FyQc7f8kHe3%2FJBzuxSgi81MhGu5NIRruTSEa7k0hGu5NISHuTSEh7k0hGu5NIRruTSIa72EjHe3aJBzt%2FyQd7f8kHO3%2FJBzt%2FyMc7OkiHO4tAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAhGu5NJh7v%2BSQc7f8kHe3%2FJBzt%2FyQc7f8kHO3%2FJh%2Fv%2BSYf7%2FkmHu%2F5Jh%2Fv%2BSYf7%2FkmH%2B%2F5Jh%2Fv%2BSYf7%2FkmH%2B%2F5Jh7v%2BSQc7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FIxzs6SIc7i0AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACEa7k0mHu%2F5JBzt%2FyQd7f8kHO3%2FJBzt%2FyQc7f8kHe3%2FJB3t%2FyQd7f8kHO3%2FJBzt%2FyQc7f8kHe3%2FJB3t%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHe3%2FJBzt%2FyQd7f8jHOzpIhzuLQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIRruTSYe7%2FkkHO3%2FJB3t%2FyQc7f8kHO3%2FJBzt%2FyQd7f8kHe3%2FJB3t%2FyQc7f8kHe3%2FJB3t%2FyQd7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FJB3t%2FyQc7f8kHO3%2FJBzt%2FyMc7OkiHO4tAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAhGu5NJh%2Fv%2BSQc7f8kHO3%2FJB3t%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FIxzs6SIc7i0AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACEa7k0mH%2B%2F5JB3t%2FyQc7f8kHe3%2FJBzt%2FyQc7f8kHO3%2FJB3t%2FyQc7f8kHO3%2FJBzt%2FyQd7f8kHO3%2FJBzt%2FyQc7f8kHO3%2FJB3t%2FyQc7f8kHO3%2FJBzt%2FyQc7f8jHOzpIhzuLQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAKB%2FtOSUc7askHuyxJBvssSQe7LEkHuyxJB7ssSQe7LEkHuyxJBvssSQb7LEkHuyxJB7ssSQe7LEkHuyxJB7ssSUc7LMjHe31JB3t%2FyQd7
f8kHe3%2FJBzt%2FyMc7OkiHO4tAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD%2FBQAA%2FwUAAP8FAAD%2FBQAA%2FwUAAP8FAAD%2FBQAA%2FwUAAP8FAAD%2FBQAA%2FwUAAP8FAAD%2FBQAA%2FwUAAP8FHBzsGyMd7qYjHO%2FRIxzv0SMd79EjHO%2FRIx7tux4Y%2BSoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADAAAABwAAAAcAAAAHAAAABwAAAAcAAAAHAP8A%2FwD%2FAP8A%2FwD%2FAP8A%2FwAAAP8AAAD%2FAAAA%2FwAAAP8AAAD%2FAAAA%2FwAAAP%2BAAAD8%3D\u0026labelColor=white\u0026color=red\u0026link=https%3A%2F%2Faclanthology.org%2F2023.acl-demo.40%2F\n)](https://aclanthology.org/2023.acl-demo.40)\n\n\u003c/div\u003e\n\u003cdiv align=\"center\"\u003e\n\n  [![Follow Inseq on Twitter](https://img.shields.io/badge/Twitter-1DA1F2?style=for-the-badge\u0026logo=twitter\u0026logoColor=white)](https://twitter.com/InseqLib)\n  [![Join the Inseq Discord server](https://img.shields.io/badge/Discord-7289DA?style=for-the-badge\u0026logo=discord\u0026logoColor=white)](https://discord.gg/V5VgwwFPbu)\n  [![Read the Docs](https://img.shields.io/badge/-Docs-blue?style=for-the-badge\u0026logo=Read-the-Docs\u0026logoColor=white\u0026link=https://inseq.org)](https://inseq.org)\n  [![Tutorial](https://img.shields.io/badge/-Tutorial-orange?style=for-the-badge\u0026logo=Jupyter\u0026logoColor=white\u0026link=https://github.com/inseq-team/inseq/blob/main/examples/inseq_tutorial.ipynb)](https://github.com/inseq-team/inseq/blob/main/examples/inseq_tutorial.ipynb)\n\n\n\u003c/div\u003e\n\nInseq is a Pytorch-based hackable toolkit to democratize access to common post-hoc **in**terpretability analyses of **seq**uence generation models.\n\n## Installation\n\nInseq is available on PyPI and can be installed with `pip` for Python \u003e= 3.10, \u003c= 3.13:\n\n```bash\n# Install latest stable version\npip install inseq\n\n# Alternatively, install latest development version\npip install git+https://github.com/inseq-team/inseq.git\n```\n\nInstall extras for visualization in Jupyter Notebooks and 🤗 datasets attribution as `pip install 
inseq[notebook,datasets]`.\n\n\u003cdetails\u003e\n  \u003csummary\u003eDev Installation\u003c/summary\u003e\nTo install the package, clone the repository and run the following commands:\n\n```bash\ncd inseq\nmake uv-download # Download and install the uv package manager\nmake install # Installs the package and all dependencies via uv sync\n```\n\nFor library developers, you can use the `make install-dev` command to install all development dependencies (quality, docs, extras).\n\nAfter installation, you should be able to run `make fast-test` and `make lint` without errors.\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eFAQ Installation\u003c/summary\u003e\n\n- Installing the `tokenizers` package requires a Rust compiler installation. You can install Rust from [https://rustup.rs](https://rustup.rs) and add `$HOME/.cargo/env` to your PATH.\n\n- Installing `sentencepiece` requires various packages, install with `sudo apt-get install cmake build-essential pkg-config` or `brew install cmake gperftools pkg-config`.\n\n\u003c/details\u003e\n\n## Example usage in Python\n\nThis example uses the Integrated Gradients attribution method to attribute the English-French translation of a sentence taken from the WinoMT corpus:\n\n```python\nimport inseq\n\nmodel = inseq.load_model(\"Helsinki-NLP/opus-mt-en-fr\", \"integrated_gradients\")\nout = model.attribute(\n  \"The developer argued with the designer because her idea cannot be implemented.\",\n  n_steps=100\n)\nout.show()\n```\n\nThis produces a visualization of the attribution scores for each token in the input sentence (token-level aggregation is handled automatically). 
Here is what the visualization looks like inside a Jupyter Notebook:\n\n![WinoMT Attribution Map](https://raw.githubusercontent.com/inseq-team/inseq/main/docs/source/images/heatmap_winomt.png)\n\nInseq also supports decoder-only models such as [GPT-2](https://huggingface.co/transformers/model_doc/gpt2.html), enabling usage of a variety of attribution methods and customizable settings directly from the console:\n\n```python\nimport inseq\n\nmodel = inseq.load_model(\"gpt2\", \"integrated_gradients\")\nmodel.attribute(\n    \"Hello ladies and\",\n    generation_args={\"max_new_tokens\": 9},\n    n_steps=500,\n    internal_batch_size=50\n).show()\n```\n\n![GPT-2 Attribution in the console](https://raw.githubusercontent.com/inseq-team/inseq/main/docs/source/images/inseq_python_console.gif)\n\n## Features\n\n- 🚀 Feature attribution of sequence generation for most `ForConditionalGeneration` (encoder-decoder) and `ForCausalLM` (decoder-only) models from 🤗 Transformers\n\n- 🚀 Support for multiple feature attribution methods, extending the ones supported by [Captum](https://captum.ai/docs/introduction)\n\n- 🚀 Post-processing, filtering and merging of attribution maps via `Aggregator` classes.\n\n- 🚀 Attribution visualization in notebooks, browser and command line.\n\n- 🚀 Efficient attribution of single examples or entire 🤗 datasets with the Inseq CLI.\n\n- 🚀 Custom attribution of target functions, supporting advanced methods such as [contrastive feature attributions](https://aclanthology.org/2022.emnlp-main.14/) and [context reliance detection](https://arxiv.org/abs/2310.01188).\n\n- 🚀 Extraction and visualization of custom scores (e.g. probability, entropy) at every generation step alongside attribution maps.\n\n### Supported methods\n\nUse the `inseq.list_feature_attribution_methods` function to list all available method identifiers and `inseq.list_step_functions` to list all available step functions. 
The following methods are currently supported:\n\n#### Gradient-based attribution\n\n- `saliency`: [Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps](https://arxiv.org/abs/1312.6034) (Simonyan et al., 2013)\n\n- `input_x_gradient`: [Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps](https://arxiv.org/abs/1312.6034) (Simonyan et al., 2013)\n\n- `integrated_gradients`: [Axiomatic Attribution for Deep Networks](https://arxiv.org/abs/1703.01365) (Sundararajan et al., 2017)\n\n- `deeplift`: [Learning Important Features Through Propagating Activation Differences](https://arxiv.org/abs/1704.02685) (Shrikumar et al., 2017)\n\n- `gradient_shap`: [A unified approach to interpreting model predictions](https://dl.acm.org/doi/10.5555/3295222.3295230) (Lundberg and Lee, 2017)\n\n- `discretized_integrated_gradients`: [Discretized Integrated Gradients for Explaining Language Models](https://aclanthology.org/2021.emnlp-main.805/) (Sanyal and Ren, 2021)\n\n- `sequential_integrated_gradients`: [Sequential Integrated Gradients: a simple but effective method for explaining language models](https://aclanthology.org/2023.findings-acl.477/) (Enguehard, 2023)\n\n#### Internals-based attribution\n\n- `attention`: Attention Weight Attribution, from [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/abs/1409.0473) (Bahdanau et al., 2014)\n\n#### Perturbation-based attribution\n\n- `occlusion`: [Visualizing and Understanding Convolutional Networks](https://link.springer.com/chapter/10.1007/978-3-319-10590-1_53) (Zeiler and Fergus, 2014)\n\n- `lime`: [\"Why Should I Trust You?\": Explaining the Predictions of Any Classifier](https://arxiv.org/abs/1602.04938) (Ribeiro et al., 2016)\n\n- `value_zeroing`: [Quantifying Context Mixing in Transformers](https://aclanthology.org/2023.eacl-main.245/) (Mohebbi et al. 
2023)\n\n- `reagent`: [ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models](https://arxiv.org/abs/2402.00794) (Zhao et al., 2024)\n\n#### Step functions\n\nStep functions are used to extract custom scores from the model at each step of the attribution process with the `step_scores` argument in `model.attribute`. They can also be used as targets for attribution methods relying on model outputs (e.g. gradient-based methods) by passing them as the `attributed_fn` argument. The following step functions are currently supported:\n\n- `logits`: Logits of the target token.\n- `probability`: Probability of the target token. Can also be used for log-probability by passing `logprob=True`.\n- `entropy`: Entropy of the predictive distribution.\n- `crossentropy`: Cross-entropy loss between target token and predicted distribution.\n- `perplexity`: Perplexity of the target token.\n- `contrast_logits`/`contrast_prob`: Logits/probabilities of the target token when different contrastive inputs are provided to the model. Equivalent to `logits`/`probability` when no contrastive inputs are provided.\n- `contrast_logits_diff`/`contrast_prob_diff`: Difference in logits/probability between original and foil target tokens pair, can be used for contrastive evaluation as in [contrastive attribution](https://aclanthology.org/2022.emnlp-main.14/) (Yin and Neubig, 2022).\n- `pcxmi`: Point-wise Contextual Cross-Mutual Information (P-CXMI) for the target token given original and contrastive contexts [(Yin et al. 2021)](https://arxiv.org/abs/2109.07446).\n- `kl_divergence`: KL divergence of the predictive distribution given original and contrastive contexts. Can be restricted to most likely target token options using the `top_k` and `top_p` parameters.\n- `in_context_pvi`: In-context Pointwise V-usable Information (PVI) to measure the amount of contextual information used in model predictions [(Lu et al. 
2023)](https://arxiv.org/abs/2310.12300).\n- `mc_dropout_prob_avg`: Average probability of the target token across multiple samples using [MC Dropout](https://arxiv.org/abs/1506.02142) (Gal and Ghahramani, 2016).\n- `top_p_size`: The number of tokens with cumulative probability greater than `top_p` in the predictive distribution of the model.\n\nThe following example computes contrastive attributions using the `contrast_prob_diff` step function:\n\n```python\nimport inseq\n\nattribution_model = inseq.load_model(\"gpt2\", \"input_x_gradient\")\n\n# Perform the contrastive attribution:\n# Regular (forced) target -\u003e \"The manager went home because he was sick\"\n# Contrastive target      -\u003e \"The manager went home because she was sick\"\nout = attribution_model.attribute(\n    \"The manager went home because\",\n    \"The manager went home because he was sick\",\n    attributed_fn=\"contrast_prob_diff\",\n    contrast_targets=\"The manager went home because she was sick\",\n    # We also visualize the corresponding step score\n    step_scores=[\"contrast_prob_diff\"]\n)\nout.show()\n```\n\nRefer to the [documentation](https://inseq.readthedocs.io/examples/custom_attribute_target.html) for an example including custom function registration.\n\n## Using the Inseq CLI\n\nThe Inseq library also provides useful client commands to enable repeated attribution of individual examples and even entire 🤗 datasets directly from the console. See the available options by typing `inseq -h` in the terminal after installing the package.\n\nThree commands are supported:\n\n- `inseq attribute`: Wrapper for enabling `model.attribute` usage in console.\n\n- `inseq attribute-dataset`: Extends `attribute` to full dataset using Hugging Face `datasets.load_dataset` API.\n\n- `inseq attribute-context`: Detects and attributes context dependence for generation tasks using the approach of [Sarti et al. 
(2023)](https://arxiv.org/abs/2310.01188).\n\nAll commands support the full range of parameters available for `attribute`, attribution visualization in the console and saving outputs to disk.\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003ccode\u003einseq attribute\u003c/code\u003e example\u003c/summary\u003e\n\n  The following example performs a simple feature attribution of an English sentence translated into Italian using a MarianNMT translation model from \u003ccode\u003etransformers\u003c/code\u003e. The final result is printed to the console.\n  ```bash\n  inseq attribute \\\n  --model_name_or_path Helsinki-NLP/opus-mt-en-it \\\n  --attribution_method saliency \\\n  --input_texts \"Hello world this is Inseq\\! Inseq is a very nice library to perform attribution analysis\"\n  ```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003ccode\u003einseq attribute-dataset\u003c/code\u003e example\u003c/summary\u003e\n\n  The following code can be used to perform attribution (both source and target-side) of Italian translations for a dummy sample of 20 English sentences taken from the FLORES-101 parallel corpus, using a MarianNMT translation model from Hugging Face \u003ccode\u003etransformers\u003c/code\u003e. We save the visualizations in HTML format in the file \u003ccode\u003eattributions.html\u003c/code\u003e. 
See the \u003ccode\u003e--help\u003c/code\u003e flag for more options.\n\n  ```bash\n  inseq attribute-dataset \\\n    --model_name_or_path Helsinki-NLP/opus-mt-en-it \\\n    --attribution_method saliency \\\n    --do_prefix_attribution \\\n    --dataset_name inseq/dummy_enit \\\n    --input_text_field en \\\n    --dataset_split \"train[:20]\" \\\n    --viz_path attributions.html \\\n    --batch_size 8 \\\n    --hide\n  ```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003ccode\u003einseq attribute-context\u003c/code\u003e example\u003c/summary\u003e\n\n  The following example uses a small LM to generate a continuation of \u003ccode\u003einput_current_text\u003c/code\u003e, and uses the additional context provided by \u003ccode\u003einput_context_text\u003c/code\u003e to estimate its influence on the generation. In this case, the output \u003ccode\u003e\"to the hospital. He said he was fine\"\u003c/code\u003e is produced, and the generation of token \u003ccode\u003ehospital\u003c/code\u003e is found to be dependent on context token \u003ccode\u003esick\u003c/code\u003e according to the \u003ccode\u003econtrast_prob_diff\u003c/code\u003e step function.\n\n  ```bash\n  inseq attribute-context \\\n    --model_name_or_path HuggingFaceTB/SmolLM-135M \\\n    --input_context_text \"George was sick yesterday.\" \\\n    --input_current_text \"His colleagues asked him to come\" \\\n    --attributed_fn \"contrast_prob_diff\"\n  ```\n\n  **Result:**\n\n  \u003cimg src=\"https://raw.githubusercontent.com/inseq-team/inseq/main/docs/source/images/attribute_context_hospital_output.png\" style=\"width:500px\"\u003e\n\u003c/details\u003e\n\n## Contributing\n\nOur vision for Inseq is to create a centralized, comprehensive and robust set of tools to enable fair and reproducible comparisons in the study of sequence generation models. To achieve this goal, contributions from researchers and developers interested in these topics are more than welcome. 
Please see our [contributing guidelines](CONTRIBUTING.md) and our [code of conduct](CODE_OF_CONDUCT.md) for more information.\n\n## Citing Inseq\n\nIf you use Inseq in your research we suggest including a mention of the specific release (e.g. v0.6.0) and we kindly ask you to cite our reference paper as:\n\n```bibtex\n@inproceedings{sarti-etal-2023-inseq,\n    title = \"Inseq: An Interpretability Toolkit for Sequence Generation Models\",\n    author = \"Sarti, Gabriele  and\n      Feldhus, Nils  and\n      Sickert, Ludwig  and\n      van der Wal, Oskar and\n      Nissim, Malvina and\n      Bisazza, Arianna\",\n    booktitle = \"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)\",\n    month = jul,\n    year = \"2023\",\n    address = \"Toronto, Canada\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2023.acl-demo.40\",\n    doi = \"10.18653/v1/2023.acl-demo.40\",\n    pages = \"421--435\",\n}\n```\n\n## Research using Inseq\n\nInseq has been used in various research projects. A list of known publications that use Inseq to conduct interpretability analyses of generative models is shown below.\n\n\u003e [!TIP]\n\u003e Last update: August 2024. Please open a pull request to add your publication to the list.\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003cb\u003e2023\u003c/b\u003e\u003c/summary\u003e\n  \u003col\u003e\n    \u003cli\u003e \u003ca href=\"https://aclanthology.org/2023.acl-demo.40/\"\u003eInseq: An Interpretability Toolkit for Sequence Generation Models\u003c/a\u003e (Sarti et al., 2023) \u003c/li\u003e\n    \u003cli\u003e \u003ca href=\"https://doi.org/10.1162/tacl_a_00651\"\u003eAre Character-level Translations Worth the Wait? 
Comparing ByT5 and mT5 for Machine Translation\u003c/a\u003e (Edman et al., 2023) \u003c/li\u003e\n    \u003cli\u003e \u003ca href=\"https://aclanthology.org/2023.nlp4convai-1.1/\"\u003eResponse Generation in Longitudinal Dialogues: Which Knowledge Representation Helps?\u003c/a\u003e (Mousavi et al., 2023)  \u003c/li\u003e\n    \u003cli\u003e \u003ca href=\"https://openreview.net/forum?id=XTHfNGI3zT\"\u003eQuantifying the Plausibility of Context Reliance in Neural Machine Translation\u003c/a\u003e (Sarti et al., 2023)\u003c/li\u003e\n    \u003cli\u003e \u003ca href=\"https://aclanthology.org/2023.emnlp-main.243/\"\u003eA Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation\u003c/a\u003e (Attanasio et al., 2023)\u003c/li\u003e\n    \u003cli\u003e \u003ca href=\"https://aclanthology.org/2023.conll-1.18/\"\u003eAttribution and Alignment: Effects of Local Context Repetition on Utterance Production and Comprehension in Dialogue\u003c/a\u003e (Molnar et al., 2023)\u003c/li\u003e\n  \u003c/ol\u003e\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003cb\u003e2024\u003c/b\u003e\u003c/summary\u003e\n  \u003col\u003e\n    \u003cli\u003e \u003ca href=\"https://aclanthology.org/2024.naacl-long.46/\"\u003eAssessing the Reliability of Large Language Model Knowledge\u003c/a\u003e (Wang et al., 2024)\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"https://aclanthology.org/2024.hcinlp-1.9\"\u003eLLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools\u003c/a\u003e (Wang et al., 2024)\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"https://arxiv.org/abs/2402.00794\"\u003eReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models\u003c/a\u003e (Zhao et al., 2024)\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"https://aclanthology.org/2024.naacl-long.284\"\u003eRevisiting subword tokenization: A case study on affixal negation in large 
language models\u003c/a\u003e (Truong et al., 2024)\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"https://hal.science/hal-04581586\"\u003eExploring NMT Explainability for Translators Using NMT Visualising Tools\u003c/a\u003e (Gonzalez-Saez et al., 2024)\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"https://openreview.net/forum?id=uILj5HPrag\"\u003eDETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning\u003c/a\u003e (Zhou et al., 2024)\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"https://arxiv.org/abs/2406.06399\"\u003eShould We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue\u003c/a\u003e (Alghisi et al., 2024)\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"https://arxiv.org/abs/2406.13663\"\u003eModel Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation\u003c/a\u003e (Qi, Sarti et al., 2024)\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"https://link.springer.com/chapter/10.1007/978-3-031-63787-2_14\"\u003eNoNE Found: Explaining the Output of Sequence-to-Sequence Models When No Named Entity Is Recognized\u003c/a\u003e (dela Cruz et al., 2024)\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"https://dl.acm.org/doi/full/10.1145/3701268.3701274\"\u003eReimagining Student Success Prediction: Applying LLMs in Educational AI with XAI\u003c/a\u003e (Riello et al., 2024)\u003c/li\u003e\n  \u003c/ol\u003e\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003cb\u003e2025\u003c/b\u003e\u003c/summary\u003e\n  \u003col\u003e\n    \u003cli\u003e \u003ca href=\"https://aclanthology.org/2025.findings-naacl.390/\"\u003eReinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting\u003c/a\u003e (Aissi et al., 2025)\u003c/li\u003e\n    \u003cli\u003e \u003ca href=\"https://dl.acm.org/doi/full/10.1145/3729237\"\u003eTowards AI-Assisted Inclusive Language Writing in Italian Formal 
Communications\u003c/a\u003e (Greco et al. 2025)\u003c/li\u003e\n    \u003cli\u003e \u003ca href=\"https://arxiv.org/abs/2508.08661\"\u003eHallucinations in Code Change to Natural Language Generation: Prevalence and Evaluation of Detection Metrics\u003c/a\u003e (Liu et al. 2025)\u003c/li\u003e\n    \u003cli\u003e \u003ca href=\"https://arxiv.org/abs/2503.05810\"\u003eA Transformer Model for Predicting Chemical Reaction Products from Generic Templates\u003c/a\u003e (Ozer et al. 2025)\u003c/li\u003e\n  \u003c/ol\u003e\n\n\u003c/details\u003e\n","funding_links":[],"categories":["Table of Contents","Explainability, counterfactuals and probing","Tools"],"sub_categories":["LLM Interpretability Tools","Interpretability/Explicability"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finseq-team%2Finseq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finseq-team%2Finseq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finseq-team%2Finseq/lists"}