{"id":13571000,"url":"https://github.com/jalammar/ecco","last_synced_at":"2025-04-10T03:49:12.325Z","repository":{"id":38227158,"uuid":"310815780","full_name":"jalammar/ecco","owner":"jalammar","description":"Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTA, T5, and T0).","archived":false,"fork":false,"pushed_at":"2024-08-15T19:08:06.000Z","size":4252,"stargazers_count":2024,"open_issues_count":37,"forks_count":172,"subscribers_count":24,"default_branch":"main","last_synced_at":"2025-04-03T02:08:49.638Z","etag":null,"topics":["explorables","language-models","natural-language-processing","nlp","pytorch","visualization"],"latest_commit_sha":null,"homepage":"https://ecco.readthedocs.io","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jalammar.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGELOG.rst","contributing":"CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.rst","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-11-07T10:06:34.000Z","updated_at":"2025-03-30T14:14:46.000Z","dependencies_parsed_at":"2022-07-14T23:46:10.653Z","dependency_job_id":"f185dfee-4616-48a6-919a-1b84480c8efe","html_url":"https://github.com/jalammar/ecco","commit_stats":{"total_commits":272,"total_committers":12,"mean_commits":"22.666666666666668","dds":0.3272058823529411,"last_synced_commit":"2a38a1360c8edb2e4ff4a08fba404b5164c21e06"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/jalammar%2Fecco","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jalammar%2Fecco/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jalammar%2Fecco/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jalammar%2Fecco/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jalammar","download_url":"https://codeload.github.com/jalammar/ecco/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248154995,"owners_count":21056542,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["explorables","language-models","natural-language-processing","nlp","pytorch","visualization"],"created_at":"2024-08-01T14:00:57.357Z","updated_at":"2025-04-10T03:49:12.292Z","avatar_url":"https://github.com/jalammar.png","language":"Jupyter Notebook","funding_links":[],"categories":["XAI Libraries for NLP","Jupyter Notebook","Table of Contents","Technical Resources"],"sub_categories":["LLM Interpretability Tools","Open Source/Access Responsible AI Software Packages"],"readme":"\n..  image:: https://ar.pegg.io/img/ecco-logo-w-800.png\n    :alt: Ecco Logo\n\n.. start-badges\n|version| |supported-versions|\n\n.. |version| image:: https://img.shields.io/pypi/v/ecco.svg\n    :alt: PyPI Package latest release\n    :target: https://pypi.org/project/ecco\n\n.. |supported-versions| image:: https://img.shields.io/pypi/pyversions/ecco.svg\n    :alt: Supported versions\n    :target: https://pypi.org/project/ecco\n.. 
end-badges\n\n\nEcco is a Python library for explaining Natural Language Processing models using interactive visualizations.\n\nIt provides multiple interfaces to aid the explanation and intuition of `Transformer\n\u003chttps://jalammar.github.io/illustrated-transformer/\u003e`_-based language models. Read: `Interfaces for Explaining Transformer Language Models \u003chttps://jalammar.github.io/explaining-transformers/\u003e`_.\n\nEcco runs inside Jupyter notebooks. It is built on top of `pytorch\n\u003chttps://pytorch.org/\u003e`_ and `transformers\n\u003chttps://github.com/huggingface/transformers\u003e`_.\n\nThe library is currently an alpha release of a research project. It is not production ready. You're welcome to contribute to make it better!\n\nInstallation\n============\n\n\n.. code-block:: bash\n\n    # Assuming PyTorch is already installed\n    pip install ecco\n\n\nDocumentation\n=============\n\n\nTo use the project:\n\n.. code-block:: python\n\n    import ecco\n\n    # Load pre-trained language model. Setting 'activations' to True tells Ecco to capture neuron activations.\n    lm = ecco.from_pretrained('distilgpt2', activations=True)\n\n    # Input text\n    text = \"The countries of the European Union are:\\n1. Austria\\n2. Belgium\\n3. Bulgaria\\n4.\"\n\n    # Generate 20 tokens to complete the input text.\n    output = lm.generate(text, generate=20, do_sample=True)\n    \n    # Ecco will output each token as it is generated.\n    \n    # 'output' now contains the data captured from this run, including the input and output tokens\n    # as well as neuron activations and input saliency values. \n    \n    # To view the input saliency\n    output.saliency()\n\nThis does the following:\n\n1. It loads a pretrained Hugging Face DistilGPT2 model and wraps it in an ecco ``LM`` object that does useful things (e.g. it calculates input saliency and can collect neuron activations).\n2. We tell the model to generate 20 tokens.\n3. 
The model returns an ecco ``OutputSeq`` object. This object holds the output sequence along with data captured during the generation run, including the input sequence and input saliency values. If we set ``activations=True`` in ``from_pretrained()``, it also contains neuron activation values.\n4. ``output`` can now produce various interactive explorables. Examples include:\n\n- ``output.saliency()`` to generate an input saliency explorable [`Input Saliency Colab Notebook \u003chttps://colab.research.google.com/github/jalammar/ecco/blob/main/notebooks/Ecco_Input_Saliency.ipynb\u003e`_]\n- ``output.run_nmf()`` to explore non-negative matrix factorization of neuron activations [`Neuron Activation Colab Notebook \u003chttps://colab.research.google.com/github/jalammar/ecco/blob/main/notebooks/Ecco_Neuron_Factors.ipynb\u003e`_]\n\n\n.. code-block:: python\n\n    # To view the input saliency explorable\n    output.saliency()\n    \n    # To view input saliency with more details (a bar and % value for each token)\n    output.saliency(style=\"detailed\")\n    \n    # output.activations contains the neuron activation values. It has the shape: (layer, neuron, token position)\n    \n    # We can run non-negative matrix factorization using run_nmf. We pass the number of factors/components to break down into\n    nmf_1 = output.run_nmf(n_components=10) \n\n    # nmf_1 now contains the necessary data to create the interactive NMF explorable:\n    nmf_1.explore()\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjalammar%2Fecco","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjalammar%2Fecco","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjalammar%2Fecco/lists"}