{"id":28960563,"url":"https://github.com/aygp-dr/attribution-graphs-explorer","last_synced_at":"2026-02-15T01:02:41.754Z","repository":{"id":296615827,"uuid":"993961143","full_name":"aygp-dr/attribution-graphs-explorer","owner":"aygp-dr","description":"A toolkit for exploring attribution graphs and circuit tracing in transformer models, implemented in Guile Scheme","archived":false,"fork":false,"pushed_at":"2025-06-06T23:15:44.000Z","size":186,"stargazers_count":0,"open_issues_count":8,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-19T20:11:39.440Z","etag":null,"topics":["attribution-graphs","circuit-tracing","machine-learning","mechanistic-interpretability","scheme","transformer-interpretability"],"latest_commit_sha":null,"homepage":null,"language":"Scheme","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aygp-dr.png","metadata":{"files":{"readme":"README.org","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-31T22:25:43.000Z","updated_at":"2025-06-06T23:15:47.000Z","dependencies_parsed_at":"2025-06-01T09:44:50.028Z","dependency_job_id":"31e1500f-b459-45e5-88e5-03fa64818854","html_url":"https://github.com/aygp-dr/attribution-graphs-explorer","commit_stats":null,"previous_names":["aygp-dr/attribution-graphs-explorer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aygp-dr/attribution-graphs-explorer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aygp-dr%2Fattribution-graphs-explorer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aygp-dr%2Fa
ttribution-graphs-explorer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aygp-dr%2Fattribution-graphs-explorer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aygp-dr%2Fattribution-graphs-explorer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aygp-dr","download_url":"https://codeload.github.com/aygp-dr/attribution-graphs-explorer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aygp-dr%2Fattribution-graphs-explorer/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261586579,"owners_count":23181137,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attribution-graphs","circuit-tracing","machine-learning","mechanistic-interpretability","scheme","transformer-interpretability"],"created_at":"2025-06-24T01:31:43.672Z","updated_at":"2026-02-15T01:02:32.712Z","avatar_url":"https://github.com/aygp-dr.png","language":"Scheme","funding_links":[],"categories":[],"sub_categories":[],"readme":"#+TITLE: Attribution Graphs Explorer\n#+AUTHOR: AYGP-DR Research Team\n#+OPTIONS: toc:3 num:t\n\n* Attribution Graphs 
Explorer\n\n[[https://img.shields.io/badge/Version-0.1.0-blue.svg][https://img.shields.io/badge/Version-0.1.0-blue.svg]]\n[[https://img.shields.io/badge/Guile-3.0+-blue.svg][https://img.shields.io/badge/Guile-3.0+-blue.svg]]\n[[https://img.shields.io/badge/License-MIT-green.svg][https://img.shields.io/badge/License-MIT-green.svg]]\n[[https://img.shields.io/badge/Status-Alpha-orange.svg][https://img.shields.io/badge/Status-Alpha-orange.svg]]\n\nA toolkit for exploring attribution graphs and circuit tracing in transformer models, implemented in Guile Scheme.\n\n** Overview\n\nAttribution Graphs Explorer is a framework for mechanistic interpretability of transformer models based on circuit tracing methods. The toolkit allows researchers to:\n\n- Extract computational circuits from neural networks\n- Trace linear paths of information flow\n- Visualize attribution graphs\n- Test causal hypotheses through perturbation\n\nOur implementation builds on the methods described in the [[https://transformer-circuits.pub/2025/attribution-graphs/methods.html][Attribution Graphs]] research, allowing for programmatic analysis of neural network internals.\n\n[[file:docs/images/overview.png]]\n\n** Architecture\n\nThe toolkit is organized around several key components:\n\n#+begin_src mermaid :file docs/architecture.png :mkdirp t\ngraph TD\n    A[Input Tokens] --\u003e B[Token Embeddings]\n    B --\u003e C[Cross-Layer Transcoder]\n    C --\u003e D[Feature Activations]\n    D --\u003e E[Attribution Graph]\n    E --\u003e F[Output Logits]\n    \n    style C fill:#f9f,stroke:#333,stroke-width:4px\n    style E fill:#bbf,stroke:#333,stroke-width:4px\n#+end_src\n\n*** Cross-Layer Transcoders (CLT)\n\nCross-Layer Transcoders provide a way to bypass MLP nonlinearities, creating linear feature-to-feature interactions that can be traced through the network. 
The CLT modules:\n\n- Read from the residual stream at one layer\n- Contribute to all subsequent MLP layers\n- Maintain sparse feature representations\n\n*** Attribution Graphs\n\nAttribution graphs represent the computational flow as a directed graph:\n\n- *Nodes*: Features, tokens, and logits\n- *Edges*: Attribution weights between features\n- *Paths*: Computational circuits through the network\n\n*** Circuit Discovery\n\nThe toolkit provides algorithms for finding interpretable circuits:\n\n- Path tracing algorithms\n- Circuit motif identification\n- Circuit visualization\n\n*** Validation Framework\n\nTest hypotheses about discovered circuits:\n\n- Perturbation experiments\n- Causal validation\n- Sparsity and concentration metrics\n\n** Getting Started\n\n*** Installation\n\n#+begin_src shell\n# Clone the repository\ngit clone https://github.com/aygp-dr/attribution-graphs-explorer.git\ncd attribution-graphs-explorer\n\n# Configure and build\n./configure\ngmake\n\n# Run tests to verify installation\ngmake test\n\n# Run examples\ngmake run\n#+end_src\n\n*** Requirements\n\nThis project has been developed and tested with the following environment:\n\n#+CAPTION: Development Environment\n#+ATTR_HTML: :border 2 :rules all :frame border\n| *Component*      | *Version*       | *Notes*                                  |\n|------------------+-----------------+------------------------------------------|\n| Operating System | FreeBSD 14.2    | Should work on most Unix-like systems    |\n| Guile            | 3.0.10          | *Minimum 3.0 required*                   |\n| GNU Make         | 4.4.1           | gmake on FreeBSD                         |\n| GNU Grep         | 3.11            | ggrep on FreeBSD                         |\n| GNU Awk          | 5.3.2           | gawk on FreeBSD                          |\n| Direnv           | 2.35.0          | For environment management               |\n| Emacs            | 30.1            | For org-mode processing and documentation 
|\n\n**** Required Guile Modules\n\n- SRFI libraries: srfi-1, srfi-9, srfi-43\n- ice-9 regex\n\n*** Basic Usage\n\n#+begin_src scheme\n;; Load the framework\n(add-to-load-path \"/path/to/attribution-graphs-explorer\")\n(use-modules (attribution-graphs clt transcoder)\n             (attribution-graphs graph attribution)\n             (attribution-graphs circuits discovery))\n\n;; Create a cross-layer transcoder\n(define my-clt (make-clt 5 '(6 7 8) 768 128 768))\n\n;; Generate attribution graph\n(define graph (compute-attribution-graph my-clt \"Example input\" 'last-token))\n\n;; Find and visualize circuits\n(define circuits (find-circuits graph))\n(display (circuit-\u003emermaid circuits graph))\n#+end_src\n\n** Examples\n\nThe repository includes example applications:\n\n*** Poetry Generation Circuit\n\nAnalyzes how transformer models plan rhyming in poetry:\n\n#+begin_src scheme\n(use-modules (attribution-graphs examples poetry-circuit))\n(analyze-poetry-planning model \"Roses are red\\nViolets are \")\n#+end_src\n\n*** Multi-hop Reasoning Circuit\n\nTraces factual recall with intermediate reasoning steps:\n\n#+begin_src scheme\n(use-modules (attribution-graphs examples reasoning-circuit))\n(analyze-multihop-reasoning model \"The capital of the state containing Dallas is\")\n#+end_src\n\n** Development Status\n\nThis project is currently in *alpha* status (version 0.1.0). 
The core framework is implemented and functional, but several components are placeholders for demonstration purposes:\n\n*** Implemented Features\n- Core data structures (CLT, attribution graphs, nodes, edges)\n- Basic mathematical operations (activation functions, matrix operations)\n- Graph construction and manipulation\n- Circuit discovery algorithms\n- Visualization generation (Mermaid diagrams)\n- Example applications (poetry and reasoning circuits)\n- Test framework\n\n*** Limitations\n- Matrix operations use simplified random generation\n- Some functions are placeholders (e.g., =embed-tokens=, =find-edge=)\n- No actual model integration (uses mock CLT instances)\n- Limited validation framework\n- No real-world model examples\n\n*** Future Development\n- Integration with actual transformer models\n- Improved matrix operations and numerical methods\n- Enhanced visualization capabilities\n- More comprehensive test suite\n- Performance optimizations\n- Additional circuit discovery algorithms\n\n** Research Context\n\nThis toolkit builds on recent work in mechanistic interpretability of large language models:\n\n- [[https://transformer-circuits.pub/2025/attribution-graphs/methods.html][Attribution Graphs Methods]] - The core technical approach\n- [[https://transformer-circuits.pub/2025/attribution-graphs/biology.html][Attribution Graphs Biology]] - Case studies applying the method to the internal mechanisms of a production language model\n- [[https://transformer-circuits.pub/][Transformer Circuits]] - Broader context of circuit analysis\n- [[https://distill.pub/2020/circuits/][Thread: Circuits]] - Foundational work on circuit analysis in vision models\n\n** License\n\nMIT License\n\n** Citation\n\nIf you use this toolkit in your research, please cite:\n\n#+begin_src bibtex\n@software{attribution_graphs_explorer,\n  author = {AYGP-DR Research Team},\n  title = {Attribution Graphs Explorer: A Toolkit for Circuit Tracing in Transformer Models},\n  url = 
{https://github.com/aygp-dr/attribution-graphs-explorer},\n  year = {2025},\n}\n#+end_src","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faygp-dr%2Fattribution-graphs-explorer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faygp-dr%2Fattribution-graphs-explorer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faygp-dr%2Fattribution-graphs-explorer/lists"}