Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-interpretability
Awesome tools for interpreting and manipulating the internals of deep neural networks.
https://github.com/wassname/awesome-interpretability
Last synced: 3 days ago
Structured output
- outlines
- jsonformer
- Microsoft Guidance
- lmql.ai
- llama.cpp grammar
- langchain output_parsers
- salute - TypeScript
- clownfish - 2023 Modifying Transformers to Follow a JSON Schema - not updated
- relm - 2023 Regular Expression engine for Language Models - not updated
- Constrained-Text-Generation-Studio
- kor
- lm-format-enforcer - remote APIs
- Promptify
- prob_jsonformer - Jsonformer, but it can output the probability of each choice in a single pass
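
Several of the tools above restrict which tokens a model may emit next, and the prob_jsonformer entry mentions reporting a probability for each allowed choice. A minimal sketch of that idea follows, under loud assumptions: the logits are a hypothetical toy dictionary, every choice is treated as a single token, and `choice_probabilities` is an illustrative helper, not part of any listed library.

```python
import math


def choice_probabilities(logits, choices):
    """Softmax restricted to the allowed choices (illustrative helper)."""
    # Keep only the scores of tokens the schema permits.
    allowed = {c: logits[c] for c in choices}
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(allowed.values())
    exps = {c: math.exp(v - peak) for c, v in allowed.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


# Hypothetical next-token scores; "banana" scores highest but is not allowed.
toy_logits = {"yes": 2.0, "no": 1.0, "maybe": 0.0, "banana": 5.0}
probs = choice_probabilities(toy_logits, ["yes", "no", "maybe"])
print(probs)
```

Real libraries have to handle choices that span several tokens, which this sketch deliberately ignores.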
Explainability, counterfactuals and probing
Mechanistic interpretability
- nnsight
- To customize a model, instead of running it as a function, you run it as a "with" context. Inside the "with" block you can write regular PyTorch to modify the computation.
- Pyvene (intervention focused)
- pyvene tries to be HuggingFace-native, supporting pre-defined interventions or customized interventions (below).
- TransformerLens
- an extremely opinionated toolkit for doing whatever you want to specific models
- BauKit - light, simple, and well loved
- penzai - jax-based, not HuggingFace-native
- Transformer Debugger (OpenAI) - not HuggingFace-native
- Graphpatch - promising but abandoned
- NeuroX
- A tutorial on doing it manually
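
The "with" context pattern described for nnsight can be illustrated without any deep learning library. The sketch below is a toy built on stated assumptions: `ToyModel`, its layer names, and the `intervene` method are all hypothetical and do not reflect the API of nnsight or any other tool listed here. It only shows the general shape of the idea, which is that an edit to an intermediate activation is active inside the block and removed afterwards.

```python
from contextlib import contextmanager


class ToyModel:
    """Hypothetical stand-in for a network: a named sequence of layers."""

    def __init__(self):
        self.layers = {
            "embed": lambda x: [v * 2.0 for v in x],
            "mlp": lambda x: [v + 1.0 for v in x],
            "head": lambda x: sum(x),
        }
        self.edits = {}  # layer name -> function applied to that layer's output

    def __call__(self, x):
        for name, layer in self.layers.items():
            x = layer(x)
            if name in self.edits:
                x = self.edits[name](x)  # apply the registered intervention
        return x

    @contextmanager
    def intervene(self, name, edit):
        """Apply `edit` to the output of layer `name` while the block is active."""
        self.edits[name] = edit
        try:
            yield self
        finally:
            del self.edits[name]  # always restore the unmodified model


model = ToyModel()
baseline = model([1.0, 2.0])

# Zero-ablate the first hidden unit after the "mlp" layer.
with model.intervene("mlp", lambda h: [0.0] + h[1:]) as patched_model:
    patched = patched_model([1.0, 2.0])

restored = model([1.0, 2.0])
print(baseline, patched, restored)
```

In PyTorch-based tools the same effect is usually obtained by hooking module outputs; the dictionary of edits here plays that role in miniature.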