Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with sparse-autoencoder

A curated list of projects in awesome lists tagged with sparse-autoencoder.

https://github.com/explanare/ravel

Evaluate interpretability methods on localizing and disentangling concepts in LLMs.

causal-intervention disentangled-representations interpretability intervention probing sparse-autoencoder

Last synced: 15 Nov 2024

https://github.com/paulpauls/llama3_interpretability_sae

A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.

feature-extraction feature-steering llama3 llm-interpretability open-research pytorch sparse-autoencoder

Last synced: 21 Nov 2024

https://github.com/seonglae/emgsd-hermes

Steering GPT2-EMGSD to be less biased, and generating stereotyped text with vanilla GPT2, without fine-tuning or prompt engineering.

bias-correction bias-mitigation emgsd gpt2 sparse-autoencoder steering-vector stereotype

Last synced: 12 Dec 2024