An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by EleutherAI

A curated list of projects in awesome lists by EleutherAI.

https://github.com/eleutherai/lm-evaluation-harness

A framework for few-shot evaluation of language models.

evaluation-framework language-model transformer

Last synced: 09 Sep 2025

https://github.com/EleutherAI/gpt-neo

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

gpt gpt-2 gpt-3 language-model transformers

Last synced: 02 Apr 2025

https://github.com/eleutherai/gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

deepspeed-library gpt-3 language-model transformers

Last synced: 12 May 2025

https://github.com/EleutherAI/lm-evaluation-harness

A framework for few-shot evaluation of language models.

evaluation-framework language-model transformer

Last synced: 23 Mar 2025

https://github.com/EleutherAI/gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

deepspeed-library gpt-3 language-model transformers

Last synced: 27 Mar 2025

https://github.com/eleutherai/pythia

The hub for EleutherAI's work on interpretability and learning dynamics

Last synced: 13 May 2025

https://github.com/EleutherAI/pythia

The hub for EleutherAI's work on interpretability and learning dynamics

Last synced: 26 Mar 2025

https://github.com/eleutherai/the-pile

Last synced: 06 Oct 2025

https://github.com/EleutherAI/the-pile

Last synced: 16 Apr 2025

https://github.com/eleutherai/math-lm

Last synced: 16 May 2025

https://github.com/EleutherAI/math-lm

Last synced: 09 May 2025

https://github.com/eleutherai/cookbook

Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

Last synced: 15 May 2025

https://github.com/EleutherAI/cookbook

Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

Last synced: 27 Mar 2025

https://github.com/eleutherai/sparsify

Sparsify transformers with SAEs and transcoders

Last synced: 15 May 2025

https://github.com/eleutherai/polyglot

Polyglot: Large Language Models of Well-balanced Competence in Multi-languages

Last synced: 28 Jan 2026

https://github.com/EleutherAI/polyglot

Polyglot: Large Language Models of Well-balanced Competence in Multi-languages

Last synced: 03 Apr 2025

https://github.com/EleutherAI/DALLE-mtf

OpenAI's DALL-E for large scale training in mesh-tensorflow.

artificial-intelligence autoregressive multimodal text-to-image transformers variational-autoencoder

Last synced: 19 Jul 2025

https://github.com/eleutherai/dalle-mtf

OpenAI's DALL-E for large scale training in mesh-tensorflow.

artificial-intelligence autoregressive multimodal text-to-image transformers variational-autoencoder

Last synced: 05 Apr 2025

https://github.com/EleutherAI/sparsify

Sparse autoencoders

Last synced: 18 Oct 2025

https://github.com/eleutherai/concept-erasure

Erasing concepts from neural representations with provable guarantees

Last synced: 04 Apr 2025

https://github.com/EleutherAI/concept-erasure

Erasing concepts from neural representations with provable guarantees

Last synced: 24 Mar 2025

https://github.com/eleutherai/elk

Keeping language models honest by directly eliciting knowledge encoded in their activations.

Last synced: 12 Apr 2025

https://github.com/eleutherai/oslo

OSLO: Open Source for Large-scale Optimization

Last synced: 08 Mar 2026

https://github.com/eleutherai/deeperspeed

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

Last synced: 14 Jan 2026

https://github.com/eleutherai/delphi

Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models know themselves through automated interpretability.

Last synced: 04 Apr 2025

https://github.com/eleutherai/knowledge-neurons

A library for finding knowledge neurons in pretrained transformer models.

interpretability transformers

Last synced: 07 Nov 2025

https://github.com/EleutherAI/knowledge-neurons

A library for finding knowledge neurons in pretrained transformer models.

interpretability transformers

Last synced: 08 May 2025

https://github.com/EleutherAI/delphi

Last synced: 18 Oct 2025

https://github.com/EleutherAI/pyfra

Python Research Framework

Last synced: 08 May 2025

https://github.com/eleutherai/pyfra

Python Research Framework

Last synced: 15 Oct 2025

https://github.com/EleutherAI/dps

Data processing system for polyglot

Last synced: 26 Jan 2026

https://github.com/EleutherAI/nanoGPT-mup

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Last synced: 27 Jan 2026

https://github.com/EleutherAI/stackexchange-dataset

Python tools for processing the stackexchange data dumps into a text dataset for Language Models

Last synced: 22 Jul 2025

https://github.com/eleutherai/improved-t5

Experiments for efforts to train a new and improved t5

Last synced: 25 Feb 2026

https://github.com/eleutherai/stackexchange-dataset

Python tools for processing the stackexchange data dumps into a text dataset for Language Models

Last synced: 24 Apr 2025

https://github.com/eleutherai/project-menu

See the issue board for the current status of active and prospective projects!

Last synced: 01 Mar 2026

https://github.com/eleutherai/dps

Data processing system for polyglot

Last synced: 24 Apr 2025

https://github.com/eleutherai/magicarp

One stop shop for all things carp

Last synced: 24 Apr 2025

https://github.com/eleutherai/aria-amt

Efficient and robust implementation of seq-to-seq automatic piano transcription.

Last synced: 04 Oct 2025

https://github.com/EleutherAI/aria

Last synced: 14 Jul 2025

https://github.com/eleutherai/tqdm-multiprocess

Using queues, tqdm-multiprocess supports multiple worker processes, each with multiple tqdm progress bars, displaying them cleanly through the main process. It offers similar functionality for python logging.

Last synced: 24 Apr 2025

https://github.com/eleutherai/aria

Last synced: 24 Apr 2025

https://github.com/eleutherai/rnngineering

Engineering the state of RNN language models (Mamba, RWKV, etc.)

Last synced: 14 Aug 2025

https://github.com/eleutherai/hae-rae

Last synced: 11 Feb 2026

https://github.com/eleutherai/mp_nerf

Massively-Parallel Natural Extension of Reference Frame

Last synced: 15 Jul 2025

https://github.com/eleutherai/bergson

Mapping out the "memory" of neural nets with data attribution

Last synced: 10 Apr 2026

https://github.com/eleutherai/features-across-time

Understanding how features learned by neural networks evolve throughout training

Last synced: 24 Apr 2025

https://github.com/eleutherai/polyglot-data

Data-related codebase for the Polyglot project

Last synced: 06 Oct 2025

https://github.com/eleutherai/best-download

URL downloader supporting checkpointing and continuous checksumming.

Last synced: 24 Apr 2025

https://github.com/eleutherai/elk-generalization

Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from easy questions to hard

Last synced: 24 Apr 2025

https://github.com/eleutherai/pile_dedupe

Pile Deduplication Code

Last synced: 14 Apr 2025

https://github.com/eleutherai/text-generation-testing-ui

Web app for demoing the EAI models

Last synced: 12 Mar 2026

https://github.com/eleutherai/mdl

Minimum Description Length probing for neural network representations

Last synced: 24 Apr 2025

https://github.com/eleutherai/tokengrams

Efficiently computing & storing token n-grams from large corpora

Last synced: 17 Jul 2025

https://github.com/eleutherai/pilev2

Last synced: 24 Apr 2025

https://github.com/eleutherai/equivariance

A framework for implementing equivariant DL

Last synced: 15 Aug 2025

https://github.com/eleutherai/radioactive-lab

Adapting the "Radioactive Data" paper to work for text models

Last synced: 24 Apr 2025

https://github.com/eleutherai/pile-uspto

A script for collecting the USPTO Backgrounds dataset in a language modelling friendly format.

Last synced: 11 Jun 2025

https://github.com/eleutherai/tagged-pile

Part-of-Speech Tagging for the Pile and RedPajama

Last synced: 02 Mar 2026

https://github.com/eleutherai/llemma-sample-explorer

Sample explorer tool for the Llemma models.

Last synced: 02 Feb 2026

https://github.com/eleutherai/pile-pubmedcentral

A script for collecting the PubMed Central dataset in a language modelling friendly format.

Last synced: 24 Apr 2025

https://github.com/eleutherai/djinn

Generating, validating and running exploitable verifiable coding problems

Last synced: 17 Feb 2026

https://github.com/eleutherai/tyche

Precisely estimating the volume of basins in neural net parameter space corresponding to interpretable behaviors

Last synced: 30 Apr 2025

https://github.com/eleutherai/minetest-baselines

Baseline agents for Minetest tasks.

Last synced: 24 Apr 2025

https://github.com/eleutherai/minetest-interpretabilty-notebook

Jupyter notebook for the interpretability section of the minetester blog post

Last synced: 18 Oct 2025

https://github.com/eleutherai/pile-literotica

Download, parse, and filter data from Literotica. Data-ready for The-Pile.

Last synced: 24 Apr 2025

https://github.com/eleutherai/pile-cc-filtering

The code used to filter CC data for The Pile

Last synced: 14 Apr 2025

https://github.com/eleutherai/codecarp

Data collection pipeline for CodeCARP. Includes PyCharm plugins.

Last synced: 25 Feb 2025

https://github.com/eleutherai/architecture-experiments

Repository to host architecture experiments and development using Paxml and Praxis

Last synced: 24 Apr 2025

https://github.com/eleutherai/visual-grounding

Visually ground GPT-Neo 1.3b and 2.7b

Last synced: 26 Jul 2025

https://github.com/eleutherai/pile-explorer

For exploring the data and documenting its limitations

Last synced: 10 Jul 2025

https://github.com/eleutherai/thonkenizers

yes

Last synced: 12 Mar 2026

https://github.com/eleutherai/eleutherai.github.io

This is the Hugo generated website for eleuther.ai. The source of this build is new-website repo.

Last synced: 14 Feb 2026

https://github.com/eleutherai/website

New website for EleutherAI based on Hugo static site generator

Last synced: 25 Feb 2025

https://github.com/eleutherai/lm-scope

Last synced: 24 Apr 2025

https://github.com/EleutherAI/alignment-handbook

Robust recipes to align language models with human and AI preferences

Last synced: 04 Mar 2026

https://github.com/eleutherai/optax-galore

Adds GaLore style projection wrappers to optax optimizers

Last synced: 14 Apr 2025

https://github.com/eleutherai/pile-allpoetry

Scraper to gather poems from allpoetry.com

Last synced: 14 Apr 2025

https://github.com/eleutherai/evilmodel

A replication of "EvilModel 2.0: Bringing Neural Network Models into Malware Attacks"

Last synced: 19 Mar 2026

https://github.com/eleutherai/variance-across-time

Studying the variance in neural net predictions across training time

Last synced: 14 Feb 2026

https://github.com/eleutherai/eai-prompt-gallery

Library of interesting prompt generations

Last synced: 12 May 2026

https://github.com/eleutherai/isaac-mchorse

EleutherAI's discord bot

Last synced: 15 Jun 2025

https://github.com/eleutherai/pile-ubuntu-irc

A script for collecting the Ubuntu IRC dataset in a language modelling friendly format.

Last synced: 19 Nov 2025

https://github.com/eleutherai/bucket-cleaner

A small utility to clear out old model checkpoints in Google Cloud Buckets whilst keeping tensorboard event files

Last synced: 23 Aug 2025

https://github.com/eleutherai/latent-video-diffusion

Latent video diffusion

Last synced: 24 Apr 2025

https://github.com/eleutherai/sae_overlap

Accompanying code for our research on SAE feature overlap when trained on different seeds.

Last synced: 06 Oct 2025

https://github.com/eleutherai/ccs

Last synced: 24 Apr 2025

https://github.com/eleutherai/attention-probes

Linear probes with attention weighting

Last synced: 06 Sep 2025

https://github.com/eleutherai/pile-enron-emails

A script for collecting the Enron Emails dataset in a language modelling friendly format.

Last synced: 24 Apr 2025