Projects in Awesome Lists by HumanCompatibleAI
A curated list of projects in awesome lists by HumanCompatibleAI.
https://github.com/humancompatibleai/imitation
Clean PyTorch implementations of imitation and reward learning algorithms
gymnasium imitation-learning inverse-reinforcement-learning reward-learning
Last synced: 13 May 2025
https://github.com/humancompatibleai/overcooked_ai
A benchmark environment for fully cooperative human-AI performance.
artificial-intelligence deep-learning machine-learning pytorch reinforcement-learning
Last synced: 11 Oct 2025
https://github.com/humancompatibleai/human_aware_rl
Code for "On the Utility of Learning about Humans for Human-AI Coordination"
Last synced: 01 Aug 2025
https://github.com/humancompatibleai/evaluating-rewards
Library to compare and evaluate reward functions
Last synced: 07 Aug 2025
https://github.com/humancompatibleai/tensor-trust
A prompt injection game to collect data for robust ML research
ctf django game htmx jailbreaks large-language-models llm llms prompt-engineering prompt-injection prompting security
Last synced: 10 Oct 2025
https://github.com/humancompatibleai/overcooked-demo
Web application where humans can play Overcooked with AI agents.
Last synced: 17 Aug 2025
https://github.com/humancompatibleai/seals
Benchmark environments for reward modelling and imitation learning algorithms.
Last synced: 25 Jun 2025
https://github.com/humancompatibleai/rlsp
Reward Learning by Simulating the Past
Last synced: 24 Jun 2025
https://github.com/humancompatibleai/eirli
An Empirical Investigation of Representation Learning for Imitation (EIRLI), NeurIPS'21
imitation-learning machine-learning pytorch representation-learning self-supervised-learning
Last synced: 24 Jun 2025
https://github.com/humancompatibleai/learning-from-human-preferences
Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"
Last synced: 25 Jun 2025
https://github.com/humancompatibleai/ranking-challenge
Testing ranking algorithms to improve social cohesion
Last synced: 25 Jun 2025
https://github.com/humancompatibleai/population-irl
(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards
Last synced: 24 Jul 2025
https://github.com/humancompatibleai/deep-rlsp
Code accompanying "Learning What To Do by Simulating the Past", ICLR 2021.
Last synced: 25 Jun 2025
https://github.com/humancompatibleai/learning_biases
Infer how suboptimal agents are suboptimal while planning, for example if they are hyperbolic time discounters.
Last synced: 27 Feb 2026
https://github.com/humancompatibleai/tensor-trust-data
Dataset for the Tensor Trust project
Last synced: 14 Mar 2026
https://github.com/humancompatibleai/overcooked-hai-exp
Overcooked-AI Experiment Psiturk Demo (for MTurk experiments)
Last synced: 10 Aug 2025
https://github.com/humancompatibleai/better-adversarial-defenses
Training in bursts for defending against adversarial policies
adversarial-examples adversarial-policies gym multiagent-reinforcement-learning population-based-training ray reinforcement-learning rllib stable-baselines tensorflow2
Last synced: 11 Jun 2025
https://github.com/humancompatibleai/interpreting-rewards
Experiments in applying interpretability techniques to learned reward functions.
deep-reinforcement-learning interpretability reward-learning
Last synced: 25 Jun 2025
https://github.com/humancompatibleai/nn-clustering-pytorch
Checking the divisibility of neural networks, and investigating the nature of the pieces networks can be divided into.
Last synced: 24 Jun 2025
https://github.com/humancompatibleai/reward-preprocessing
Preprocessing reward functions to make them more interpretable
Last synced: 24 Jun 2025
https://github.com/humancompatibleai/recon-email
Script for automatically creating the reconnaissance email.
Last synced: 29 Jan 2026
https://github.com/humancompatibleai/derail
Supporting code for diagnostic seals paper
Last synced: 24 Jun 2025
https://github.com/humancompatibleai/ranking-challenge-perspective
Prosocial Ranking Challenge Perspective Ranker
Last synced: 24 Jun 2025
https://github.com/humancompatibleai/logical-active-classification
Use active learning to classify data represented as boundaries of regions in parameter space where a parametrised logical formula holds.
Last synced: 24 Jun 2025
https://github.com/humancompatibleai/simulation-awareness
(Experimental) RL agents should be more aligned if they do not know whether they are in simulation or in the real world
Last synced: 31 Jul 2025
https://github.com/humancompatibleai/katago-driver-bug-repro
Docker files to help reproduce bug described in https://forums.developer.nvidia.com/t/kernel-oops-null-pointer-dereference-when-closing-cuda-application-katago/211270/3
Last synced: 02 Feb 2026
https://github.com/humancompatibleai/sgf-viewer
A simple webpage that can visualize an SGF string encoded as a URL fragment.
Last synced: 21 Jul 2025
https://github.com/humancompatibleai/interactive-behaviour-design-baselines
Last synced: 27 Jan 2026
https://github.com/humancompatibleai/slack-diskbot
Low disk space alerts posted to Slack
Last synced: 25 Jun 2025
https://github.com/humancompatibleai/interactive-behaviour-design-basicfetch
Last synced: 25 Jun 2025