https://github.com/humancompatibleai/interpreting-rewards

Experiments in applying interpretability techniques to learned reward functions.

Host: GitHub
URL: https://github.com/humancompatibleai/interpreting-rewards
Owner: HumanCompatibleAI
Created: 2020-05-28T00:19:26.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2020-12-11T04:25:55.000Z (over 5 years ago)
Last Synced: 2025-05-30T03:40:40.578Z (about 1 year ago)
Topics: deep-reinforcement-learning, interpretability, reward-learning
Language: Jupyter Notebook
Homepage:
Size: 6.29 MB
Stars: 10
Watchers: 4
Forks: 1
Open Issues: 0
