https://github.com/humancompatibleai/interpreting-rewards
Experiments in applying interpretability techniques to learned reward functions.
deep-reinforcement-learning interpretability reward-learning
Last synced: 11 months ago
- Host: GitHub
- URL: https://github.com/humancompatibleai/interpreting-rewards
- Owner: HumanCompatibleAI
- Created: 2020-05-28T00:19:26.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-12-11T04:25:55.000Z (over 5 years ago)
- Last Synced: 2025-05-30T03:40:40.578Z (about 1 year ago)
- Topics: deep-reinforcement-learning, interpretability, reward-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 6.29 MB
- Stars: 10
- Watchers: 4
- Forks: 1
- Open Issues: 0