# CSCE689 project
## Introduction
Traditional machine learning generally assumes that training and test data are i.i.d. However, when the i.i.d. assumption does not hold, as is common in practice, model performance can drop significantly at test time. Out-of-distribution (OOD) learning focuses on scenarios where the training distribution differs from the test distribution. The omnipresence of the OOD problem across machine learning makes OOD generalization a critical area of research, and reinforcement learning is no exception.
Reinforcement learning (RL) provides a powerful framework for training an agent to take appropriate sequential actions based on observed states. To deploy RL models in the real world without significant OOD performance drops, agents trained in training environments must generalize to unseen test environments. OOD generalization in RL is still an emerging area of research, and many existing works rely on extra data from the test environments. Since the test distribution is usually unknown in real-world applications, an effective generalizable policy that accesses only data from the training environments is urgently needed. One solution to the OOD problem is to discover the factors of variation that affect the environment dynamics, from which a generalizable policy can be learned, as sketched below.
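As a toy illustration of such factors of variation (using Gym's classic `CartPole-v1` purely for illustration; the experiments below use dm_control's cartpole swingup), one can build training environments that differ in a single dynamics parameter and hold out an unseen value for testing:

```python
import gym

def make_cartpole(pole_half_length):
    """Hypothetical helper: Gym's classic-control CartPole exposes its
    dynamics constants (gravity, masspole, length, ...) as attributes,
    so a single factor of variation can be edited directly."""
    env = gym.make("CartPole-v1")
    env.unwrapped.length = pole_half_length  # default is 0.5
    return env

# Training environments cover a few values of the factor...
train_envs = [make_cartpole(l) for l in (0.4, 0.5, 0.6)]
# ...while the test environment uses an unseen, out-of-distribution value.
test_env = make_cartpole(1.0)
```

A policy that latches onto the specific dynamics of the training values will degrade on the held-out environment; a policy that accounts for the underlying factor can generalize.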
## Experiments
The main reinforcement learning experiment can be run from the `reinforcement_learning` directory with:
```bash
MUJOCO_GL=egl DOMAIN=cartpole TASK=swingup SAVEDIR=./save CUDA_VISIBLE_DEVICES=0 python train.py env=cartpole_swingup experiment=cartpole_swingup agent=vrex seed=1 agent.params.penalty_weight=1
```
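Here `agent=vrex` and `agent.params.penalty_weight` suggest a risk-extrapolation objective in the style of V-REx (Krueger et al., 2021), which penalizes the variance of per-environment risks. A minimal PyTorch sketch of that objective follows; it is an assumption about the agent's loss, not this repository's actual implementation:

```python
import torch

def vrex_objective(per_env_losses, penalty_weight):
    """V-REx-style loss: mean risk over training environments plus a
    penalty on the variance of those risks. penalty_weight plays the
    role of the agent.params.penalty_weight flag in the command above."""
    risks = torch.stack(per_env_losses)           # one scalar risk per env
    variance_penalty = risks.var(unbiased=False)  # Var_e[R_e]
    return risks.mean() + penalty_weight * variance_penalty

# With penalty_weight=0 this reduces to ordinary risk minimization;
# larger weights push the per-environment risks toward equality.
risks = [torch.tensor(0.8), torch.tensor(1.2), torch.tensor(1.0)]
loss = vrex_objective(risks, penalty_weight=1.0)
```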