Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gokulp01/meta-qlearning-humanoid
Meta QLearning experiments to optimize robot walking patterns
- Host: GitHub
- URL: https://github.com/gokulp01/meta-qlearning-humanoid
- Owner: gokulp01
- License: mit
- Created: 2023-08-07T01:46:20.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-21T05:50:25.000Z (5 months ago)
- Last Synced: 2024-08-21T06:52:12.196Z (5 months ago)
- Topics: gym, gym-environment, humanoid, humanoid-robot, humanoid-walking, meta-learning, meta-qlearning, mujoco, mujoco-environments, pybullet, reinforcement-learning, robotics
- Language: Python
- Homepage:
- Size: 32.7 MB
- Stars: 24
- Watchers: 1
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# meta-qlearning-humanoid
Meta QLearning experiments to optimize robot walking patterns
![out](docs/learn_step.gif)

# Overview:
This project implements Meta-Q-Learning (MQL) for optimizing humanoid walking patterns and demonstrates its effectiveness in improving stability, efficiency, and adaptability. It also explores how well Meta-Q-Learning transfers to new tasks with minimal tuning.
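For intuition, here is a minimal sketch of the core idea behind Meta-Q-Learning (Fakoor et al., 2020): condition the critic on a context vector summarizing recent transitions, so that one set of weights can adapt quickly to a new task from off-policy data. Everything below is illustrative and does not mirror this repository's actual code (the real implementation lives in `algs/MQL/mql.py` and `models/networks.py`); names and dimensions are placeholder assumptions.
```
import torch
import torch.nn as nn

# Illustrative only -- not the repository's actual classes.
class ContextEncoder(nn.Module):
    """Summarizes a window of recent (obs, action, reward) tuples into a task context vector."""
    def __init__(self, obs_dim, act_dim, ctx_dim):
        super().__init__()
        self.gru = nn.GRU(obs_dim + act_dim + 1, ctx_dim, batch_first=True)

    def forward(self, recent):                      # recent: (batch, window, obs_dim + act_dim + 1)
        _, h = self.gru(recent)
        return h.squeeze(0)                         # (batch, ctx_dim)

class ContextConditionedQ(nn.Module):
    """Q(s, a, z): a critic that also receives the task context z."""
    def __init__(self, obs_dim, act_dim, ctx_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act, ctx):
        return self.net(torch.cat([obs, act, ctx], dim=-1))

def td_loss(q, q_target, encoder, batch, gamma=0.99):
    """One TD-style critic loss on a batch from a single task (next_act would come from a
    target policy in TD3; it is taken directly from the batch here for brevity)."""
    obs, act, rew, next_obs, next_act, recent = batch   # rew: (batch, 1)
    ctx = encoder(recent)
    with torch.no_grad():
        target = rew + gamma * q_target(next_obs, next_act, ctx)
    return nn.functional.mse_loss(q(obs, act, ctx), target)
```
MQL additionally reuses the meta-training replay buffer during adaptation by reweighting old transitions with a propensity score; that part is omitted here for brevity.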
## Conducted experiments:

### Learn Stepping using MQL
Test how adaptable the humanoid is by performing:
- Side stepping
- Ascending and descending (see the hypothetical footstep-plan sketch below)
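To make the tasks above concrete, stepping behaviours like these are typically specified as a sequence of footstep targets. The snippet below is a purely hypothetical encoding used for illustration; it is not the actual format of `Humanoid_environment/utils/footstep_plans.txt`.
```
# Hypothetical footstep-plan encoding: each target is (x, y, z, yaw) in metres/radians,
# relative to the robot's starting pose. The repo's real plan format may differ.
side_stepping = [
    (0.0, 0.15, 0.0, 0.0),    # shift the swing-foot target sideways
    (0.0, 0.30, 0.0, 0.0),
    (0.0, 0.45, 0.0, 0.0),
]

ascending = [
    (0.25, 0.0, 0.10, 0.0),   # step forward and 10 cm up
    (0.50, 0.0, 0.20, 0.0),
    (0.75, 0.0, 0.30, 0.0),
]
```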
## Setting up the environment:

This repository contains everything needed to set up the environment and get the simulation up and running.

### Clone the repository:
```
git clone git@github.com:gokulp01/meta-qlearning-humanoid.git
```

Make sure the file structure is as follows:
```
├── algs
│   └── MQL
│       ├── buffer.py
│       └── mql.py
├── configs
│   └── abl_envs.json
├── Humanoid_environment
│   ├── envs
│   │   ├── common
│   │   └── jvrc
│   ├── models
│   │   ├── cassie_mj_description
│   │   └── jvrc_mj_description
│   ├── scripts
│   │   ├── debug_stepper.py
│   │   └── plot_logs.py
│   ├── tasks
│   │   ├── __pycache__
│   │   │   ├── rewards.cpython-37.pyc
│   │   │   ├── stepping_task.cpython-37.pyc
│   │   │   └── walking_task.cpython-37.pyc
│   │   ├── rewards.py
│   │   ├── stepping_task.py
│   │   └── walking_task.py
│   └── utils
│       └── footstep_plans.txt
├── misc
│   ├── env_meta.py
│   ├── logger.py
│   ├── runner_meta_offpolicy.py
│   ├── runner_multi_snapshot.py
│   ├── torch_utility.py
│   └── utils.py
├── models
│   ├── networks.py
│   └── run.py
├── README.md
└── run_script.py
```

### Installing packages:
```
pip3 install -r requirements.txt
```

### Training
```
python3 run_script.py
```
### Inference

This work was done as a fun project to learn RL and its applications, so I have not drawn many theoretical conclusions. That said, here are some quantitative results from the work:
![out](docs/graph1.png)
![out](docs/graph2.png)

## References:
Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, and Alex Smola (2020). Meta-Q-Learning. In ICLR 2020; also presented at Microsoft Research Reinforcement Learning Day 2021.

### Some important notes:
- Code is written to train on a GPU (a minimal CPU-fallback sketch follows this list)
- Training time: ~55 hours on an RTX 3080
- **Feel free to contact the author for a pre-trained model**
- The code is not very well documented (PRs are more than welcome!)
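Since the code is PyTorch-based (see `misc/torch_utility.py`) and assumes a CUDA GPU, here is the standard device-fallback pattern for experimenting on a CPU-only machine; the network and tensor shapes below are placeholders, not the repo's models.
```
import torch
import torch.nn as nn

# Sketch only: fall back to CPU when no CUDA device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
policy = nn.Linear(10, 4).to(device)         # placeholder network, not one of the repo's models
obs = torch.randn(1, 10, device=device)      # placeholder observation batch
print("running on", device, "->", policy(obs).shape)
```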