https://github.com/gregwar/mjlab_cartpole
Cartpole minimal example using mjlab as external dependency
- Host: GitHub
- URL: https://github.com/gregwar/mjlab_cartpole
- Owner: Gregwar
- Created: 2025-10-30T09:19:51.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2026-01-30T16:27:56.000Z (5 months ago)
- Last Synced: 2026-01-31T09:40:42.041Z (5 months ago)
- Language: Python
- Homepage:
- Size: 604 KB
- Stars: 13
- Watchers: 0
- Forks: 1
- Open Issues: 1
# Mjlab Cartpole example
This is an example of a simple CartPole environment using [mjlab](https://github.com/mujocolab/mjlab/) for training. This repository serves both as a pedagogical example of mjlab and as an example of how to organize a project that uses mjlab as an external dependency.
## Installing
Simply run:
```
uv sync
```
## Running the training
To run the training, use:
```
uv run train Mjlab-Cartpole
```
## Playing the environment
Run:
```
uv run play Mjlab-Cartpole --checkpoint-file [path-to-checkpoint]
```
The checkpoint is typically written to `logs/rsl_rl/exp1/[date]/model_499.pt`.
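If you do not want to look up the path by hand, a small helper can pick the most recently written checkpoint. This is a minimal sketch, not part of the repository: `latest_checkpoint` is a hypothetical name, and it assumes the `logs/rsl_rl/exp1/[date]/model_*.pt` layout described above.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(log_root: str = "logs/rsl_rl/exp1") -> Optional[Path]:
    """Return the most recently modified model_*.pt under log_root, or None.

    Hypothetical helper: assumes checkpoints live one level below log_root,
    in per-run directories, as in logs/rsl_rl/exp1/[date]/model_499.pt.
    """
    candidates = list(Path(log_root).glob("*/model_*.pt"))
    if not candidates:
        return None
    # Pick the file with the newest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(latest_checkpoint())
```

The printed path can then be passed to `--checkpoint-file`.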
## Organization
The structure is as follows:
* `src/mjlab_cartpole/`
  * `robot/`: Robot model
    * `xmls/cartpole.xml`: MuJoCo MJCF spec
    * `cartpole_constants.py`: Describing how to load the robot entity (mostly loading the XML spec here)
  * `tasks/`: Task definition
    * `__init__.py`: Environments are registered here
    * `cartpole_env_cfg.py`: Environment configurations (actions, rewards, termination, reset etc.)
Here, mjlab is used as an external dependency.
## Environment
### Action
The action is the force $f$ applied to the cart, in newtons.
### Observation
The agent observes the pole angle and angular velocity, and the cart position and velocity.
### Reward
The goal is to keep the pole from falling while bringing the cart to the center. The reward is:
$$
r = 5 \cos(\theta) + \exp\left(-\frac{x^2}{\sigma^2}\right) - 10^{-2} \left(\frac{f}{20}\right)^2
$$
Where:
* $\theta$ is the pole angle (0 is upright)
* $x$ is the cart position
* $\sigma = 0.3$ sets the width of the centering term (how far from the center the cart can be before that term vanishes)
* $f$ is the applied cart force (command)
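
To make the three terms concrete, the reward can be written as a plain Python function. This is a minimal sketch for illustration only, not the code in `cartpole_env_cfg.py`: the name `cartpole_reward` is hypothetical, and it assumes $\theta$ is given in radians.

```python
import math

SIGMA = 0.3  # width of the centering term, as stated in the README


def cartpole_reward(theta: float, x: float, f: float, sigma: float = SIGMA) -> float:
    """Per-step reward; theta in radians (assumed), 0 is upright."""
    upright = 5.0 * math.cos(theta)            # largest when the pole is upright
    centered = math.exp(-(x ** 2) / sigma ** 2)  # 1 at the center, decays with |x|
    effort = 1e-2 * (f / 20.0) ** 2            # small penalty on the applied force
    return upright + centered - effort


if __name__ == "__main__":
    # Upright pole, centered cart, no force: 5 + 1 - 0
    print(cartpole_reward(0.0, 0.0, 0.0))  # 6.0
```

The upright term dominates (up to 5), the centering term adds at most 1, and the effort penalty stays small for moderate forces, so the agent is mainly rewarded for balancing.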
### Termination & truncation
The episode terminates when $|\theta| > 30^\circ$, and is truncated by a timeout after 10 s.
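
The two end-of-episode conditions can be sketched as a small check. This is an illustrative sketch only, not mjlab's API: `episode_status` and the constant names are hypothetical, and $\theta$ is assumed to be in radians.

```python
import math
from typing import Tuple

MAX_ANGLE_RAD = math.radians(30.0)  # termination threshold from the README
EPISODE_LENGTH_S = 10.0             # timeout from the README


def episode_status(theta: float, elapsed_s: float) -> Tuple[bool, bool]:
    """Return (terminated, truncated) for the current step.

    terminated: the pole tilted past the angle limit (failure).
    truncated: the time limit was reached (not a failure).
    """
    terminated = abs(theta) > MAX_ANGLE_RAD
    truncated = elapsed_s >= EPISODE_LENGTH_S
    return terminated, truncated
```

Keeping the two flags separate matters for training: a timeout does not mean the policy failed, so it is usually handled differently from a true termination when bootstrapping value estimates.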
## References
* This is inspired by the [Creating a New Task](https://github.com/mujocolab/mjlab/blob/main/docs/create_new_task.md) guide from mjlab