https://github.com/zozo123/env-rosetta
Same Wordle env, 4 RL frameworks, 4 parallel islo.dev sandboxes — Rosetta stone for RL env design (built on @adithya-s-k's RL_Envs_101)
- Host: GitHub
- URL: https://github.com/zozo123/env-rosetta
- Owner: zozo123
- Created: 2026-05-11T18:19:30.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-11T18:23:37.000Z (about 2 months ago)
- Last Synced: 2026-05-11T20:26:53.125Z (about 2 months ago)
- Language: Python
- Size: 11.9 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# env-rosetta
**The same Wordle env, four RL frameworks, four parallel [islo.dev](https://islo.dev/) sandboxes.**
> **Project page:** https://zozo123.github.io/env-rosetta-page/
>
> **Writeup:** [`docs/POST.md`](./docs/POST.md)
>
> **Inspired by:** [@adithya_s_k](https://x.com/adithya_s_k)'s [RL_Envs_101](https://github.com/adithya-s-k/RL_Envs_101) — same env reimplemented across 6 frameworks as a "Rosetta stone" for RL env design. This is that Rosetta stone hosted **per-sandbox** on islo.dev instead of one shared HuggingFace Space.
>
> **Sibling:** [unity-loop](https://github.com/zozo123/unity-loop) (Claude-vision tournament over Unity WebGL variants on islo).
---
## What this is
Four islo.dev sandboxes. Each runs Adithya's Wordle env in a different RL framework dialect:
| Sandbox | Framework | Server style | Live URL |
|---|---|---|---|
| `rosetta-openenv` | [OpenEnv](https://github.com/meta-pytorch/OpenEnv) (Meta) | FastAPI + MCP | `cards/openenv.json#share_url` |
| `rosetta-ors` | [ORS](https://github.com/openreward/openreward) (Open Reward Standard) | HTTP | `cards/ors.json#share_url` |
| `rosetta-nemo-gym` | [NeMo Gym](https://github.com/NVIDIA/NeMo-RL) (NVIDIA) | HTTP | `cards/nemo_gym.json#share_url` |
| (in-process) | [Verifiers](https://github.com/willccbb/verifiers) | In-process (no server) | rollout transcript only |
Every server exposes the same MCP/HTTP shape: `guess(word)`, `get_history()`, and `reset_game()`. The target word is seeded per session, so you get the same Wordle word every time you click in. Only the framework idioms behind the curtain differ.
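To make that shared shape concrete, here is a minimal in-memory sketch of the three calls, assuming a per-session integer seed and a `G`/`Y`/`X` feedback string. The `WordleSession` class, the `WORDS` list, and the returned dict fields are hypothetical illustrations, not the code from RL_Envs_101 or any of the framework adapters.

```python
import random
from collections import Counter

# Hypothetical word list for illustration; the real env's vocabulary is not shown here.
WORDS = ["crane", "slate", "pride", "mango", "label"]


def score_guess(guess: str, target: str) -> str:
    """Per-letter feedback: G = right spot, Y = wrong spot, X = absent."""
    feedback = ["X"] * len(target)
    # Target letters that were not matched exactly are available for 'Y' marks.
    remaining = Counter(t for g, t in zip(guess, target) if g != t)
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            feedback[i] = "G"
    for i, g in enumerate(guess):
        if feedback[i] == "X" and remaining[g] > 0:
            feedback[i] = "Y"
            remaining[g] -= 1
    return "".join(feedback)


class WordleSession:
    """Assumed shape of one session: guess(word), get_history(), reset_game()."""

    def __init__(self, seed: int, max_guesses: int = 6) -> None:
        self.seed = seed
        self.max_guesses = max_guesses
        self.reset_game()

    def reset_game(self) -> dict:
        # Re-seeding makes the target reproducible within a session.
        self.target = random.Random(self.seed).choice(WORDS)
        self.history: list[dict] = []
        return {"guesses_left": self.max_guesses}

    def guess(self, word: str) -> dict:
        word = word.strip().lower()
        solved = bool(self.history) and self.history[-1]["word"] == self.target
        if solved or len(self.history) >= self.max_guesses:
            return {"error": "game over; call reset_game()"}
        if len(word) != len(self.target) or not word.isalpha():
            return {"error": f"guess must be {len(self.target)} letters"}
        feedback = score_guess(word, self.target)
        self.history.append({"word": word, "feedback": feedback})
        return {
            "feedback": feedback,
            "solved": word == self.target,
            "guesses_left": self.max_guesses - len(self.history),
        }

    def get_history(self) -> list[dict]:
        return list(self.history)
```

In this sketch, each framework adapter would only need to wrap these three methods in its own server or tool idiom; the game logic itself stays identical.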
## Why islo.dev for this
Adithya's reference deployments live on HuggingFace Spaces (one Space per framework). HF Spaces is fine for static demos, but you don't get a fresh per-trial sandbox, you can't `ssh` into one, and you can't run parallel rollouts without sharing state. **islo.dev gives you cold per-rollout sandboxes**: same code, but you can spin up K parallel workers, each isolated, each a one-line provision.
## Quick start
```bash
bin/env-rosetta provision # spawn 4 islo sandboxes, install deps, start each server, share ports
bin/env-rosetta rollout # run one LLM rollout against each framework, capture transcripts
bin/env-rosetta page # update cards/*.json (which the gh-pages site reads)
```
Or just look at [`cards/`](./cards) — each framework's metadata, share URL, sample code, and a recorded rollout transcript are versioned there.
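If you want to read a card programmatically, a small helper along these lines may be enough. It is a sketch that assumes the `cards/openenv.json#share_url` notation in the table means "the top-level `share_url` key of that JSON file"; the card schema is not documented here, and `read_card_field` is a hypothetical name.

```python
import json
from pathlib import Path
from typing import Any


def read_card_field(ref: str, root: Path = Path(".")) -> Any:
    """Resolve a 'path/to/card.json#key' reference to the value stored under key.

    Assumes the part after '#' names a top-level key in the JSON file.
    """
    path, _, key = ref.partition("#")
    if not key:
        raise ValueError(f"reference {ref!r} has no '#key' part")
    card = json.loads((root / path).read_text(encoding="utf-8"))
    if key not in card:
        raise KeyError(f"{path} has no field {key!r}")
    return card[key]
```

From the repository root, `read_card_field("cards/openenv.json#share_url")` would then return whatever value the card stores under that key.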
## Tier 2 — Jupyter env (the real "islo replaces E2B" story)
Adithya's Jupyter agent env uses `e2b-code-interpreter` for real Python code execution. Porting that to islo is a separate workstream: write an `IsloSandbox` class with the same `run_code` interface as `E2BSandbox`, then plumb it through `envs/jupyter_env//e2b_sandbox.py` in all 4 frameworks. Sketched in [`docs/POST.md#tier-2`](./docs/POST.md#tier-2-jupyter-env--islo-replaces-e2b). The Wordle rosetta in this repo is the foundation; Tier 2 is the killer demo.
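The port described above hinges on the env depending only on a `run_code` interface, not on a specific backend. The sketch below illustrates that idea under loud assumptions: the real `E2BSandbox.run_code` signature and return type are not shown in this README, so `run_code(code: str) -> str` is a guess, and `CodeSandbox`, `LocalSandbox`, and `run_cell` are hypothetical names. `LocalSandbox` is an unisolated stand-in for illustration only; it does not call islo.dev or E2B.

```python
import contextlib
import io
import traceback
from typing import Protocol


class CodeSandbox(Protocol):
    """Assumed interface that any backend (E2B, islo, local) would satisfy."""

    def run_code(self, code: str) -> str: ...


class LocalSandbox:
    """Stand-in backend: executes code in this process. NOT isolated."""

    def __init__(self) -> None:
        # One namespace per sandbox, so variables persist across cells
        # the way they do in a Jupyter kernel.
        self._namespace: dict = {}

    def run_code(self, code: str) -> str:
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            try:
                exec(code, self._namespace)
            except Exception:
                # Return the traceback as output instead of crashing the env.
                buffer.write(traceback.format_exc())
        return buffer.getvalue()


def run_cell(sandbox: CodeSandbox, code: str) -> str:
    """Env-side helper that only relies on the run_code interface."""
    return sandbox.run_code(code)
```

With this shape, swapping backends would mean passing a different object to `run_cell`; an `IsloSandbox` would implement the same method against a remote sandbox instead of a local `exec`.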
## Credits
- Envs and framework adapters — [@adithya-s-k](https://github.com/adithya-s-k/RL_Envs_101)
- Sandbox infra — [islo.dev](https://islo.dev/)
- Pattern — [unity-loop](https://github.com/zozo123/unity-loop) · [pokeloop](https://github.com/zozo123/pokeloop) · [meta-harness-on-islo](https://github.com/zozo123/meta-harness-on-islo)
- Movie — [agentreel](https://github.com/islo-labs/agentreel)
MIT.