{"id":29439962,"url":"https://github.com/NVIDIA-NeMo/RL","last_synced_at":"2025-07-13T10:02:04.781Z","repository":{"id":293683489,"uuid":"949546994","full_name":"NVIDIA-NeMo/RL","owner":"NVIDIA-NeMo","description":"Scalable toolkit for efficient model reinforcement","archived":false,"fork":false,"pushed_at":"2025-07-11T06:20:34.000Z","size":13236,"stargazers_count":491,"open_issues_count":165,"forks_count":66,"subscribers_count":54,"default_branch":"main","last_synced_at":"2025-07-11T06:40:50.582Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://docs.nvidia.com/nemo/rl/latest/index.html","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NVIDIA-NeMo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-16T17:43:21.000Z","updated_at":"2025-07-11T01:55:57.000Z","dependencies_parsed_at":"2025-05-30T23:32:41.267Z","dependency_job_id":"c5ad1c32-e4e6-4ab0-8e81-aa863befe55e","html_url":"https://github.com/NVIDIA-NeMo/RL","commit_stats":null,"previous_names":["nvidia/nemo-rl","nvidia-nemo/rl"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/NVIDIA-NeMo/RL","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-NeMo%2FRL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-NeMo%2FRL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-NeMo%2FRL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-NeMo%2FRL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NVIDIA-NeMo","download_url":"https://codeload.github.com/NVIDIA-NeMo/RL/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NVIDIA-NeMo%2FRL/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265122420,"owners_count":23714547,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-13T10:01:38.375Z","updated_at":"2025-07-13T10:02:04.775Z","avatar_url":"https://github.com/NVIDIA-NeMo.png","language":"Python","readme":"# Nemo RL: A Scalable and Efficient Post-Training Library\n\n\u003c!-- markdown all in one --\u003e\n- [Nemo RL: A Scalable and Efficient Post-Training Library](#nemo-rl-a-scalable-and-efficient-post-training-library)\n  - [📣 News](#-news)\n  - [Features](#features)\n  - [Prerequisites](#prerequisites)\n  - [Training Backends](#training-backends)\n  - [GRPO](#grpo)\n    - [GRPO Single Node](#grpo-single-node)\n    - [GRPO Multi-node](#grpo-multi-node)\n      - [GRPO Qwen2.5-32B](#grpo-qwen25-32b)\n      - [GRPO Multi-Turn](#grpo-multi-turn)\n  - 
**NeMo RL** is a scalable and efficient post-training library that scales from a single GPU to thousands, and from tiny models to models with over 100 billion parameters.

What you can expect:

- **Seamless integration with Hugging Face** for ease of use, allowing users to leverage a wide range of pre-trained models and tools.
- **High-performance implementation with Megatron Core**, supporting various parallelism techniques for large models (>100B) and long context lengths.
- **Efficient resource management using Ray**, enabling scalable and flexible deployment across different hardware configurations.
- **Flexibility** with a modular design that allows easy integration and customization.
- **Comprehensive documentation** that is both detailed and user-friendly, with practical examples.

## 📣 News
* [5/14/2025] [Reproduce DeepScaleR with NeMo RL!](docs/guides/grpo-deepscaler.md)
* [5/14/2025] [Release v0.2.1!](https://github.com/NVIDIA-NeMo/RL/releases/tag/v0.2.1)
    * 📊 View the release run metrics on [Google Colab](https://colab.research.google.com/drive/1o14sO0gj_Tl_ZXGsoYip3C0r5ofkU1Ey?usp=sharing) to get a head start on your experimentation.

## Features

✅ _Available now_ | 🔜 _Coming in v0.3_

- ✅ **Fast Generation** - vLLM backend for optimized inference.
- ✅ **HuggingFace Integration** - Works with 1-32B models (Qwen2.5, Llama).
- ✅ **Distributed Training** - Fully Sharded Data Parallel (FSDP) support and Ray-based infrastructure.
- ✅ **Environment Support** - Support for multi-environment training.
- ✅ **Learning Algorithms** - GRPO (Group Relative Policy Optimization), SFT (Supervised Fine-Tuning), and DPO (Direct Preference Optimization).
- ✅ **Multi-Turn RL** - Multi-turn generation and training for RL with tool use, games, etc.
- ✅ **Large Model Support** - Native PyTorch support for models up to 32B parameters.
- ✅ **Advanced Parallelism** - PyTorch native FSDP2, TP, and SP for efficient training.
- ✅ **Worker Isolation** - Process isolation between RL Actors (no worries about global state).
- ✅ **Environment Isolation** - Dependency isolation between components.
- ✅ **(Even) Larger Model Support with Long(er) Sequences** - Advanced parallelism in training with Megatron Core.
- ✅ **Megatron Inference** - (static) Megatron inference for day-0 support for new Megatron models.

- 🔜 **Improved Native Performance** - Improved training time for native PyTorch models.
- 🔜 **MoE Models** - Support for DeepseekV3 and Llama4.
- 🔜 **Megatron Inference** - (dynamic) Megatron inference for fast day-0 support for new Megatron models.

## Prerequisites

Clone **NeMo RL**.
```sh
git clone git@github.com:NVIDIA-NeMo/RL.git nemo-rl
cd nemo-rl

# If you are using the Megatron backend, download the pinned versions of the Megatron-LM and NeMo
# submodules by running the following (not necessary if you are using the pure PyTorch/DTensor path):
git submodule update --init --recursive

# Different branches of the repo can have different pinned versions of these third-party submodules. Ensure
# submodules are automatically updated after switching branches or pulling updates by configuring git with:
# git config submodule.recurse true

# **NOTE**: this setting will not download **new** or remove **old** submodules with the branch's changes.
# You will have to run the full `git submodule update --init --recursive` command in these situations.
```
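If you want to confirm that the submodules were initialized, `git submodule status` (plain git, nothing NeMo RL specific) shows the pinned commit for each submodule:

```sh
# A leading "-" in the output means that submodule has not been initialized yet
git submodule status
```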
If you are using the Megatron backend on bare metal (outside of a container), you may also need to install the cuDNN headers. Here is how to check for them and install them:
```sh
# Check if you have libcudnn installed
dpkg -l | grep cudnn.*cuda

# Find the version you need here: https://developer.nvidia.com/cudnn-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=deb_network
# As an example, these are the "Linux Ubuntu 20.04 x86_64" instructions
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install cudnn-cuda-12
```

Install `uv`.
```sh
# For faster setup and environment isolation, we use `uv`
pip install uv

# Initialize the NeMo RL project virtual environment
# NOTE: Please do not use -p/--python; instead, allow uv venv to read the version from .python-version.
#       This ensures that the version of python used is always what we prescribe.
uv venv

# If working outside a container, it can help to build flash-attn and warm the
# uv cache before your first run. The NeMo RL Dockerfile will warm the uv cache
# with flash-attn. See https://docs.nvidia.com/nemo/rl/latest/docker.html for
# instructions if you are looking for the NeMo RL container.
bash tools/build-flash-attn-in-uv-cache.sh
# If successful, you should see "✅ flash-attn successfully added to uv cache"

# If you cannot install at the system level, you can install for your user with
# pip install --user uv

# Use `uv run` to launch all commands. It handles pip installing implicitly and
# ensures your environment is up to date with our lock file.

# Note that activating the venv is not recommended; use `uv run` instead, since it
# ensures consistent environment usage across different shells and sessions.
# Example: uv run python examples/run_grpo_math.py
```

**Important Notes:**

- Use `uv run <command>` to execute scripts within the managed environment. This helps maintain consistency across different shells and sessions.
- Ensure you have the necessary CUDA drivers and a PyTorch build compatible with your hardware.
- On the first install, `flash-attn` can take a while to build (~45 min with 48 CPU hyperthreads). Once built, it is cached in your `uv` cache dir, making subsequent installs much quicker.
- **Reminder**: Don't forget to set your `HF_HOME`, `WANDB_API_KEY`, and `HF_DATASETS_CACHE` (if needed). You'll need to do a `huggingface-cli login` as well for Llama models. A minimal sketch follows this list.
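Every path and key below is a placeholder to substitute with your own values:

```sh
# Placeholder cache locations and API key; use your own values
export HF_HOME=/path/to/hf_home             # Hugging Face cache for models and tokenizers
export HF_DATASETS_CACHE=/path/to/ds_cache  # only needed if datasets should live elsewhere
export WANDB_API_KEY=your_wandb_api_key     # needed when running with logger.wandb_enabled=True
huggingface-cli login                       # required for gated models such as Llama

# Then launch commands through uv as usual
uv run python examples/run_grpo_math.py
```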
## Training Backends

NeMo RL supports multiple training backends to accommodate different model sizes and hardware configurations:

- **DTensor (FSDP2)** - PyTorch's next-generation distributed training with improved memory efficiency
- **Megatron** - NVIDIA's high-performance training framework for scaling to large models (>100B parameters)

The training backend is automatically determined based on your YAML configuration settings. For detailed information on backend selection, configuration, and examples, see the [Training Backends documentation](docs/design-docs/training-backends.md).
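As an illustration, both invocations below are taken from the GRPO sections that follow; switching backends is purely a matter of which YAML config you pass (the exact keys that enable each backend live in the config files and the linked documentation):

```sh
# DTensor backend: the default config (examples/configs/grpo_math_1B.yaml)
uv run python examples/run_grpo_math.py

# Megatron backend: pass a config that enables it out of the box
uv run python examples/run_grpo_math.py --config examples/configs/grpo_math_1B_megatron.yaml
```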
## GRPO

We provide a reference GRPO experiment configuration, tuned for math benchmarks, that trains on the [OpenMathInstruct-2](https://huggingface.co/datasets/nvidia/OpenMathInstruct-2) dataset.

### GRPO Single Node

To run GRPO on a single GPU for `Qwen/Qwen2.5-1.5B`:

```sh
# Run the GRPO math example using a 1.5B parameter model
uv run python examples/run_grpo_math.py
```

By default, this uses the configuration in `examples/configs/grpo_math_1B.yaml`. You can customize parameters with command-line overrides. For example, to run on 8 GPUs:

```sh
# Run the GRPO math example using a 1.5B parameter model on 8 GPUs
uv run python examples/run_grpo_math.py \
  cluster.gpus_per_node=8
```

You can override any of the parameters listed in the YAML configuration file. For example:

```sh
uv run python examples/run_grpo_math.py \
  policy.model_name="meta-llama/Llama-3.2-1B-Instruct" \
  checkpointing.checkpoint_dir="results/llama1b_math" \
  logger.wandb_enabled=True \
  logger.wandb.name="grpo-llama1b_math" \
  logger.num_val_samples_to_print=10
```

The default configuration uses the DTensor training backend. We also provide the config `examples/configs/grpo_math_1B_megatron.yaml`, which is set up to use the Megatron backend out of the box.

To train using this config on a single GPU:

```sh
# Run a GRPO math example on 1 GPU using the Megatron backend
uv run python examples/run_grpo_math.py \
  --config examples/configs/grpo_math_1B_megatron.yaml
```

For additional details on supported backends and how to configure the training backend to suit your setup, refer to the [Training Backends documentation](docs/design-docs/training-backends.md).

### GRPO Multi-node

```sh
# Run from the root of the NeMo RL repo
NUM_ACTOR_NODES=2

# grpo_math_8B uses the Llama-3.1-8B-Instruct model
COMMAND="uv run ./examples/run_grpo_math.py --config examples/configs/grpo_math_8B.yaml cluster.num_nodes=2 checkpointing.checkpoint_dir='results/llama8b_2nodes' logger.wandb_enabled=True logger.wandb.name='grpo-llama8b_math'" \
CONTAINER=YOUR_CONTAINER \
MOUNTS="$PWD:$PWD" \
sbatch \
    --nodes=${NUM_ACTOR_NODES} \
    --account=YOUR_ACCOUNT \
    --job-name=YOUR_JOBNAME \
    --partition=YOUR_PARTITION \
    --time=4:0:0 \
    --gres=gpu:8 \
    ray.sub
```

The required `CONTAINER` can be built by following the instructions in the [Docker documentation](docs/docker.md).
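Once submitted, the job can be tracked with standard Slurm tooling. These are generic Slurm commands, not NeMo RL specific, and the log file name below is the default Slurm convention; your site or `ray.sub` setup may write logs elsewhere:

```sh
# Check the queue state of your submitted jobs
squeue -u $USER

# Follow the job output once it starts (replace <jobid> with the ID printed by sbatch)
tail -f slurm-<jobid>.out
```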
#### GRPO Qwen2.5-32B

This section outlines how to run GRPO for Qwen2.5-32B with a 16k sequence length.

```sh
# Run from the root of the NeMo RL repo
NUM_ACTOR_NODES=16

# Download Qwen before the job starts to avoid spending time downloading during the training loop
HF_HOME=/path/to/hf_home huggingface-cli download Qwen/Qwen2.5-32B

# Ensure HF_HOME is included in your MOUNTS
HF_HOME=/path/to/hf_home \
COMMAND="uv run ./examples/run_grpo_math.py --config examples/configs/grpo_math_8B.yaml policy.model_name='Qwen/Qwen2.5-32B' policy.generation.vllm_cfg.tensor_parallel_size=4 policy.max_total_sequence_length=16384 cluster.num_nodes=${NUM_ACTOR_NODES} policy.dtensor_cfg.enabled=True policy.dtensor_cfg.tensor_parallel_size=8 policy.dtensor_cfg.sequence_parallel=True policy.dtensor_cfg.activation_checkpointing=True policy.dynamic_batching.train_mb_tokens=16384 policy.dynamic_batching.logprob_mb_tokens=32768 checkpointing.checkpoint_dir='results/qwen2.5-32b' logger.wandb_enabled=True logger.wandb.name='qwen2.5-32b'" \
CONTAINER=YOUR_CONTAINER \
MOUNTS="$PWD:$PWD" \
sbatch \
    --nodes=${NUM_ACTOR_NODES} \
    --account=YOUR_ACCOUNT \
    --job-name=YOUR_JOBNAME \
    --partition=YOUR_PARTITION \
    --time=4:0:0 \
    --gres=gpu:8 \
    ray.sub
```

#### GRPO Multi-Turn

We also support multi-turn generation and training (tool use, games, etc.). Here is a reference example for training an agent to play the Sliding Puzzle Game:

```sh
uv run python examples/run_grpo_sliding_puzzle.py
```

## Supervised Fine-Tuning (SFT)

We provide an example SFT experiment using the [SQuAD dataset](https://rajpurkar.github.io/SQuAD-explorer/).

### SFT Single Node

The default SFT configuration is set to run on a single GPU. To start the experiment:

```sh
uv run python examples/run_sft.py
```

This fine-tunes the `Llama3.2-1B` model on the SQuAD dataset on a single GPU.

To use multiple GPUs on a single node, you can modify the cluster configuration. This adjustment also lets you increase the model and batch size:

```sh
uv run python examples/run_sft.py \
  policy.model_name="meta-llama/Meta-Llama-3-8B" \
  policy.train_global_batch_size=128 \
  sft.val_global_batch_size=128 \
  cluster.gpus_per_node=8
```

Refer to `examples/configs/sft.yaml` for a full list of parameters that can be overridden.

### SFT Multi-node

```sh
# Run from the root of the NeMo RL repo
NUM_ACTOR_NODES=2

COMMAND="uv run ./examples/run_sft.py --config examples/configs/sft.yaml cluster.num_nodes=2 cluster.gpus_per_node=8 checkpointing.checkpoint_dir='results/sft_llama8b_2nodes' logger.wandb_enabled=True logger.wandb.name='sft-llama8b'" \
CONTAINER=YOUR_CONTAINER \
MOUNTS="$PWD:$PWD" \
sbatch \
    --nodes=${NUM_ACTOR_NODES} \
    --account=YOUR_ACCOUNT \
    --job-name=YOUR_JOBNAME \
    --partition=YOUR_PARTITION \
    --time=4:0:0 \
    --gres=gpu:8 \
    ray.sub
```

## DPO

We provide a sample DPO experiment that uses the [HelpSteer3 dataset](https://huggingface.co/datasets/nvidia/HelpSteer3) for preference-based training.

### DPO Single Node

The default DPO experiment is configured to run on a single GPU. To launch the experiment:

```sh
uv run python examples/run_dpo.py
```

This trains `Llama3.2-1B-Instruct` on one GPU.

If you have access to more GPUs, you can update the experiment accordingly. To run on 8 GPUs, we update the cluster configuration and switch to an 8B Llama3.1 Instruct model:

```sh
uv run python examples/run_dpo.py \
  policy.model_name="meta-llama/Llama-3.1-8B-Instruct" \
  policy.train_global_batch_size=256 \
  cluster.gpus_per_node=8
```

Any of the DPO parameters can be customized from the command line. For example:

```sh
uv run python examples/run_dpo.py \
  dpo.sft_loss_weight=0.1 \
  dpo.preference_average_log_probs=True \
  checkpointing.checkpoint_dir="results/llama_dpo_sft" \
  logger.wandb_enabled=True \
  logger.wandb.name="llama-dpo-sft"
```

Refer to `examples/configs/dpo.yaml` for a full list of parameters that can be overridden. For an in-depth explanation of how to add your own DPO dataset, refer to the [DPO documentation](docs/guides/dpo.md).

### DPO Multi-node

For distributed DPO training across multiple nodes, modify the following script for your use case:

```sh
# Run from the root of the NeMo RL repo
## number of nodes to use for your job
NUM_ACTOR_NODES=2

COMMAND="uv run ./examples/run_dpo.py --config examples/configs/dpo.yaml cluster.num_nodes=2 cluster.gpus_per_node=8 dpo.val_global_batch_size=32 checkpointing.checkpoint_dir='results/dpo_llama1b_2nodes' logger.wandb_enabled=True logger.wandb.name='dpo-llama1b'" \
RAY_DEDUP_LOGS=0 \
CONTAINER=YOUR_CONTAINER \
MOUNTS="$PWD:$PWD" \
sbatch \
    --nodes=${NUM_ACTOR_NODES} \
    --account=YOUR_ACCOUNT \
    --job-name=YOUR_JOBNAME \
    --partition=YOUR_PARTITION \
    --time=4:0:0 \
    --gres=gpu:8 \
    ray.sub
```

## Evaluation

We provide evaluation tools to assess model capabilities.

### Convert Model Format (Optional)

If you have trained a model and saved the checkpoint in the PyTorch DCP format, you first need to convert it to the Hugging Face format before running evaluation:

```sh
# Example for a GRPO checkpoint at step 170
uv run python examples/convert_dcp_to_hf.py \
    --config results/grpo/step_170/config.yaml \
    --dcp-ckpt-path results/grpo/step_170/policy/weights/ \
    --hf-ckpt-path results/grpo/hf
```

> **Note:** Adjust the paths according to your training output directory structure.
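With the default layout shown above, each saved step gets its own directory, so a quick way to see which steps are available to convert (assuming the `results/grpo` directory from the example) is:

```sh
# List saved checkpoint steps under the example output directory
ls -d results/grpo/step_*
```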
For an in-depth explanation of checkpointing, refer to the [Checkpointing documentation](docs/design-docs/checkpointing.md).

### Run Evaluation

Run the evaluation script with the converted model:

```sh
uv run python examples/run_eval.py generation.model_name=$PWD/results/grpo/hf
```

Run the evaluation script with custom settings:

```sh
# Example: Evaluation of DeepScaleR-1.5B-Preview on MATH-500 using 8 GPUs
#          Pass@1 accuracy averaged over 16 samples for each problem
uv run python examples/run_eval.py \
    generation.model_name=agentica-org/DeepScaleR-1.5B-Preview \
    generation.temperature=0.6 \
    generation.top_p=0.95 \
    generation.vllm_cfg.max_model_len=32768 \
    data.dataset_name=HuggingFaceH4/MATH-500 \
    data.dataset_key=test \
    eval.num_tests_per_prompt=16 \
    cluster.gpus_per_node=8
```

> **Note:** Evaluation results may vary slightly due to factors such as sampling parameters, random seed, inference engine version, and inference engine settings.
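The two invocations above also compose: after converting your own checkpoint, you can point the custom-settings run at the converted directory. This is a sketch combining the commands shown above; tune the sampling values for your benchmark:

```sh
# Evaluate a converted GRPO checkpoint with custom sampling settings
uv run python examples/run_eval.py \
    generation.model_name=$PWD/results/grpo/hf \
    generation.temperature=0.6 \
    generation.top_p=0.95 \
    eval.num_tests_per_prompt=16 \
    cluster.gpus_per_node=8
```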
Refer to `examples/configs/evals/eval.yaml` for a full list of parameters that can be overridden. For an in-depth explanation of evaluation, refer to the [Evaluation documentation](docs/guides/eval.md).

## Set Up Clusters

For detailed instructions on how to set up and launch NeMo RL on Slurm or Kubernetes clusters, please refer to the dedicated [Cluster Start](docs/cluster.md) documentation.

## Tips and Tricks

- If you forget to initialize the NeMo and Megatron submodules when cloning the NeMo RL repository, you may run into an error like this:

  ```sh
  ModuleNotFoundError: No module named 'megatron'
  ```

  If you see this error, there is likely an issue with your virtual environments. To fix it, first initialize the submodules:

  ```sh
  git submodule update --init --recursive
  ```

  and then force a rebuild of the virtual environments by setting `NRL_FORCE_REBUILD_VENVS=true` the next time you launch a run:

  ```sh
  NRL_FORCE_REBUILD_VENVS=true uv run examples/run_grpo.py ...
  ```

## Citation

If you use NeMo RL in your research, please cite it using the following BibTeX entry:

```bibtex
@misc{nemo-rl,
  title = {NeMo RL: A Scalable and Efficient Post-Training Library},
  howpublished = {\url{https://github.com/NVIDIA-NeMo/RL}},
  year = {2025},
  note = {GitHub repository},
}
```

## Contributing

We welcome contributions to NeMo RL! Please see our [Contributing Guidelines](https://github.com/NVIDIA-NeMo/RL/blob/main/CONTRIBUTING.md) for more information on how to get involved.

## Licenses

NVIDIA NeMo RL is licensed under the [Apache License 2.0](https://github.com/NVIDIA-NeMo/RL/blob/main/LICENSE).