https://ntu-ai4x.github.io/ConceptSeg-R1/
Segment Any Concept via Meta-Reinforcement Learning
concept-segmentation generalized-concept-segmentation image-segmentation image-segmentation-pytorch object-segmentation unified-concept-segmentation unified-image-segmentation
- Host: GitHub
- URL: https://ntu-ai4x.github.io/ConceptSeg-R1/
- Owner: NTU-AI4X
- Created: 2026-05-17T06:50:55.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-06-03T09:27:19.000Z (25 days ago)
- Last Synced: 2026-06-13T14:28:09.160Z (15 days ago)
- Topics: concept-segmentation, generalized-concept-segmentation, image-segmentation, image-segmentation-pytorch, object-segmentation, unified-concept-segmentation, unified-image-segmentation
- Language: Python
- Homepage:
- Size: 32 MB
- Stars: 235
- Watchers: 13
- Forks: 19
- Open Issues: 0
Metadata Files:
- Readme: README.md
ConceptSeg-R1
**ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning**
[Paper (arXiv)](https://arxiv.org/pdf/2605.20385) • [Project Page](https://ntu-ai4x.github.io/ConceptSeg-R1/) • [Model (Hugging Face)](https://huggingface.co/zhaoyuan666/ConceptSeg-R1-7B) • [Dataset (Hugging Face)](https://huggingface.co/datasets/zhaoyuan666/ConceptSeg-Benchmark) • [License](LICENSE) • [GitHub Stars](https://github.com/yuanzhao-CVLAB/ConceptSeg-R1/stargazers)
Introduction • Get Started • Data • Checkpoints
## 📰 News
- **May 2026**: arXiv paper released
## 🗺️ Roadmap
| Status | Item |
|:------:|------|
| ✅ | arXiv paper |
| ✅ | Training code |
| ✅ | Testing code |
| ✅ | CI-CD-CR datasets |
| ✅ | ConceptSeg-R1 (7B weights) |
| ⬜ | Release larger MLLM weights, e.g., ConceptSeg-R1-32B, ConceptSeg-R1-72B |
## Introduction
### As segmentation in computer vision shifts from objects to concepts,
### **ConceptSeg-R1 takes the first step toward segmenting any concept.**
### Key Contributions
- **🌳 From Objects to Concepts**
  We introduce a three-level concept hierarchy covering **CI**, **CD**, and **CR** concepts, pushing segmentation beyond category recognition.
- **From Instance Solving to Rule Induction**
  Meta-GRPO enables the model to infer transferable task rules from visual demonstrations and apply them deductively to unseen queries.
- **Latent Concept Tokens for Frozen SAM 3**
  We map MLLM reasoning states into implicit concept tokens in the SAM 3 prompt space, enabling reasoning-aware segmentation without fine-tuning SAM 3.
- **⚡ From Heavy Reasoning to Adaptive Inference**
  The Shortcut Router dynamically balances SAM 3 efficiency and reasoning depth, enabling fast perception for simple cases and deeper reasoning for complex concepts.
## Results
### Concept Segmentation Benchmarks (CI / CD / CR)
### Cityscapes Performance (Zero-Shot)
### ReasonSeg Performance (Zero-Shot)
### Qualitative Comparison
### Concept Coexistence
## Get Started
### 1. Environment Setup
Before running `setup.sh`, download the release assets below from
[GitHub Releases](https://github.com/yuanzhao-CVLAB/ConceptSeg-R1/releases)
and place them in the repository root:
- `sam3-main.zip`: the modified SAM 3 package used by ConceptSeg-R1.
- `all_meta.json.zip`: the training metadata file.
```bash
conda create -n conceptseg-r1 python=3.10
conda activate conceptseg-r1
bash setup.sh
```
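Because `setup.sh` expects both release assets in the repository root, a quick pre-flight check can save a failed run. The following is a minimal sketch: the file names come from the list above, while the helper `missing_assets` is a hypothetical name introduced here and is not part of the repository.

```python
from pathlib import Path

# Asset names as listed in this README; everything else is illustrative.
REQUIRED_ASSETS = ("sam3-main.zip", "all_meta.json.zip")


def missing_assets(repo_root: str = ".") -> list[str]:
    """Return the required release assets that are not present in repo_root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_ASSETS if not (root / name).is_file()]
```

Calling `missing_assets()` from the repository root returns an empty list once both archives are in place; any names it returns still need to be downloaded from GitHub Releases.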
### 2. Training
**Prepare data**: Download the dataset, extract `all_meta.json` by running `setup.sh`,
and set the `image_folders` path in the training shell scripts.
```bash
# Stage 1: SFT Training
bash run_grpo_multiimage_stage1.sh
# Stage 2: GRPO Training
# Note: set `model_path` to the Stage 1 output checkpoint before running.
# If training hits an unexpected GPU OOM despite sufficient VRAM, try changing
# transformers_version to "4.49.0" in model_path/generation_config.json.
bash run_grpo_multiimage_stage2.sh
```
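The OOM workaround noted in the Stage 2 comment is a one-field edit to `generation_config.json`. It can be scripted as follows; this is a sketch that assumes the file is plain JSON with a top-level `transformers_version` key, and `pin_transformers_version` is a hypothetical helper name rather than anything shipped with the repository.

```python
import json
from pathlib import Path


def pin_transformers_version(model_path: str, version: str = "4.49.0") -> dict:
    """Set `transformers_version` in <model_path>/generation_config.json.

    Returns the updated config so the caller can inspect it.
    """
    config_file = Path(model_path) / "generation_config.json"
    config = json.loads(config_file.read_text(encoding="utf-8"))
    config["transformers_version"] = version  # all other fields are left untouched
    config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `pin_transformers_version("path/to/stage1_checkpoint")` rewrites only that one field before Stage 2 is launched.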
### 3. Evaluation
**Concept Segmentation**: Download weights, set the model path in `eval_conceptseg.sh`, then run:
```bash
bash eval_conceptseg.sh
```
> **Tip:** Configure specific tasks for testing inside `eval_conceptseg.sh`.
**Reasoning Segmentation**: Download weights, set the model path in `eval_reasonseg.sh`, then run:
```bash
bash eval_reasonseg.sh
```
### 4. Inference
**Quick Start**: The `inference.sh` script includes 4 test cases covering different usage scenarios.
```bash
# Test 4 cases
bash run_scripts/inference.sh
```
**Single Example Inference**: For quick testing and demonstration, use the single-example inference script:
```bash
# Or test a specific case
python src/eval/inference_single_example.py \
--model_path "path/to/model" \
--infer_path "path/to/image" \
--question "concept description" \
--output_path "output/path"
```
**Supported Input Modes:**
- **Single Image**: Basic concept segmentation with a text prompt (set `--ref_path` and `--bbox` to empty)
- **Multiple Images**: Reference-guided segmentation with visual reasoning (set `--ref_path`)
- **Bounding Boxes**: Precise reference region specification for complex concepts (set `--bbox`)
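To switch between these modes from a script, the command line can be assembled programmatically. The sketch below uses only the flags named in this README. It assumes that "empty" means passing an empty string, and it treats the `--ref_path` and `--bbox` values as opaque strings because their expected formats are not documented here. `build_inference_command` is a hypothetical helper.

```python
import shlex


def build_inference_command(
    model_path: str,
    infer_path: str,
    question: str,
    output_path: str,
    ref_path: str = "",
    bbox: str = "",
) -> list[str]:
    """Assemble the argv list for src/eval/inference_single_example.py."""
    return [
        "python", "src/eval/inference_single_example.py",
        "--model_path", model_path,
        "--infer_path", infer_path,
        "--question", question,
        "--output_path", output_path,
        "--ref_path", ref_path,  # assumption: empty string means no reference image
        "--bbox", bbox,          # assumption: empty string means no reference box
    ]


# Single-image mode: text prompt only, both reference arguments left empty.
single = build_inference_command(
    "path/to/model", "path/to/image", "concept description", "output/path"
)
print(shlex.join(single))
```

The resulting list can be passed to `subprocess.run(single, check=True)` from the repository root. For the reference-guided modes, supply `ref_path` and, if needed, `bbox`.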
## Data
`all_meta.json` is no longer tracked in this repository. Download
`all_meta.json.zip` from
[GitHub Releases](https://github.com/yuanzhao-CVLAB/ConceptSeg-R1/releases)
and run `bash setup.sh` to extract it before training.
Place datasets under a shared root directory (`image_folders`):
```
root/
├── isic2018/
├── rare/
├── Breast_Tumor/
├── transparent1024/
├── MGrounding-630k/
├── Polyp/
├── Shadow_detection/
├── MIG-Bench/
├── coco2014_Living/
├── CoSOD3k1024/
├── ultra_rare/
├── coco2014_Artifact/
├── fewshot1000/
├── DUTS/
├── ESDIDefects/
└── COD10K1024/
```
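Before launching training or evaluation, it can help to confirm that the shared root matches this layout. The following sketch copies the folder names from the tree above. `missing_datasets` is a hypothetical helper, and whether every folder is needed for a given task is not stated in this README.

```python
from pathlib import Path

# Folder names copied from the directory tree in this README.
EXPECTED_DATASETS = (
    "isic2018", "rare", "Breast_Tumor", "transparent1024",
    "MGrounding-630k", "Polyp", "Shadow_detection", "MIG-Bench",
    "coco2014_Living", "CoSOD3k1024", "ultra_rare", "coco2014_Artifact",
    "fewshot1000", "DUTS", "ESDIDefects", "COD10K1024",
)


def missing_datasets(image_folders: str) -> list[str]:
    """Return the expected dataset folders that are absent under image_folders."""
    root = Path(image_folders)
    return [name for name in EXPECTED_DATASETS if not (root / name).is_dir()]
```

Passing the same path used for `image_folders` in the shell scripts returns an empty list when every folder is present.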
## Metric
Evaluation uses the [PySegMetric_EvalToolkit](https://github.com/Xiaoqi-Zhao-DLUT/PySegMetric_EvalToolkit).
## Datasets & Checkpoints
| Resource | Link |
|----------|------|
| 📦 ConceptSeg-Benchmark Dataset | [Download on HuggingFace](https://huggingface.co/datasets/zhaoyuan666/ConceptSeg-Benchmark) |
| 🤗 ConceptSeg-R1-7B Weights | [Download on HuggingFace](https://huggingface.co/zhaoyuan666/ConceptSeg-R1-7B) |
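Both resources can also be fetched from a script with `huggingface_hub.snapshot_download`. In this sketch the repo IDs are taken from the table above, while the `downloads/` target directory and the helper names are assumptions made for illustration.

```python
# Repo IDs come from the table above; the local directory layout is an assumption.
RESOURCES = (
    {"repo_id": "zhaoyuan666/ConceptSeg-Benchmark", "repo_type": "dataset"},
    {"repo_id": "zhaoyuan666/ConceptSeg-R1-7B", "repo_type": "model"},
)


def download_kwargs(local_root: str = "downloads") -> list[dict]:
    """Build one snapshot_download keyword dict per resource."""
    return [
        {**res, "local_dir": f"{local_root}/{res['repo_id'].split('/')[-1]}"}
        for res in RESOURCES
    ]


def download_all(local_root: str = "downloads") -> list[str]:
    """Download every resource and return the local paths (needs network access)."""
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return [snapshot_download(**kwargs) for kwargs in download_kwargs(local_root)]
```

Calling `download_all()` places the benchmark and the 7B weights in separate subfolders. The returned weights path can then be used as the model path in the evaluation and inference scripts.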
## Acknowledgements
We reference the excellent open-source repos [SAM 3](https://github.com/facebookresearch/sam3), [VLM-R1](https://github.com/om-ai-lab/VLM-R1) and [LENS](https://github.com/hustvl/LENS). Thanks to their authors for the valuable contributions to the community.
## Citation
If you find this work useful, please consider starring ⭐ and citing the repo!
```bibtex
@misc{zhao2026conceptseg,
title={ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning},
author={Yuan Zhao and Youwei Pang and Jiaming Zuo and Wei Ji and Kailai Zhou and Bin Fan and Yunkang Cao and Lihe Zhang and Xiaofeng Liu and Huchuan Lu and Weisi Lin and Dacheng Tao and Xiaoqi Zhao},
year={2026},
eprint={2605.20385},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.20385},
}
```