https://github.com/google/arc-gen
A Mimetic Procedural Benchmark Generator for the Abstraction and Reasoning Corpus
arc-agi artificial-intelligence benchmark-generator program-synthesis
- Host: GitHub
- URL: https://github.com/google/arc-gen
- Owner: google
- License: apache-2.0
- Created: 2025-01-22T18:14:49.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-04-18T23:21:42.000Z (9 days ago)
- Last Synced: 2026-04-26T04:31:08.887Z (1 day ago)
- Topics: arc-agi, artificial-intelligence, benchmark-generator, program-synthesis
- Language: Python
- Homepage:
- Size: 47.6 MB
- Stars: 42
- Watchers: 2
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
This repository contains the source code for *ARC-GEN*, a mimetic procedural benchmark generator for the Abstraction and Reasoning Corpus.
For a more in-depth description of this work, see the [corresponding paper on arXiv](https://arxiv.org/abs/2511.00162).
## News
* `2026-04-04`: ARC-GEN to be used as the official benchmark generator in the [IJCAI-ECAI 2026 NeuroGolf Challenge](https://sites.google.com/view/neurogolf-2026/home).
* `2026-03-25`: ARC-GEN now supports 500 additional tasks from [ARC-AGI-2](https://arcprize.org/arc-agi/2).
* `2025-10-31`: An ARC-GEN overview is now available on [arXiv](https://arxiv.org/abs/2511.00162).
* `2025-07-31`: ARC-GEN to be used as the official benchmark generator in the [2025 Google Code Golf Championship](https://www.kaggle.com/competitions/google-code-golf-2025).
* `2025-05-15`: Initial ARC-GEN repository committed to GitHub.
## Installation
```
$ git clone --recurse-submodules https://github.com/google/ARC-GEN.git && cd ARC-GEN
```
## Usage
For **benchmark generation**, use the `generate` command with two arguments: the task ID and the desired number of example pairs.
```
$ python3 arc_gen.py generate 1e0a9b12 1000
[{'input': [[4, 0, 0, 0], [0, 0, 0, 0], [4, 0, 8, 0], [0, 3, 8, 0]], 'output': ...
```
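The sample output above looks like a Python literal (single-quoted keys) rather than strict JSON. Assuming that is how `generate` prints its result, a minimal sketch for loading it back into Python could use `ast.literal_eval`. The helper name `parse_examples` is hypothetical and not part of ARC-GEN, and the output grid in the sample string is a made-up placeholder, since the real one is elided above.

```python
import ast


def parse_examples(text):
    """Parse a printed list of example pairs and sanity-check each grid.

    Hypothetical helper, not part of ARC-GEN. Assumes the text is a Python
    literal: a list of dicts, each with 'input' and 'output' grids.
    """
    examples = ast.literal_eval(text)
    for pair in examples:
        for key in ("input", "output"):
            for row in pair[key]:
                for cell in row:
                    # ARC grids use the ten color indices 0 through 9.
                    if not isinstance(cell, int) or not 0 <= cell <= 9:
                        raise ValueError(f"bad cell {cell!r} in {key} grid")
    return examples


# Input grid copied from the sample above; the output grid is a placeholder.
sample = (
    "[{'input': [[4, 0, 0, 0], [0, 0, 0, 0], [4, 0, 8, 0], [0, 3, 8, 0]],"
    " 'output': [[0, 0, 0, 0]]}]"
)
examples = parse_examples(sample)
print(len(examples), len(examples[0]["input"]))
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for text captured from a command's output.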
For **validation** (i.e., to ensure that the ARC-GEN generators can collectively reproduce the original [ARC-AGI-1](https://github.com/fchollet/ARC-AGI) benchmark suite), use the `validate` command:
```
$ python3 arc_gen.py validate
A total of 400 generators passed.
A total of 0 generators failed.
```
For an example of customized **variations**, refer to [arc_gen_variations.py](https://github.com/google/ARC-GEN/blob/main/arc_gen_variations.py), which produces two variations on [Task #125](https://arcprize.org/play?task=543a7ed5):
```
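# Assumption: run from the repository root, so that the `common` and
# `task_list` modules referenced below can be imported.
import common
import task_list

# Look up the generator for Task #125 by its task ID.
generator, _ = task_list.task_list().get("543a7ed5")
examples = []
# Two examples of a "large" variation on Task #125.
examples.extend([generator(boxes=8, size=28) for _ in range(2)])
# Two examples of a "large + inverted" variation on Task #125.
common.set_colors([0, 1, 2, 6, 8, 5, 3, 7, 4, 9])
examples.extend([generator(boxes=8, size=28) for _ in range(2)])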
```
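The list passed to `common.set_colors` is a permutation of the color indices 0 through 9. How ARC-GEN applies it internally is not shown here. Purely as an illustration, and assuming the list maps each original color index to its replacement, the standalone sketch below shows which colors that permutation would change. The function `remap_colors` is hypothetical and not part of ARC-GEN.

```python
def remap_colors(grid, palette):
    """Return a copy of grid with each color index c replaced by palette[c].

    Hypothetical helper for illustration; ARC-GEN's own handling may differ.
    """
    return [[palette[c] for c in row] for row in grid]


palette = [0, 1, 2, 6, 8, 5, 3, 7, 4, 9]

# Collect the colors whose index differs from the value it maps to.
changed = {c: p for c, p in enumerate(palette) if c != p}
print(changed)  # {3: 6, 4: 8, 6: 3, 8: 4}

print(remap_colors([[3, 4], [6, 8]], palette))  # [[6, 8], [3, 4]]
```

Under that reading, the permutation swaps colors 3 and 6 and colors 4 and 8, leaving the other six colors unchanged.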
## The ARC-GEN-100K Dataset
For those seeking a pre-generated dataset of example pairs, the link below provides a static benchmark suite containing 100,000 examples produced by ARC-GEN (covering all 400 tasks):
https://www.kaggle.com/datasets/arcgen100k/the-arc-gen-100k-dataset
## How to Cite?
```
@misc{Moffitt2025,
title={{ARC-GEN: A Mimetic Procedural Benchmark Generator for the Abstraction and Reasoning Corpus}},
author={Michael D. Moffitt},
year={2025},
eprint={2511.00162},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2511.00162},
}
```
## Other Resources
* [RE-ARC: Reverse-Engineering the Abstraction and Reasoning Corpus](https://github.com/michaelhodel/re-arc) by Michael Hodel
* [Bootstrapping ARC: Synthetic Problem Generation for ARC Visual Reasoning Tasks](https://github.com/xu3kev/BARC) by Wen-Ding Li and others