https://github.com/samacqua/LARC

Language-annotated Abstraction and Reasoning Corpus

Host: GitHub
URL: https://github.com/samacqua/LARC
Owner: samacqua
License: other
Created: 2021-01-31T20:19:30.000Z (over 5 years ago)
Default Branch: main
Last Pushed: 2023-05-20T01:12:42.000Z (about 3 years ago)
Last Synced: 2024-11-12T15:43:18.357Z (over 1 year ago)
Topics: arc
Language: JavaScript
Homepage: https://samacquaviva.com/LARC/explore/
Size: 9.28 MB
Stars: 78
Watchers: 4
Forks: 10
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE


# Language-complete Abstraction and Reasoning Corpus (LARC)

This repository contains the LARC dataset and supporting assets.

The entire dataset can be browsed at [the explorer interface](https://samacqua.github.io/LARC/explore). Take a look!

Here's a [quick 5-minute SlidesLive video explaining this work](https://recorder-v3.slideslive.com/#/share?share=75868&s=71a5a42e-dceb-4055-9ae7-9bfccd4ffc1f).

*"How can we build intelligent systems that achieve human-level performance on challenging and structured domains (like ARC), with or without additional human guidance? We posit the answer may be found in studying natural programs - instructions humans give to each other to communicate how to solve a task. Like a computer program, these instructions can be reliably "executed" by others to produce intended outputs."*

A comprehensive view of this dataset and its goals can be found in [Communicating Natural Programs to Humans and Machines (NeurIPS Datasets and Benchmarks, 2022)](https://arxiv.org/abs/2106.07824).

LARC is curated from a communication game in which
one participant, the *describer*, solves an [ARC task](https://github.com/fchollet/ARC) and describes the solution to a different participant,
the *builder*, who must solve the task on a new input using the description alone.
Successful descriptions are "language-complete" in the sense that they fully capture the underlying ARC task in the absence of the original input-output examples.
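
The communication game described above can be illustrated with a small sketch. This is not the repository's collection code: the names `Attempt`, `play_round`, and `toy_builder` are hypothetical, and the toy builder only understands a single instruction. It assumes that a build counts as successful when the builder's grid matches the target output exactly, which is how ARC tasks are scored.

```python
from dataclasses import dataclass
from typing import Callable, List

Grid = List[List[int]]  # ARC grids are 2D arrays of small integer color codes


@dataclass
class Attempt:
    """One builder attempt, given only a description (hypothetical structure)."""
    description: str
    built_output: Grid


def is_successful(attempt: Attempt, expected_output: Grid) -> bool:
    """Assumption: a build succeeds only if the grid matches the target exactly."""
    return attempt.built_output == expected_output


def play_round(description: str, test_input: Grid, expected_output: Grid,
               builder: Callable[[str, Grid], Grid]) -> bool:
    """The builder sees the description and the new input, never the original examples."""
    attempt = Attempt(description, builder(description, test_input))
    return is_successful(attempt, expected_output)


def toy_builder(description: str, grid: Grid) -> Grid:
    """Toy stand-in for a human builder: it can only mirror a grid left to right."""
    if "mirror" in description.lower():
        return [list(reversed(row)) for row in grid]
    return grid


test_input = [[1, 0], [2, 0]]
expected = [[0, 1], [0, 2]]
print(play_round("Mirror the grid left to right.", test_input, expected, toy_builder))
print(play_round("Do something.", test_input, expected, toy_builder))
```

In this sketch the first description is enough for the builder to reproduce the target, so it would count as language-complete for the toy task; the second is too vague and the round fails.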


Citation
```
@article{acquaviva2021communicating,
title={Communicating Natural Programs to Humans and Machines},
author={Acquaviva, Samuel and Pu, Yewen and Kryven, Marta and Wong, Catherine and Ecanow, Gabrielle E and Nye, Maxwell and Sechopoulos, Theodoros and Tessler, Michael Henry and Tenenbaum, Joshua B},
journal={arXiv preprint arXiv:2106.07824},
year={2021}
}
```

The original ARC data can be found here: [The Abstraction and Reasoning Corpus](https://github.com/fchollet/ARC).

## Contents
- `dataset` contains the language-complete ARC tasks and successful natural program phrase annotations
- `explorer` contains the explorer code that allows for easy browsing of the annotated tasks
- `collection` contains the source code used to curate the dataset
- `bandit` contains the formulation and environment for the bandit algorithm used for collection

Language-guided program synthesis code can be found [here](https://github.com/theosech/ec/tree/language-guided_program_synthesis_for_larc).

GPT-4 (vision only) program induction results can be found [here](https://github.com/evanthebouncy/larc_gpt4).

## License

The [dataset](https://github.com/samacqua/LARC/tree/main/dataset) is licensed under the [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

All supporting code follows the [MIT License](https://opensource.org/licenses/MIT).
