
# DeTi*k*Zify
Synthesizing Graphics Programs for Scientific Figures and Sketches with Ti*k*Z
[![OpenReview](https://img.shields.io/badge/View%20on%20OpenReview-8C1B13?labelColor=gray&logo=)](https://openreview.net/forum?id=bcVLFQCOjc)
[![arXiv](https://img.shields.io/badge/View%20on%20arXiv-B31B1B?logo=arxiv&labelColor=gray)](https://arxiv.org/abs/2405.15306)
[![Hugging Face](https://img.shields.io/badge/View%20on%20Hugging%20Face-blue?labelColor=gray&logo=)](https://huggingface.co/collections/nllg/detikzify-664460c521aa7c2880095a8b)
[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1hPWqucbPGTavNlYvOBvSNBAwdcPZKe8F)

Creating high-quality scientific figures can be time-consuming and challenging,
even though sketching ideas on paper is relatively easy. Furthermore,
recreating existing figures that are not stored in formats preserving semantic
information is equally complex. To tackle this problem, we introduce
[DeTi*k*Zify](https://github.com/potamides/DeTikZify), a novel multimodal
language model that automatically synthesizes scientific figures as
semantics-preserving [Ti*k*Z](https://github.com/pgf-tikz/pgf) graphics
programs based on sketches and existing figures. We also introduce an
MCTS-based inference algorithm that enables DeTi*k*Zify to iteratively refine
its outputs without the need for additional training.

https://github.com/potamides/DeTikZify/assets/53401822/203d2853-0b5c-4a2b-9d09-3ccb65880cd3

## News
* **2024-12-05**: We release [DeTi*k*Zify v2
(8b)](https://huggingface.co/nllg/detikzify-v2-8b), our latest model, which
surpasses all previous versions in our evaluation, and make it the new default
model in our [Hugging Face
Space](https://huggingface.co/spaces/nllg/DeTikZify). Check out the [model
card](https://huggingface.co/nllg/detikzify-v2-8b-preview#model-card-for-detikzifyv2-8b)
for more information.
* **2024-09-24**: DeTi*k*Zify was accepted at [NeurIPS
2024](https://neurips.cc/Conferences/2024) as a [spotlight
paper](https://neurips.cc/virtual/2024/poster/94474)!

## Installation

> [!TIP]
> If you encounter difficulties with installation and inference on your own
> hardware, consider visiting our [Hugging Face
> Space](https://huggingface.co/spaces/nllg/DeTikZify) (restarting the space
> might take a few minutes). Should you experience long queues, you have the
> option to
> [duplicate](https://huggingface.co/spaces/nllg/DeTikZify?duplicate=true) it
> with a paid private GPU runtime for a more seamless experience. Additionally,
> you can try our demo on [Google
> Colab](https://colab.research.google.com/drive/1hPWqucbPGTavNlYvOBvSNBAwdcPZKe8F).
> However, setting up the environment there might take some time, and the free
> tier only supports inference for the 1b models.

The Python package of DeTi*k*Zify can be easily installed using
[pip](https://pip.pypa.io/en/stable):
```sh
pip install 'detikzify[legacy] @ git+https://github.com/potamides/DeTikZify'
```
The `[legacy]` extra is only required if you plan to use the
DeTi*k*Zify v1 models. If you only plan to use
DeTi*k*Zify v2, you can remove it. If your goal is to run the included
[examples](examples), it is easier to clone the repository and install it in
editable mode like this:
```sh
git clone https://github.com/potamides/DeTikZify
pip install -e 'DeTikZify[examples]'
```
In addition, DeTi*k*Zify requires a full
[TeX Live 2023](https://www.tug.org/texlive) installation,
[ghostscript](https://www.ghostscript.com), and
[poppler](https://poppler.freedesktop.org), which you have to install through
your package manager or via other means.
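
Before running inference, it can help to confirm that these external tools are
visible on your `PATH`. The following is a minimal sketch and not part of
DeTi*k*Zify: the executable names (`pdflatex`, `gs`, `pdftoppm`) are assumptions
about what a typical TeX Live, ghostscript, and poppler installation provides,
and the helper `find_missing_tools` is hypothetical.

```python
import shutil

# Assumed executable names; adjust them to match your distribution.
REQUIRED_TOOLS = {
    "TeX Live": "pdflatex",
    "ghostscript": "gs",
    "poppler": "pdftoppm",
}


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of dependencies whose executable is not on PATH."""
    return [name for name, exe in tools.items() if shutil.which(exe) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All external dependencies were found.")
```

A check like this only confirms that the executables can be found. It does not
verify that the TeX Live installation is complete.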

## Usage

> [!TIP]
> For interactive use and general [usage
> tips](https://github.com/potamides/DeTikZify/tree/main/detikzify/webui#usage-tips),
> we recommend checking out our [web UI](detikzify/webui), which can be started
> directly from the command line (use `--help` for a list of all options):
> ```sh
> python -m detikzify.webui --light
> ```

If all required dependencies are installed, the full range of DeTi*k*Zify
features, such as compiling, rendering, and saving Ti*k*Z graphics as well as
MCTS-based inference, can be accessed through its programming interface:
```python
from operator import itemgetter

from detikzify.model import load
from detikzify.infer import DetikzifyPipeline

image = "https://w.wiki/A7Cc"
pipeline = DetikzifyPipeline(*load(
    model_name_or_path="nllg/detikzify-v2-8b",
    device_map="auto",
    torch_dtype="bfloat16",
))

# generate a single TikZ program
fig = pipeline.sample(image=image)

# if it compiles, rasterize it and show it
if fig.is_rasterizable:
    fig.rasterize().show()

# run MCTS for 10 minutes and generate multiple TikZ programs
figs = set()
for score, fig in pipeline.simulate(image=image, timeout=600):
    figs.add((score, fig))

# save the best TikZ program
best = sorted(figs, key=itemgetter(0))[-1][1]
best.save("fig.tex")
```
More involved examples, such as those for evaluation and training, can be found
in the [examples](examples) folder.
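
The selection step at the end of the snippet above stores every candidate and
sorts them afterwards. For long searches that yield many programs, one possible
alternative is to track only the best candidate seen so far. The sketch below
illustrates the idea with a hypothetical stand-in generator, `fake_simulate`,
which yields `(score, candidate)` pairs in the same shape as the
`pipeline.simulate` loop. It does not call DeTi*k*Zify, and the scores are
made up for illustration.

```python
def fake_simulate():
    """Hypothetical stand-in for an iterator of (score, candidate) pairs."""
    yield 0.42, "candidate A"
    yield 0.87, "candidate B"
    yield 0.61, "candidate C"


def best_candidate(results):
    """Return the (score, candidate) pair with the highest score, or None."""
    best = None
    for score, candidate in results:
        # Compare scores only, so candidates never need to be orderable.
        if best is None or score > best[0]:
            best = (score, candidate)
    return best


print(best_candidate(fake_simulate()))
```

Because the comparison is strict, ties keep the candidate that was produced
first, and an empty search returns `None` instead of raising an error.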

## Model Weights & Datasets
We upload all our models and datasets to the [Hugging Face
Hub](https://huggingface.co/collections/nllg/detikzify-664460c521aa7c2880095a8b).
Note, however, that for the public release of the DaTi*k*Z v2
dataset, we had to remove a considerable portion of Ti*k*Z drawings originating
from [arXiv](https://arxiv.org), as the [arXiv non-exclusive
license](https://arxiv.org/licenses/nonexclusive-distrib/1.0/license.html) does
not permit redistribution. We do, however, release our [dataset creation
scripts](https://github.com/potamides/DaTikZ) and encourage anyone to recreate
the full version of DaTi*k*Z v2 themselves.

## Citation
If DeTi*k*Zify has been beneficial for your research or applications, we kindly
request you to acknowledge its use by citing it as follows:

```bibtex
@inproceedings{belouadi2024detikzify,
  title={{DeTikZify}: Synthesizing Graphics Programs for Scientific Figures and Sketches with {TikZ}},
  author={Jonas Belouadi and Simone Paolo Ponzetto and Steffen Eger},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=bcVLFQCOjc}
}
```

## Acknowledgments
The implementation of the DeTi*k*Zify model architecture is based on
[LLaVA](https://github.com/haotian-liu/LLaVA) and
[AutomaTikZ](https://github.com/potamides/AutomaTikZ) (v1), and [Idefics
3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3) (v2). Our MCTS
implementation is based on
[VerMCTS](https://github.com/namin/llm-verified-with-monte-carlo-tree-search).