An open API service indexing awesome lists of open source software.

https://github.com/cyberagentailab/opencole

OpenCOLE: Towards Reproducible Automatic Graphic Design Generation [Inoue+, CVPRW2024 (GDUG)]
https://github.com/cyberagentailab/opencole

Last synced: 9 months ago
JSON representation

OpenCOLE: Towards Reproducible Automatic Graphic Design Generation [Inoue+, CVPRW2024 (GDUG)]

Awesome Lists containing this project

README

          


OpenCOLE: Towards Reproducible Automatic Graphic Design Generation


Naoto Inoue*
Kento Masui*
Wataru Shimoda*
Kota Yamaguchi  (*: equal contribution)




CyberAgent


Workshop on Graphic Design Understanding and Generation (at CVPR2024)

![alt text](figs/main_results.png)

# Overview

🤔 Automatic generation of graphic designs has recently received considerable attention.

😦 However, the state-of-the-art approaches are **complex** and rely on **proprietary** datasets, which creates reproducibility barriers.

🔥 In this paper, we propose an open framework for automatic graphic design called OpenCOLE, where we build a modified version of the pioneering [COLE [Jia+, arXiv'23]](https://graphic-design-generation.github.io/) and **train our model exclusively on publicly available datasets**.

🚀 Based on GPT4V evaluations, our model shows promising performance comparable to the original COLE. We release the pipeline and training results to encourage **open development**.

# Setup

## Requirements

- [uv](https://astral.sh/blog/uv)
- [direnv](https://github.com/direnv/direnv)

## Install

```
poetry install
```

## Dataset
OpenCOLE dataset (v1) is available at [`cyberagent/opencole`](https://huggingface.co/datasets/cyberagent/opencole) in HuggingFace dataset hub.

## Pre-trained models
- text_to_image: [`cyberagent/opencole-stable-diffusion-xl-base-1.0-finetune`](https://huggingface.co/cyberagent/opencole-stable-diffusion-xl-base-1.0-finetune)
- typography_lmm: [`cyberagent/opencole-typographylmm-llava-v1.5-7b-lora`](https://huggingface.co/cyberagent/opencole-typographylmm-llava-v1.5-7b-lora)

## Environment variables

Some part requires additional environment variables. We recommend to use [direnv](https://direnv.net/).
Please copy the template in [.envrc.example](.envrc.example) and modify it on your own.

```bash
cp .envrc.example .envrc
```

# Inference

Please refer to [inference.md](./docs/inference.md).

# Evaluation

We provide a script for GPT4V-based evaluation on generated images.

```python
uv run python -m opencole.evaluation.eval_gpt4v --input_dir --output_path
```

# Training

Please refer to [training.md](./docs/training.md).

# Citation

If you find this code useful for your research, please cite our paper:

```
@inproceedings{inoue2024opencole,
title={{OpenCOLE: Towards Reproducible Automatic Graphic Design Generation}},
author={Naoto Inoue and Kento Masui and Wataru Shimoda and Kota Yamaguchi},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
year={2024},
}
```

# Acknowledgement
This repository has been migrated from the internal repo. Despite the fact that commit logs are not visible, all the contributors have made significant contributions to the repository.

- [@proboscis](https://github.com/proboscis): OpenCOLE dataset construction
- [@shimoda-uec](https://github.com/shimoda-uec): TypographyLMM
- [@kyamagu](https://github.com/kyamagu): renderer
- [@naoto0804](https://github.com/naoto0804): other (bunch of) stuffs