# 🎨 PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework

[![arXiv](https://img.shields.io/badge/arXiv-2506.10741-red)](https://arxiv.org/abs/2506.10741)
[![GitHub](https://img.shields.io/badge/GitHub-Repository-blue)](https://github.com/ephemeral182/PosterCraft)
[![HuggingFace](https://img.shields.io/badge/🤗-HuggingFace-yellow)](https://huggingface.co/PosterCraft)
[![Website](https://img.shields.io/badge/🌐-Website-green)](https://ephemeral182.github.io/PosterCraft/)
[![Video](https://img.shields.io/badge/🎥-Live_Demo-purple)](https://www.youtube.com/watch?v=92wMU4D7qx0)
[![HF Demo](https://img.shields.io/badge/🤗-HF_Demo-orange)](https://huggingface.co/spaces/Ephemeral182/PosterCraft)

### [**🌐 Website**](https://ephemeral182.github.io/PosterCraft/) | [**🎯 Demo**](https://github.com/Ephemeral182/PosterCraft) | [**📄 Paper**](https://arxiv.org/abs/2506.10741) | [**🤗 Models**](https://huggingface.co/PosterCraft) | [**📚 Datasets**](https://huggingface.co/PosterCraft) | [**🎥 Video**](https://www.youtube.com/watch?v=92wMU4D7qx0) | [**🤗 HF Demo**](https://huggingface.co/spaces/Ephemeral182/PosterCraft)

---

## News & Updates

- 🖥️ **[2025.06]** We have pushed our work to [MeiGen-AI](https://github.com/MeiGen-AI), where you can explore our project alongside the work of our colleagues. Feel free to check it out for more insights and contributions.
- 🧩 **[2025.06]** Community user [@AIFSH](https://github.com/AIFSH) has integrated **PosterCraft into ComfyUI**!
  The full workflow is available here: [PosterCraft-ComfyUI Example](https://www.xiangongyun.com/image/detail/68b711eb-a31e-47db-82eb-47438359f4bf?r=XLVYLW)
  Big thanks to the contributor; this will be helpful for many users! See [Issue #6](https://github.com/Ephemeral182/PosterCraft/issues/6) for details.
- 📖 **[2025.06]** Our **Chinese article**, a detailed introduction and technical walkthrough of PosterCraft, is now available!
  Read it here: [Chinese walkthrough | PosterCraft, a high-quality aesthetic poster generation framework](https://mp.weixin.qq.com/s/gq6DwohKP0z333OSDRe7Xw)
- 🔥 **[2025.06]** We have deployed a demo on Hugging Face Spaces; feel free to give it a try!
- 🚀 **[2025.06]** Our Gradio demo and inference code are now available!
- 📊 **[2025.06]** We have released partial datasets and model weights on Hugging Face.

---

## 👥 Authors

> [**Sixiang Chen**](https://ephemeral182.github.io/)<sup>1,2\*</sup>, [**Jianyu Lai**](https://openreview.net/profile?id=~Jianyu_Lai1)<sup>1\*</sup>, [**Jialin Gao**](https://scholar.google.com/citations?user=sj4FqEgAAAAJ&hl=zh-CN)<sup>2\*</sup>, [**Tian Ye**](https://owen718.github.io/)<sup>1</sup>, [**Haoyu Chen**](https://haoyuchen.com/)<sup>1</sup>, [**Hengyu Shi**](https://openreview.net/profile?id=%7EHengyu_Shi1)<sup>2</sup>, [**Shitong Shao**](https://shaoshitong.github.io/)<sup>1</sup>, [**Yunlong Lin**](https://scholar.google.com.hk/citations?user=5F3tICwAAAAJ&hl=zh-CN)<sup>3</sup>, [**Song Fei**](https://openreview.net/profile?id=~Song_Fei1)<sup>1</sup>, [**Zhaohu Xing**](https://ge-xing.github.io/)<sup>1</sup>, [**Yeying Jin**](https://jinyeying.github.io/)<sup>4</sup>, **Junfeng Luo**<sup>2</sup>, [**Xiaoming Wei**](https://scholar.google.com/citations?user=JXV5yrZxj5MC&hl=zh-CN)<sup>2</sup>, [**Lei Zhu**](https://sites.google.com/site/indexlzhu/home)<sup>1,5†</sup>
>
> <sup>1</sup>The Hong Kong University of Science and Technology (Guangzhou)
> <sup>2</sup>Meituan
> <sup>3</sup>Xiamen University
> <sup>4</sup>National University of Singapore
> <sup>5</sup>The Hong Kong University of Science and Technology
>
> <sup>\*</sup>Equal Contribution, <sup>†</sup>Corresponding Author

---

## 🌟 What is PosterCraft?



PosterCraft is a unified framework for **high-quality aesthetic poster generation** that excels in **precise text rendering**, **seamless integration of abstract art**, **striking layouts**, and **stylistic harmony**.

## 🚀 Quick Start

### 🔧 Installation

```bash
# Clone the repository
git clone https://github.com/ephemeral182/PosterCraft.git
cd PosterCraft

# Create conda environment
conda create -n postercraft python=3.11
conda activate postercraft

# Install dependencies
pip install -r requirements.txt

```

### 🚀 Quick Generation

Generate high-quality aesthetic posters from your prompt with `BF16` precision:

```bash
python inference.py \
--prompt "Urban Canvas Street Art Expo poster with bold graffiti-style lettering and dynamic colorful splashes" \
--enable_recap \
--num_inference_steps 28 \
--guidance_scale 3.5 \
--seed 42 \
--pipeline_path "black-forest-labs/FLUX.1-dev" \
--custom_transformer_path "PosterCraft/PosterCraft-v1_RL" \
--qwen_model_path "Qwen/Qwen3-8B"
```

If you are running on a GPU with limited memory, you can use `inference_offload.py` to offload some components to the CPU:

```bash
python inference_offload.py \
--prompt "Urban Canvas Street Art Expo poster with bold graffiti-style lettering and dynamic colorful splashes" \
--enable_recap \
--num_inference_steps 28 \
--guidance_scale 3.5 \
--seed 42 \
--pipeline_path "black-forest-labs/FLUX.1-dev" \
--custom_transformer_path "PosterCraft/PosterCraft-v1_RL" \
--qwen_model_path "Qwen/Qwen3-8B"
```
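To run several prompts with the same settings, the two commands above can be assembled programmatically. The following is a minimal sketch, assuming `inference.py` and `inference_offload.py` accept exactly the flags shown above. The helper `build_inference_command`, the `DEFAULT_FLAGS` dictionary, and the second example prompt are hypothetical and not part of the repository.

```python
import shlex

# Flag values copied from the README commands; adjust them as needed.
DEFAULT_FLAGS = {
    "--num_inference_steps": "28",
    "--guidance_scale": "3.5",
    "--pipeline_path": "black-forest-labs/FLUX.1-dev",
    "--custom_transformer_path": "PosterCraft/PosterCraft-v1_RL",
    "--qwen_model_path": "Qwen/Qwen3-8B",
}


def build_inference_command(prompt, seed=42, offload=False):
    """Return the argv list for one run (hypothetical helper, not in the repo)."""
    script = "inference_offload.py" if offload else "inference.py"
    cmd = ["python", script, "--prompt", prompt, "--enable_recap", "--seed", str(seed)]
    for flag, value in DEFAULT_FLAGS.items():
        cmd += [flag, value]
    return cmd


prompts = [
    "Urban Canvas Street Art Expo poster with bold graffiti-style lettering and dynamic colorful splashes",
    "Minimalist jazz night poster with a single saxophone silhouette",  # illustrative prompt
]
for i, prompt in enumerate(prompts):
    # Print a copy-pasteable shell line, using a different seed per prompt.
    print(shlex.join(build_inference_command(prompt, seed=42 + i)))
```

To execute a run instead of printing it, pass the returned list to `subprocess.run(cmd, check=True)` from the repository root.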

### 💻 Gradio Web UI

We provide a Gradio web UI for PosterCraft.

```bash
python demo_gradio.py
```

## 📊 Performance Benchmarks

### 📈 Quantitative Results


| Method | Text Recall ↑ | Text F-score ↑ | Text Accuracy ↑ |
|---|---|---|---|
| OpenCOLE (Open) | 0.082 | 0.076 | 0.061 |
| Playground-v2.5 (Open) | 0.157 | 0.146 | 0.132 |
| SD3.5 (Open) | 0.565 | 0.542 | 0.497 |
| Flux1.dev (Open) | 0.723 | 0.707 | 0.667 |
| Ideogram-v2 (Close) | 0.711 | 0.685 | 0.680 |
| BAGEL (Open) | 0.543 | 0.536 | 0.463 |
| Gemini2.0-Flash-Gen (Close) | 0.798 | 0.786 | 0.746 |
| PosterCraft (ours) | 0.787 | 0.774 | 0.735 |
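
This README does not define how the text metrics are computed. As an illustration only, the sketch below shows one common word-level reading of recall and F-score, assuming an OCR step has already extracted the words found in a generated poster. The function `text_scores` and the sample words are hypothetical, and the benchmark's actual protocol (including how Text Accuracy is scored) may differ.

```python
from collections import Counter


def text_scores(target_words, detected_words):
    """Word-level recall, precision, and F-score (case-insensitive, multiset match)."""
    target = Counter(w.lower() for w in target_words)
    detected = Counter(w.lower() for w in detected_words)
    # Multiset intersection: each target word can be matched at most once per occurrence.
    matched = sum((target & detected).values())
    recall = matched / sum(target.values()) if target else 0.0
    precision = matched / sum(detected.values()) if detected else 0.0
    if precision + recall == 0:
        return recall, precision, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return recall, precision, f_score


# Words requested in the prompt versus words read back from the image by OCR.
requested = ["Urban", "Canvas", "Street", "Art", "Expo"]
read_back = ["urban", "canvas", "stret", "art"]  # one misspelling, one missing word
recall, precision, f_score = text_scores(requested, read_back)
print(f"recall={recall:.3f} precision={precision:.3f} f={f_score:.3f}")
```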

User Study Results

---

## 🎭 Gallery & Examples

### 🎨 PosterCraft Gallery


Adventure Travel

Post-Apocalyptic

Sci-Fi Drama


Space Thriller

Cultural Event

Luxury Product


Concert Show

Children's Book

Movie Poster

---

## 🏗️ Model Architecture


PosterCraft Framework Overview


A unified framework for high-quality aesthetic poster generation

Our unified framework consists of **four critical optimization stages in the training workflow**:

### 🔤 Stage 1: Text Rendering Optimization
Targets accurate text generation by rendering diverse text precisely on high-quality backgrounds. It also keeps the backgrounds faithful, establishing the baseline fidelity and robustness that the later poster-generation stages build on.

### 🎨 Stage 2: High-quality Poster Fine-tuning
Shifts focus to overall poster style and text-background harmony using Region-aware Calibration. This fine-tuning stage preserves text accuracy while strengthening the artistic integrity of the aesthetic poster.

### 🎯 Stage 3: Aesthetic-Text RL
Employs Aesthetic-Text Preference Optimization to capture higher-order aesthetic trade-offs. This reinforcement learning stage prioritizes outputs that satisfy holistic aesthetic criteria and mitigates defects in font rendering.
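
The README does not spell out the training objective for this stage. As a point of reference, the sketch below shows a generic DPO-style pairwise preference loss, which is an assumption and not necessarily the authors' formulation. All names (`pairwise_preference_loss`, `beta`, the log-probability arguments) are illustrative. For diffusion models these log-likelihood terms are typically replaced by a denoising-error surrogate, which is omitted here.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_preference_loss(logp_chosen, logp_rejected,
                             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Generic DPO-style loss for one (chosen, rejected) pair.

    The loss falls as the trained model raises the chosen sample's likelihood,
    relative to a frozen reference model, by more than the rejected sample's.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin)) equals softplus(-beta * margin).
    return softplus(-beta * margin)


# With no preference margin the loss is log(2); a positive margin lowers it.
print(pairwise_preference_loss(0.0, 0.0, 0.0, 0.0))
print(pairwise_preference_loss(-1.0, -3.0, -2.0, -2.0))
```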

### 🔄 Stage 4: Vision-Language Feedback
Introduces a Joint Vision-Language Conditioning mechanism. This iterative feedback combines visual information with targeted text suggestions for multi-modal corrections, progressively refining aesthetic content and background harmony.
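
The iterative feedback described above can be pictured as a simple refine loop. This is a schematic sketch under stated assumptions, not the repository's implementation: `generate` and `critique` are hypothetical callables standing in for the poster generator and the vision-language model, and `max_rounds` is an illustrative parameter.

```python
def refine_with_feedback(prompt, generate, critique, max_rounds=3):
    """Regenerate until the critic is satisfied or max_rounds is reached.

    generate(prompt, previous_image, suggestion) -> image
    critique(prompt, image) -> suggestion string, or None when no fix is needed
    Both callables are hypothetical stand-ins.
    """
    image = generate(prompt, None, None)  # first draft, no feedback yet
    history = []
    for _ in range(max_rounds):
        suggestion = critique(prompt, image)
        if suggestion is None:
            break  # the critic has nothing left to correct
        history.append(suggestion)
        # Condition the next draft on both the previous image and the text suggestion.
        image = generate(prompt, image, suggestion)
    return image, history


# Toy stand-ins: an "image" is just a draft counter, and the critic accepts draft 2.
def toy_generate(prompt, previous_image, suggestion):
    return 0 if previous_image is None else previous_image + 1


def toy_critique(prompt, image):
    return None if image >= 2 else f"improve draft {image}"


print(refine_with_feedback("jazz night poster", toy_generate, toy_critique))
```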

---

## 💾 Model Zoo

We provide the weights for our core models, fine-tuned at different stages of the PosterCraft pipeline.

| Model | Stage | Description | Download |
|---|---|---|---|
| 🎯 PosterCraft-v1_RL | Stage 3: Aesthetic-Text RL | Optimized via Aesthetic-Text Preference Optimization for higher-order aesthetic trade-offs. | 🤗 HF |
| 🔄 PosterCraft-v1_Reflect | Stage 4: Vision-Language Feedback | Iteratively refined using vision-language feedback for further harmony and content accuracy. | 🤗 HF |

---

## 📚 Datasets

We provide **four specialized datasets** for training the PosterCraft workflow:

### 🔤 Text-Render-2M


Text-Render-2M Dataset


Text-Render-2M: Multi-instance text rendering with diverse selections

A comprehensive text rendering dataset containing **2 million high-quality examples**. Features multi-instance text rendering, diverse text selections (varying in size, count, placement, and rotation), and dynamic content generation through both template-based and random string approaches.
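
To make the variation axes concrete, the sketch below samples text instances that differ in size, count, placement, and rotation, drawing content from either a template or a random string. It is an illustration only: the templates, value ranges, field names, and the functions `sample_text` and `sample_instances` are all hypothetical and do not describe how the dataset was actually built.

```python
import random
import string

# Hypothetical templates and fillers, for illustration only.
TEMPLATES = ["Grand Opening {year}", "{pct}% Off Today", "Live in {city}"]
CITIES = ["Guangzhou", "Singapore", "Xiamen"]


def sample_text(rng):
    """Return template-based text half of the time, a random string otherwise."""
    if rng.random() < 0.5:
        template = rng.choice(TEMPLATES)
        # str.format ignores keyword arguments that the template does not use.
        return template.format(year=rng.randint(2000, 2030),
                               pct=rng.randrange(10, 90, 10),
                               city=rng.choice(CITIES))
    length = rng.randint(3, 10)
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def sample_instances(rng, canvas=(1024, 1024), max_count=4):
    """Sample between 1 and max_count text instances for one background image."""
    width, height = canvas
    return [
        {
            "text": sample_text(rng),
            "font_size": rng.randint(24, 128),
            "x": rng.randint(0, width - 1),
            "y": rng.randint(0, height - 1),
            "rotation_deg": rng.uniform(-30.0, 30.0),
        }
        for _ in range(rng.randint(1, max_count))
    ]


rng = random.Random(0)  # seeded for reproducibility
for instance in sample_instances(rng):
    print(instance)
```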

### 🎨 HQ-Poster-100K


HQ-Poster-100K Dataset


HQ-Poster-100K: Curated high-quality aesthetic posters

**100,000** meticulously curated high-quality posters with advanced filtering techniques and multi-modal scoring. Features Gemini-powered mask generation with detailed captions for comprehensive poster understanding.

### 👍 Poster-Preference-100K


Poster-Preference-100K Dataset


Poster-Preference-100K: Preference learning pairs for aesthetic optimization

This preference dataset is sourced from over **100,000** generated poster images. Through comprehensive evaluation by Gemini and aesthetic evaluators, we construct high-quality preference pairs designed for reinforcement learning to align poster generation with human aesthetic judgments.
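
One simple way to turn scored candidates into preference pairs is sketched below. It is a hypothetical illustration, not the authors' pipeline: it assumes each prompt has several generated images with a single combined score, and keeps a (chosen, rejected) pair only when the best and worst scores differ by at least `min_gap`. The function name, field names, and threshold are assumptions.

```python
def build_preference_pairs(candidates_by_prompt, min_gap=0.5):
    """Pick the best and worst candidate per prompt when their scores are far enough apart."""
    pairs = []
    for prompt, candidates in candidates_by_prompt.items():
        if len(candidates) < 2:
            continue  # a pair needs at least two images
        ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
        best, worst = ranked[0], ranked[-1]
        # Skip near-ties, which carry little preference signal.
        if best["score"] - worst["score"] >= min_gap:
            pairs.append({"prompt": prompt,
                          "chosen": best["image"],
                          "rejected": worst["image"]})
    return pairs


# Illustrative scores; in practice they would come from the evaluators.
scored = {
    "jazz night": [{"image": "a.png", "score": 8.5},
                   {"image": "b.png", "score": 6.0},
                   {"image": "c.png", "score": 7.0}],
    "street art expo": [{"image": "d.png", "score": 7.0},
                        {"image": "e.png", "score": 6.9}],
}
print(build_preference_pairs(scored))
```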

### 🔄 Poster-Reflect-120K


Poster-Reflect-120K Dataset


Poster-Reflect-120K: Vision-language feedback pairs for iterative refinement

This vision-language feedback dataset is sourced from over **120,000** generated poster images. Through comprehensive evaluation by Gemini and aesthetic evaluators, this dataset captures the iterative refinement process and provides detailed feedback for further improvements.

| Dataset | Size | Description | Download |
|---|---|---|---|
| 🔤 Text-Render-2M | 2M samples | High-quality text rendering examples with multi-instance support | 🤗 HF |
| 🎨 HQ-Poster-100K | 100K samples | Curated high-quality posters with aesthetic evaluation | 🤗 HF |
| 👍 Poster-Preference-100K | 100K images | Preference learning poster pairs for RL training | 🤗 HF |
| 🔄 Poster-Reflect-120K | 120K images | Vision-language feedback pairs for iterative refinement | 🤗 HF |

---

## 📝 Citation

If you find PosterCraft useful for your research, please cite our paper:

```bibtex
@article{chen2025postercraft,
title={PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework},
author={Chen, Sixiang and Lai, Jianyu and Gao, Jialin and Ye, Tian and Chen, Haoyu and Shi, Hengyu and Shao, Shitong and Lin, Yunlong and Fei, Song and Xing, Zhaohu and Jin, Yeying and Luo, Junfeng and Wei, Xiaoming and Zhu, Lei},
journal={arXiv preprint arXiv:2506.10741},
year={2025}
}
```

---

## 🙏 Acknowledgments

- 🏛️ Thanks to our affiliated institutions for their support.
- 🤝 Special thanks to the open-source community for inspiration.

---

## 📬 Contact

For any questions or inquiries, please reach out to us:

- **Sixiang Chen**: `schen691@connect.hkust-gz.edu.cn`
- **Jianyu Lai**: `jlai218@connect.hkust-gz.edu.cn`