https://github.com/xichenpan/ARLDM

Official PyTorch implementation of Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

Host: GitHub
URL: https://github.com/xichenpan/ARLDM
Owner: xichenpan
License: mit
Created: 2022-11-20T02:57:14.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2023-07-09T20:46:17.000Z (over 2 years ago)
Last Synced: 2024-10-30T23:35:36.585Z (about 1 year ago)
Language: Python
Homepage: https://arxiv.org/abs/2211.10950
Size: 2.6 MB
Stars: 191
Watchers: 12
Forks: 29
Open Issues: 13
Metadata Files:
- Readme: README.md

# Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/synthesizing-coherent-story-with-auto/story-visualization-on-pororo)](https://paperswithcode.com/sota/story-visualization-on-pororo?p=synthesizing-coherent-story-with-auto) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/synthesizing-coherent-story-with-auto/story-continuation-on-pororosv)](https://paperswithcode.com/sota/story-continuation-on-pororosv?p=synthesizing-coherent-story-with-auto) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/synthesizing-coherent-story-with-auto/story-continuation-on-flintstonessv)](https://paperswithcode.com/sota/story-continuation-on-flintstonessv?p=synthesizing-coherent-story-with-auto) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/synthesizing-coherent-story-with-auto/story-continuation-on-vist)](https://paperswithcode.com/sota/story-continuation-on-vist?p=synthesizing-coherent-story-with-auto)

![teaser](assets/teaser.png)

This version was migrated from an internal implementation at Alibaba Group. Feel free to open an issue if you run into any problems!

## Environment
```shell
conda create -n arldm python=3.8
conda activate arldm
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch-lts
git clone https://github.com/xichenpan/ARLDM.git
cd ARLDM
pip install -r requirements.txt
```
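
After installation, a quick check that the packages from the `conda install` line are visible to the active interpreter can save a failed run later. The following is a minimal sketch, not part of the repository: `missing_packages` is a hypothetical helper, and it only tests whether each package can be located, not whether CUDA works.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Import names of the packages installed by the conda command above.
REQUIRED = ["torch", "torchvision", "torchaudio"]

print("Python", sys.version.split()[0])
missing = missing_packages(REQUIRED)
print("missing packages:", ", ".join(missing) if missing else "none")
```

Run it inside the activated `arldm` environment. Any name it lists as missing points to an installation step that did not complete.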
## Data Preparation
* Download the PororoSV dataset [here](https://drive.google.com/file/d/11Io1_BufAayJ1BpdxxV2uJUvCcirbrNc/view?usp=sharing).
* Download the FlintstonesSV dataset [here](https://drive.google.com/file/d/1kG4esNwabJQPWqadSDaugrlF4dRaV33_/view?usp=sharing).
* Download the VIST-SIS URL links [here](https://visionandlanguage.net/VIST/json_files/story-in-sequence/SIS-with-labels.tar.gz).
* Download the VIST-DII URL links [here](https://visionandlanguage.net/VIST/json_files/description-in-isolation/DII-with-labels.tar.gz).
* Download the VIST images by running:
```shell
python data_script/vist_img_download.py \
    --json_dir /path/to/dii_json_files \
    --img_dir /path/to/save_images \
    --num_process 32
```
* To accelerate I/O, use the following scripts to convert the downloaded data to HDF5:
```shell
python data_script/pororo_hdf5.py \
    --data_dir /path/to/pororo_data \
    --save_path /path/to/save_hdf5_file

python data_script/flintstones_hdf5.py \
    --data_dir /path/to/flintstones_data \
    --save_path /path/to/save_hdf5_file

python data_script/vist_hdf5.py \
    --sis_json_dir /path/to/sis_json_files \
    --dii_json_dir /path/to/dii_json_files \
    --img_dir /path/to/vist_images \
    --save_path /path/to/save_hdf5_file
```
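
Before starting a conversion, it may help to confirm that each input directory exists and is not empty, since a mistyped path or an incomplete download is easier to catch here than partway through a script. The sketch below is an illustration only: `preflight` is a hypothetical helper that is not part of the repository, and the paths are the placeholders from the commands above.

```python
from pathlib import Path


def preflight(inputs):
    """Map each label to its file count, or None if the directory is missing."""
    report = {}
    for label, directory in inputs.items():
        path = Path(directory)
        if path.is_dir():
            # Count regular files at any depth below the directory.
            report[label] = sum(1 for p in path.rglob("*") if p.is_file())
        else:
            report[label] = None
    return report


# Placeholder paths copied from the VIST command above; replace with your own.
VIST_INPUTS = {
    "sis_json_dir": "/path/to/sis_json_files",
    "dii_json_dir": "/path/to/dii_json_files",
    "img_dir": "/path/to/vist_images",
}

for label, count in preflight(VIST_INPUTS).items():
    status = "missing" if count is None else f"{count} files"
    print(f"{label}: {status}")
```

The same helper can be pointed at the PororoSV or FlintstonesSV `--data_dir` by passing a different dictionary.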

## Training
Specify your directory and device configuration in `config.yaml` and run
```shell
python main.py
```
## Sample
Specify your directory and device configuration in `config.yaml` and run
```shell
python main.py
```

## Acknowledgment
Many thanks to [@adymaharana](https://github.com/adymaharana) for kindly sharing the FlintstonesSV and PororoSV datasets (and the code), as well as the PororoSV pretrained checkpoint and the Flintstones sampled results of [StoryDALL·E](https://github.com/adymaharana/storydalle).

## Citation
If you find this code useful for your research, please cite our paper:
```bibtex
@article{pan2022synthesizing,
title={Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models},
author={Pan, Xichen and Qin, Pengda and Li, Yuhong and Xue, Hui and Chen, Wenhu},
journal={arXiv preprint arXiv:2211.10950},
year={2022}
}
```
