Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Official Code for ECCV 2024 paper "One-Shot Diffusion Mimicker for Handwritten Text Generation"
- Host: GitHub
- URL: https://github.com/dailenson/One-DM
- Owner: dailenson
- License: mit
- Created: 2024-07-15T02:06:50.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-10-24T07:10:07.000Z (4 months ago)
- Last Synced: 2024-10-24T18:51:50.759Z (4 months ago)
- Topics: computer-vision, deep-learning, diffusion-models, handwriting-imitator, handwritten-text-generation, image-generation, latent-diffusion, pytorch-implementation
- Language: Python
- Homepage:
- Size: 4.83 MB
- Stars: 266
- Watchers: 8
- Forks: 23
- Open Issues: 15
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-diffusion-categorized
README
One-DM: One-Shot Diffusion Mimicker for Handwritten Text Generation
## Introduction
- We propose a One-shot Diffusion Mimicker (One-DM) for stylized handwritten text generation, which requires only a single reference sample as style input and imitates its writing style to generate handwritten text with arbitrary content.
- Previous state-of-the-art methods struggle to accurately extract a user's handwriting style from a single sample because of their limited ability to learn styles. To address this, we introduce the high-frequency components of the reference sample to enhance the extraction of handwriting style (a rough illustrative sketch follows this list). The proposed style-enhanced module effectively captures writing-style patterns and suppresses interference from background noise.
- Extensive experiments on handwriting datasets in English, Chinese, and Japanese demonstrate that our approach with a single style reference even outperforms previous methods that use 15× more references.

*Figure: overview of the proposed One-DM.*
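To make the high-frequency idea concrete, here is a minimal sketch of one common way to isolate high-frequency stroke detail from a reference image (a Gaussian high-pass filter). The filter choice and file name are illustrative assumptions only; the actual style-enhanced module in One-DM is learned.

```python
# Illustrative only: isolate high-frequency stroke detail from a style reference
# by subtracting a Gaussian-blurred (low-frequency) copy of the image.
# The One-DM style-enhanced module is learned; this is just a rough analogue.
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

def high_frequency_component(path: str, sigma: float = 3.0) -> np.ndarray:
    """Return a high-pass view of a grayscale reference image (values in [-1, 1])."""
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float32) / 255.0
    low = gaussian_filter(img, sigma=sigma)  # smooth background and slow intensity changes
    return img - low                         # strokes and edges survive the subtraction

# e.g. high = high_frequency_component("reference_sample.png")  # hypothetical file name
```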
## Release
- [2024/10/24] We have provided a well-trained One-DM checkpoint on Google Drive and Baidu Drive :)
- [2024/09/16] This work is reported by [Synced](https://mp.weixin.qq.com/s/1JdBsjf0hru7iSS7jln02Q) (机器之心).
- [2024/09/07] We open-source the first version of One-DM, which can generate handwritten English words. (Later versions supporting Chinese and Japanese will be released soon.)

## Requirements
```
conda create -n One-DM python=3.8 -y
conda activate One-DM
# install all dependencies
conda env create -f environment.yml
```
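After installation, a quick optional check (not part of the repo) that the environment sees PyTorch and your GPUs:

```python
# Optional sanity check after installation (not part of the repo).
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
```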
## Datasets
We provide English datasets in [Google Drive](https://drive.google.com/drive/folders/108TB-z2ytAZSIEzND94dyufybjpqVyn6) | [Baidu Netdisk](https://pan.baidu.com/s/14ESFRk0RaTr98eeLzcr_xw?pwd=4vsv) | [ShiZhi AI](https://wisemodel.cn/models/SCUT-MMPR/One-DM/blob/main/English_data.zip). Please download these datasets, unzip them, and move the extracted files to /data.
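If you prefer scripting that last step, a minimal sketch that unpacks the English archive into the repository's data folder is shown below. The archive name follows the ShiZhi AI link above and may differ for the other mirrors.

```python
# Minimal sketch: extract the downloaded English dataset into ./data.
# "English_data.zip" matches the ShiZhi AI file name; adjust it for other mirrors.
import zipfile
from pathlib import Path

archive = Path("English_data.zip")
target = Path("data")
target.mkdir(exist_ok=True)

with zipfile.ZipFile(archive) as zf:
    zf.extractall(target)

print("extracted entries:", sum(1 for _ in target.rglob("*")))
```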
## Model Zoo

| Model | Google Drive | Baidu Netdisk | ShiZhi AI |
|---|---|---|---|
| Pretrained One-DM | [Google Drive](https://drive.google.com/drive/folders/10KOQ05HeN2kaR2_OCZNl9D_Kh1p8BDaa) | [Baidu Netdisk](https://pan.baidu.com/s/1VwckEw9TN734CirfWvZgiw?pwd=pfl8) | [ShiZhi AI](https://wisemodel.cn/models/SCUT-MMPR/One-DM/blob/main/One-DM-ckpt.pt) |
| Pretrained OCR model | [Google Drive](https://drive.google.com/drive/folders/10KOQ05HeN2kaR2_OCZNl9D_Kh1p8BDaa) | [Baidu Netdisk](https://pan.baidu.com/s/1VwckEw9TN734CirfWvZgiw?pwd=pfl8) | [ShiZhi AI](https://wisemodel.cn/models/SCUT-MMPR/One-DM/blob/main/vae_HTR138.pth) |
| Pretrained ResNet18 | [Google Drive](https://drive.google.com/drive/folders/10KOQ05HeN2kaR2_OCZNl9D_Kh1p8BDaa) | [Baidu Netdisk](https://pan.baidu.com/s/1VwckEw9TN734CirfWvZgiw?pwd=pfl8) | [ShiZhi AI](https://wisemodel.cn/models/SCUT-MMPR/One-DM/blob/main/RN18_class_10400.pth) |

**Note**:
Please download these weights and move them to /model_zoo. (If you cannot access the pre-trained VAE model available on Hugging Face, please refer to the pinned issue for guidance.)
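A small, optional check (not part of the repo) that the weights landed in /model_zoo under the file names used by the commands below:

```python
# Optional check that the downloaded weights are where the commands below expect them.
# File names are taken from the Model Zoo links and the training commands.
from pathlib import Path
import torch

expected = {
    "model_zoo/One-DM-ckpt.pt": "pretrained One-DM",
    "model_zoo/vae_HTR138.pth": "pretrained OCR model",
    "model_zoo/RN18_class_10400.pth": "pretrained ResNet18",
}

for path, desc in expected.items():
    p = Path(path)
    if not p.exists():
        print(f"missing {desc}: {p}")
        continue
    torch.load(p, map_location="cpu")  # raises if the file is corrupted
    print(f"ok: {p} ({desc})")
```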
## Training & Test
- **training on English dataset**
```Shell
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=2 train.py \
--feat_model model_zoo/RN18_class_10400.pth \
--log English
```
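torchrun starts one process per `--nproc_per_node` and hands each process its rank through environment variables, so `--nproc_per_node` should match the number of GPUs you actually want to use from `CUDA_VISIBLE_DEVICES`. The sketch below shows the standard PyTorch pattern a launched script follows to consume those variables; it is illustrative and not necessarily the exact code in train.py.

```python
# Standard PyTorch DDP setup consumed by torchrun (illustrative, not the repo's code).
import os
import torch
import torch.distributed as dist

def setup_distributed() -> int:
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun for each process
    torch.cuda.set_device(local_rank)           # bind this process to one GPU
    dist.init_process_group(backend="nccl")     # reads RANK / WORLD_SIZE from the env
    return local_rank

# local_rank = setup_distributed()
# model = torch.nn.parallel.DistributedDataParallel(model.to(local_rank),
#                                                   device_ids=[local_rank])
```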
- **finetune on English dataset**
```Shell
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 train_finetune.py \
--one_dm ./Saved/IAM64_scratch/English-timestamp/model/epoch-ckpt.pt \
--ocr_model ./model_zoo/vae_HTR138.pth --log English
```
**Note**:
Please modify ``timestamp`` and ``epoch`` according to your own path.

- **test on English dataset**
```Shell
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 test.py \
--one_dm ./Saved/IAM64_finetune/English-timestamp/model/epoch-ckpt.pt \
--generate_type oov_u --dir ./Generated/English
```
**Note**:
Please modify ``timestamp`` and ``epoch`` according to your own path.
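If you would rather not fill in ``timestamp`` and ``epoch`` by hand, a hypothetical helper (not part of the repo) that picks the newest checkpoint matching the saved-path pattern above could look like this:

```python
# Hypothetical helper: resolve the ``timestamp``/``epoch`` placeholders by picking
# the newest checkpoint under the Saved directory used in the commands above.
from pathlib import Path

def latest_checkpoint(root: str = "Saved/IAM64_finetune") -> Path:
    ckpts = sorted(Path(root).glob("English-*/model/*-ckpt.pt"),
                   key=lambda p: p.stat().st_mtime)
    if not ckpts:
        raise FileNotFoundError(f"no checkpoints found under {root}")
    return ckpts[-1]

# print(latest_checkpoint())  # pass the printed path to --one_dm
```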
## Exhibition
- **Comparisons with industrial image generation methods on handwritten text generation**
- **Comparisons with industrial image generation methods on Chinese handwriting generation**
- **English handwritten text generation**
- **Chinese and Japanese handwriting generation**
## Citation
If you find our work inspiring or use our codebase in your research, please cite our work:
```
@inproceedings{one-dm2024,
title={One-Shot Diffusion Mimicker for Handwritten Text Generation},
author={Dai, Gang and Zhang, Yifan and Ke, Quhui and Guo, Qiangya and Huang, Shuangping},
booktitle={European Conference on Computer Vision},
year={2024}
}
```

## StarGraph
[![Star History Chart](https://api.star-history.com/svg?repos=dailenson/One-DM&type=Timeline)](https://star-history.com/#dailenson/One-DM&Timeline)