<h2 align="center"><a href="">One-DM: One-Shot Diffusion Mimicker for Handwritten Text Generation</a></h2>
<div align="center">
  <a href="https://arxiv.org/abs/2409.04004"><img src="https://img.shields.io/badge/Arxiv-2409.04004-red"></a>
  <a href=""><img src="https://img.shields.io/badge/Pytorch-1.13-green"></a>
  <a href="https://github.com/dailenson/One-DM/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue"></a>
</div>
<br>
<p align="center">
  <img src="assets/js79ccvr33.png" style="width: 200px; height: 200px; margin: 0 auto;">
</p>

## 🌟 Introduction
- We propose a One-shot Diffusion Mimicker (One-DM) for stylized handwritten text generation: it requires only a single reference sample as style input and imitates that sample's writing style to generate handwritten text with arbitrary content.
- Previous state-of-the-art methods struggle to accurately extract a user's handwriting style from a single sample because of their limited ability to learn styles. To address this, we introduce the high-frequency components of the reference sample to enhance the extraction of handwriting style. The proposed style-enhanced module effectively captures writing-style patterns while suppressing interference from background noise.
- Extensive experiments on handwriting datasets in English, Chinese, and Japanese demonstrate that our approach with a single style reference even outperforms previous methods that use 15× more references.
<div style="display: flex; flex-direction: column; align-items: center;">
<img src="assets/overview_v2.png" style="width: 100%;">
</div>
<p align="center" style="margin-bottom: 10px;">
Overview of the proposed One-DM
</p>
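The intuition behind the high-frequency components can be illustrated outside the model: a simple high-pass filter (the image minus a local average) keeps sharp stroke contours while flattening slowly varying background shading. The sketch below is only a conceptual NumPy illustration on a synthetic image, not the paper's style-enhanced module:

```python
import numpy as np

def high_pass(img, k=5):
    """Subtract a k x k box blur from img, keeping the high-frequency
    part (stroke edges) and suppressing smooth background variation.
    img: 2-D float array."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    # box blur via a sliding-window mean
    blur = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            blur += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    blur /= k * k
    return img - blur

# a synthetic "reference sample": smooth background gradient plus a sharp stroke
h, w = 32, 64
img = np.tile(np.linspace(0.2, 0.4, w), (h, 1))  # slowly varying background
img[15:17, 10:50] = 1.0                          # a horizontal "stroke"
hf = high_pass(img)
# the stroke dominates the high-frequency map; the smooth gradient is removed
```

In the filtered result, pixels on the stroke carry large responses while the background gradient is close to zero, which is the sense in which high-frequency input emphasizes writing style over background noise.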
## 🌠 News
- [2025/06/26] 🔥🔥🔥 [DiffBrush](https://github.com/dailenson/DiffBrush), a novel state-of-the-art approach for full-line text generation, is accepted to ICCV 2025.
- [2024/10/24] We provide a well-trained One-DM checkpoint on Google Drive and Baidu Netdisk :)
- [2024/09/16] This work is reported by [Synced](https://mp.weixin.qq.com/s/1JdBsjf0hru7iSS7jln02Q) (机器之心).
- [2024/09/07] 🔥🔥🔥 We open-source the first version of One-DM, which generates handwritten words. (Later versions supporting Chinese and Japanese will be released soon.)

## 🔨 Requirements
```
conda create -n One-DM python=3.8 -y
conda activate One-DM
# install all dependencies into the One-DM environment
conda env update -n One-DM -f environment.yml
```
## ☀️ Datasets
We provide English datasets on [Google Drive](https://drive.google.com/drive/folders/108TB-z2ytAZSIEzND94dyufybjpqVyn6) | [Baidu Netdisk](https://pan.baidu.com/s/14ESFRk0RaTr98eeLzcr_xw?pwd=4vsv) | [ShiZhi AI](https://wisemodel.cn/models/SCUT-MMPR/One-DM/blob/main/English_data.zip). Please download these datasets, unzip them, and move the extracted files to `./data`.
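The download-and-extract step can be scripted with the standard library; a minimal sketch, assuming the archive name `English_data.zip` from the ShiZhi AI link and `data` as the target directory:

```python
import zipfile
from pathlib import Path

def extract_dataset(archive, target="data"):
    """Unzip a downloaded dataset archive into `target` and return the
    sorted top-level entry names, so the result can be sanity-checked
    before training."""
    Path(target).mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        return sorted({name.split("/")[0] for name in zf.namelist()})

# after downloading the archive, e.g.:
# extract_dataset("English_data.zip", "data")
```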
## 🐳 Model Zoo

| Model | Google Drive | Baidu Netdisk | ShiZhi AI |
|-------|--------------|---------------|-----------|
| Pretrained One-DM | [Google Drive](https://drive.google.com/drive/folders/10KOQ05HeN2kaR2_OCZNl9D_Kh1p8BDaa) | [Baidu Netdisk](https://pan.baidu.com/s/1VwckEw9TN734CirfWvZgiw?pwd=pfl8) | [ShiZhi AI](https://wisemodel.cn/models/SCUT-MMPR/One-DM/blob/main/One-DM-ckpt.pt) |
| Pretrained OCR model | [Google Drive](https://drive.google.com/drive/folders/10KOQ05HeN2kaR2_OCZNl9D_Kh1p8BDaa) | [Baidu Netdisk](https://pan.baidu.com/s/1VwckEw9TN734CirfWvZgiw?pwd=pfl8) | [ShiZhi AI](https://wisemodel.cn/models/SCUT-MMPR/One-DM/blob/main/vae_HTR138.pth) |
| Pretrained ResNet18 | [Google Drive](https://drive.google.com/drive/folders/10KOQ05HeN2kaR2_OCZNl9D_Kh1p8BDaa) | [Baidu Netdisk](https://pan.baidu.com/s/1VwckEw9TN734CirfWvZgiw?pwd=pfl8) | [ShiZhi AI](https://wisemodel.cn/models/SCUT-MMPR/One-DM/blob/main/RN18_class_10400.pth) |

**Note**: Please download these weights and move them to `./model_zoo`. (If you cannot access the pre-trained VAE model on Hugging Face, please refer to the pinned issue for guidance.)

## 🏋️ Training & Test
- **Training on the English dataset**
```Shell
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 train.py \
    --feat_model model_zoo/RN18_class_10400.pth \
    --log English
```
- **Fine-tuning on the English dataset**
```Shell
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 train_finetune.py \
    --one_dm ./Saved/IAM64_scratch/English-timestamp/model/epoch-ckpt.pt \
    --ocr_model ./model_zoo/vae_HTR138.pth --log English
```
**Note**: Replace `timestamp` and `epoch` with the values from your own checkpoint path.

- **Testing on the English dataset**
```Shell
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 test.py \
    --one_dm ./Saved/IAM64_finetune/English-timestamp/model/epoch-ckpt.pt \
    --generate_type oov_u --dir ./Generated/English
```
**Note**: Replace `timestamp` and `epoch` with the values from your own checkpoint path.
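Since `timestamp` and `epoch` must be filled in by hand, a small hypothetical helper (the directory layout is assumed from the `Saved/<run>/English-<timestamp>/model/<epoch>-ckpt.pt` paths in the commands above) can locate the newest checkpoint automatically:

```python
from pathlib import Path

def latest_checkpoint(root="./Saved/IAM64_finetune"):
    """Return the most recently modified *-ckpt.pt under root, following
    the English-<timestamp>/model/<epoch>-ckpt.pt layout, or None if no
    checkpoint exists yet."""
    ckpts = list(Path(root).glob("English-*/model/*-ckpt.pt"))
    return max(ckpts, key=lambda p: p.stat().st_mtime, default=None)

# e.g. pass str(latest_checkpoint()) as the --one_dm argument
```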
## 📺 Exhibition
- **Comparisons with industrial image generation methods on handwritten text generation**
<p align="center">
<img src="assets/indus-English_v2.png" style="width: 90%" align=center>
</p>

- **Comparisons with industrial image generation methods on Chinese handwriting generation**
<p align="center">
<img src="assets/indus-Chinese.png" style="width: 90%" align=center>
</p>

- **English handwritten text generation**
<p align="center">
<img src="assets/One-DM_result.png" style="width: 100%" align=center>
</p>

- **Chinese and Japanese handwriting generation**
<p align="center">
<img src="assets/casia_v4.png" style="width: 90%" align=center>
</p>

## ❤️ Citation
If you find our work inspiring or use our codebase in your research, please cite our work:
```
@inproceedings{one-dm2024,
  title={One-Shot Diffusion Mimicker for Handwritten Text Generation},
  author={Dai, Gang and Zhang, Yifan and Ke, Quhui and Guo, Qiangya and Huang, Shuangping},
  booktitle={European Conference on Computer Vision},
  year={2024}
}
```

## ⭐ StarGraph
[![Star History Chart](https://api.star-history.com/svg?repos=dailenson/One-DM&type=Timeline)](https://star-history.com/#dailenson/One-DM&Timeline)