{"id":23086436,"url":"https://github.com/dongzhuoyao/flowseq","last_synced_at":"2025-04-03T16:16:59.577Z","repository":{"id":231683734,"uuid":"658298255","full_name":"dongzhuoyao/flowseq","owner":"dongzhuoyao","description":"An official pytorch implementation of EACL2024 short paper \"Flow Matching for Conditional Text Generation in a Few Sampling Steps\"","archived":false,"fork":false,"pushed_at":"2024-04-05T09:35:06.000Z","size":1104,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-04-05T10:35:51.083Z","etag":null,"topics":["diffusion-model","flow","flow-matching"],"latest_commit_sha":null,"homepage":"https://taohu.me/project_flowseq","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dongzhuoyao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-06-25T10:51:23.000Z","updated_at":"2024-04-05T10:35:52.171Z","dependencies_parsed_at":"2024-04-05T10:47:17.647Z","dependency_job_id":null,"html_url":"https://github.com/dongzhuoyao/flowseq","commit_stats":null,"previous_names":["dongzhuoyao/flowseq"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dongzhuoyao%2Fflowseq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dongzhuoyao%2Fflowseq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dongzhuoyao%2Fflowseq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dongzhuoyao%2Fflowseq/manifests","owner_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dongzhuoyao","download_url":"https://codeload.github.com/dongzhuoyao/flowseq/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247033816,"owners_count":20872532,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["diffusion-model","flow","flow-matching"],"created_at":"2024-12-16T18:53:55.675Z","updated_at":"2025-04-03T16:16:59.551Z","avatar_url":"https://github.com/dongzhuoyao.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"#  Flow Matching for Conditional Text Generation in a Few Sampling Steps ( EACL 2024 )\n\n\n \u003cspan class=\"author-block\"\u003e\n\u003ca href=\"https://taohu.me/\" target=\"_blank\"\u003eVincent Tao Hu,\u003c/a\u003e\u003c/span\u003e\n\u003cspan class=\"author-block\"\u003e\n\u003ca href=\"https://moore3930.github.io/\" target=\"_blank\"\u003eDi Wu,\u003c/a\u003e\u003c/span\u003e\n\u003cspan class=\"author-block\"\u003e\n  \u003ca href=\"https://yukimasano.github.io/\" target=\"_blank\"\u003eYuki M. 
Asano,\u003c/a\u003e\n\u003c/span\u003e\n\u003cspan class=\"author-block\"\u003e\n  \u003ca href=\"https://staff.fnwi.uva.nl/p.s.m.mettes/\" target=\"_blank\"\u003ePascal Mettes,\u003c/a\u003e\n\u003c/span\u003e\n\u003cspan class=\"author-block\"\u003e\n  \u003ca href=\"https://basurafernando.github.io/\" target=\"_blank\"\u003eBasura Fernando,\u003c/a\u003e\n\u003c/span\u003e\n\u003cspan class=\"author-block\"\u003e\n  \u003ca href=\"https://scholar.google.de/citations?user=zWbvIUcAAAAJ\u0026hl=en\" target=\"_blank\"\u003e Bjorn Ommer, \u003c/a\u003e\n\u003c/span\u003e\n\n\u003cspan class=\"author-block\"\u003e\n  \u003ca href=\"https://www.ceessnoek.info/\" target=\"_blank\"\u003eCees G.M. Snoek\u003c/a\u003e\n\u003c/span\u003e\n\n\n\nThis repository represents the official implementation of the EACL2024 paper titled \"Flow Matching for Conditional Text Generation in a Few Sampling Steps\".\n\n[![Hugging Face Model](https://img.shields.io/badge/🤗%20Hugging%20Face-Model-green)](https://huggingface.co/taohu/flowsea)\n[![Website](doc/badges/badge-website.svg)](https://taohu.me/project_flowseq)\n[![Paper](https://img.shields.io/badge/arXiv-PDF-b31b1b)](https://aclanthology.org/2024.eacl-short.33.pdf)\n[![GitHub](https://img.shields.io/github/stars/dongzhuoyao/flowseq?style=social)](https://github.com/dongzhuoyao/flowseq)\n[![License](https://img.shields.io/badge/License-Apache--2.0-929292)](https://www.apache.org/licenses/LICENSE-2.0)\n\n\n![landscape](doc/method_eacl24.png)\n\n\n# Dataset\n\n```\nhttps://drive.google.com/drive/folders/1sU8CcOJE_aaaKLijBNzr-4y1i1YoZet2?usp=drive_link\n```\n\n## Run\n\n```bash\nCUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nnodes=1 --nproc-per-node=4 flow_train.py \nCUDA_VISIBLE_DEVICES=0,2 torchrun --nnodes=1 --nproc-per-node=2 flow_train.py  data=qg\nCUDA_VISIBLE_DEVICES=6 torchrun --nnodes=1 --nproc-per-node=1 flow_train.py  data=qg\n```\n\n\n# Evaluation \n\nDownload the pretrained checkpoint from 
[https://huggingface.co/taohu/flowseq/tree/main](https://huggingface.co/taohu/flowseq/tree/main); more checkpoints are coming soon.\n\n\n\n```bash\npython flow_sample_eval_s2s.py    data=qqp_acc data.eval.is_debug=0 data.eval.model_path='qqp_ema_0.9999_070000.pt' data.eval.candidate_num=1 data.eval.ode_stepnum=1\n```\n\n\n# Environment Preparation\n\n```bash\nconda create -n flowseq  python=3.10\nconda activate flowseq\nconda install -c \"nvidia/label/cuda-11.8.0\" cuda-toolkit\nconda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia\npip install  torchdiffeq  matplotlib h5py  accelerate loguru blobfile ml_collections\npip install hydra-core wandb einops scikit-learn --upgrade\npip install transformers\npip install nltk bert_score datasets torchmetrics\n```\n\nOptional\n```bash\npip install diffusers\n```\n\n\n\n# Common Issue\n\n\n- **The inference result on non-single steps**\n  \n  Our work mainly explores single-step sampling. The anchor loss encourages the model to recover the original data in a single step. Multi-step sampling was implemented in a zigzag manner, following Consistency Models; this codebase does not include that implementation yet. 
\n\n- **Batch size**\n\n  If your GPU memory is limited, try decreasing the batch size to 128~256 and disabling gradient accumulation; in my experience, this can still reach fair performance.\n  \n\n- **Typical Issue**\n  \n```bash \nhttps://github.com/Shark-NLP/DiffuSeq/issues/5\nhttps://github.com/Shark-NLP/DiffuSeq/issues/22\n```\n\n\n## Citation\nPlease cite our paper if the paper or code helps you.\n\n```\n@inproceedings{HuEACL2024,\n        title = {Flow Matching for Conditional Text Generation in a Few Sampling Steps},\n        author = {Vincent Tao Hu and Di Wu and Yuki M Asano and Pascal Mettes and Basura Fernando and Björn Ommer and Cees G M Snoek},\n        year = {2024},\n        date = {2024-03-27},\n        booktitle = {EACL},\n        tppubtype = {inproceedings}\n        }\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdongzhuoyao%2Fflowseq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdongzhuoyao%2Fflowseq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdongzhuoyao%2Fflowseq/lists"}