{"id":23441652,"url":"https://github.com/blurgyy/compass","last_synced_at":"2026-01-22T02:02:51.811Z","repository":{"id":268716843,"uuid":"901833176","full_name":"blurgyy/CoMPaSS","owner":"blurgyy","description":"[ICCV 2025] Enhancing spatial understanding in text-to-Image diffusion models","archived":false,"fork":false,"pushed_at":"2025-09-01T03:43:26.000Z","size":738,"stargazers_count":83,"open_issues_count":3,"forks_count":6,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-09-09T17:39:15.762Z","etag":null,"topics":["diffusion","generation","spatial-understanding","t2i","text-to-image"],"latest_commit_sha":null,"homepage":"https://compass.blurgy.xyz","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/blurgyy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-12-11T11:58:39.000Z","updated_at":"2025-09-09T12:48:28.000Z","dependencies_parsed_at":"2024-12-18T14:23:27.934Z","dependency_job_id":"66f035c6-212d-4678-a880-5d13ac0a25c9","html_url":"https://github.com/blurgyy/CoMPaSS","commit_stats":null,"previous_names":["blurgyy/compass"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/blurgyy/CoMPaSS","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blurgyy%2FCoMPaSS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blurgyy%2FCoMPaSS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blu
rgyy%2FCoMPaSS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blurgyy%2FCoMPaSS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/blurgyy","download_url":"https://codeload.github.com/blurgyy/CoMPaSS/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blurgyy%2FCoMPaSS/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28650573,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-22T01:17:37.254Z","status":"online","status_checked_at":"2026-01-22T02:00:07.137Z","response_time":144,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["diffusion","generation","spatial-understanding","t2i","text-to-image"],"created_at":"2024-12-23T17:17:43.853Z","updated_at":"2026-01-22T02:02:51.806Z","avatar_url":"https://github.com/blurgyy.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models\n\n**\\[[Project Page]\\]\n\\[[arXiv]\\]\n\\[[ComfyUI node]\\]**\n\n\u003e [Gaoyang Zhang], Bingtao Fu, [Qingnan Fan], [Qi Zhang], Runxing Liu, Hong Gu, Huaqi Zhang, Xinguo Liu  \n\u003e ICCV 2025\n\n## TL; DR\n\nCoMPaSS enhances the spatial understanding of existing text-to-image diffusion models, enabling\nthem to generate images that faithfully reflect spatial configurations specified in 
the\ntext prompt.\n\n![teaser](./assets/teaser.avif)\n\n## Setting up Environment\n\nWe manage our Python environment with [uv], and provide a convenient script for setting\nup the environment at [setup_env.sh](./setup_env.sh).\nRunning this script will create a subdirectory `.venv/` in the project root.  To enable\nit, run `source .venv/bin/activate` after the environment is set up:\n\n```bash\n# install requirements into .venv/\nbash ./setup_env.sh\n\n# activate the environment\nsource .venv/bin/activate\n```\n\n## Trying out CoMPaSS\n\n\u003e [!NOTE]\n\u003e For training, SCOP and TENOR are both required.  \n\u003e For generating images from text, only TENOR and the reference weights are needed.\n\n### ComfyUI\n\nWe recommend trying out the FLUX.1-dev LoRA trained via CoMPaSS. Please refer to [the\ncustom node's repository][ComfyUI node] to get started.\n\n### Reference Weights\n\nWe provide the reference weights used to report all metrics in our paper on Hugging\nFace 🤗.\nWe recommend trying out the FLUX.1-dev weights, as they are a rank-16 LoRA that is only\n50 MB in size.\n\n| Model | Link |\n|:-----:|:-----:|\n| FLUX.1-dev | \u003chttps://huggingface.co/blurgy/CoMPaSS-FLUX.1\u003e |\n| SD1.4 | \u003chttps://huggingface.co/blurgy/CoMPaSS-SD1.4\u003e |\n| SD1.5 | \u003chttps://huggingface.co/blurgy/CoMPaSS-SD1.5\u003e |\n| SD2.1 | \u003chttps://huggingface.co/blurgy/CoMPaSS-SD2.1\u003e |\n\n### The SCOP Dataset\n\nWe provide full instructions for replicating the SCOP dataset (28,028 object pairs among\n15,426 images) in the [SCOP](./SCOP) directory.  Check out its [README](./SCOP/README.md)\nto get started.\n\n### The TENOR Module\n\nWe provide both training and inference instructions for using our TENOR module in the\n[TENOR](./TENOR) directory.\nMMDiT-based models (e.g., FLUX.1-dev) and UNet-based models (e.g., SD1.5) are both\nsupported.  
Check out their respective instructions to get started:\n- [Instructions for FLUX.1-dev](./TENOR/flux/README.md)\n- [Instructions for SD1.4, SD1.5, and SD2.1](./TENOR/sd/README.md)\n\n## Citation\n\n```bibtex\n@inproceedings{zhang2025compass,\n  title={CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models},\n  author={Zhang, Gaoyang and Fu, Bingtao and Fan, Qingnan and Zhang, Qi and Liu, Runxing and Gu, Hong and Zhang, Huaqi and Liu, Xinguo},\n  booktitle={ICCV},\n  year={2025}\n}\n```\n\n[Gaoyang Zhang]: \u003chttps://github.com/blurgyy\u003e\n[Qingnan Fan]: \u003chttps://fqnchina.github.io\u003e\n[Qi Zhang]: \u003chttps://qzhang-cv.github.io\u003e\n\n[Project Page]: \u003chttps://compass.blurgy.xyz\u003e\n[arXiv]: \u003chttps://arxiv.org/abs/2412.13195\u003e\n[ComfyUI node]: \u003chttps://github.com/blurgyy/CoMPaSS-FLUX.1-dev-ComfyUI\u003e\n\n[uv]: \u003chttps://github.com/astral-sh/uv\u003e\n\n[TokenCompose]: \u003chttps://github.com/mlpc-ucsd/TokenCompose\u003e\n[x-flux]: \u003chttps://github.com/XLabs-AI/x-flux\u003e\n\n\u003c!-- vim: set ts=2 sts=2 sw=2 et: --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblurgyy%2Fcompass","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fblurgyy%2Fcompass","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblurgyy%2Fcompass/lists"}