{"id":24191203,"url":"https://github.com/kingnobro/chat2svg","last_synced_at":"2025-03-03T04:13:25.147Z","repository":{"id":272105588,"uuid":"915543312","full_name":"kingnobro/Chat2SVG","owner":"kingnobro","description":"Code of \"Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models\"","archived":false,"fork":false,"pushed_at":"2025-02-24T18:48:38.000Z","size":2429,"stargazers_count":14,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-24T19:28:34.475Z","etag":null,"topics":["large-language-model","stable-diffusion","svg","svg-generation","text-to-svg"],"latest_commit_sha":null,"homepage":"https://chat2svg.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kingnobro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-12T05:57:19.000Z","updated_at":"2025-02-24T18:48:42.000Z","dependencies_parsed_at":"2025-01-12T07:18:03.327Z","dependency_job_id":"d07ca02b-6784-44f2-a0f4-12f5e3b69162","html_url":"https://github.com/kingnobro/Chat2SVG","commit_stats":null,"previous_names":["kingnobro/chat2svg"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingnobro%2FChat2SVG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingnobro%2FChat2SVG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kingnobro%2FChat2SVG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/kingnobro%2FChat2SVG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kingnobro","download_url":"https://codeload.github.com/kingnobro/Chat2SVG/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241605819,"owners_count":19989612,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["large-language-model","stable-diffusion","svg","svg-generation","text-to-svg"],"created_at":"2025-01-13T15:17:37.973Z","updated_at":"2025-03-03T04:13:25.140Z","avatar_url":"https://github.com/kingnobro.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models\n\n[![arXiv](https://img.shields.io/badge/arXiv-2312.16476-b31b1b.svg)](https://arxiv.org/abs/2411.16602)\n[![website](https://img.shields.io/badge/Website-Gitpage-4CCD99)](https://chat2svg.github.io/)\n\n![title](./assets/teaser.png)\n\n## Overview\n\nChat2SVG is a framework for generating vector graphics using large language models and image diffusion models. 
The system works in multiple stages to generate, enhance, and optimize SVG from text descriptions.\n\n\n## TODO List\n- [x] SVG template generation with Large Language Models\n- [x] Detail enhancement with image diffusion models\n- [x] SVG shape optimization\n\n\n## Setup\nClone the repository and create the conda environment:\n```shell\ngit clone git@github.com:kingnobro/Chat2SVG.git\ncd Chat2SVG\nconda create --name chat2svg python=3.10\nconda activate chat2svg\n```\n\nInstall PyTorch and other dependencies:\n```shell\nconda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1  pytorch-cuda=11.8 -c pytorch -c nvidia\npip install git+https://github.com/facebookresearch/segment-anything.git\npip install -r requirements.txt\n```\n\nInstall [diffvg](https://github.com/BachiLi/diffvg) for differentiable rendering:\n```shell\ngit clone https://github.com/BachiLi/diffvg.git\ncd diffvg\ngit submodule update --init --recursive\nconda install -y -c anaconda cmake\nconda install -y -c conda-forge ffmpeg\npip install svgwrite svgpathtools cssutils torch-tools\npython setup.py install\ncd ..\n```\n\nInstall [picosvg](https://github.com/googlefonts/picosvg) for SVG cleaning:\n```shell\ngit clone git@github.com:googlefonts/picosvg.git\ncd picosvg\npip install -e .\ncd ..\n```\n\n## Pipeline 🖌\n\n\u003e [!TIP]\n\u003e We provide two ways to generate SVG templates:\n\u003e 1. If you want to **create high-quality SVG**, we recommend checking the output of each stage to ensure the generated SVG meets \"human-preferred\" criteria.\n\u003e 2. If you want to **compare the performance** of our method with your own SVG generation method, we also provide a simple way to automatically generate all outputs.\n\n\u003e [!CAUTION]\n\u003e Anthropic and OpenAI do not offer API access in Hong Kong. Therefore, I use a third-party API from [WildCard](https://bewildcard.com/) to forward requests to Claude. 
If you are in a region where you can access Anthropic/OpenAI directly, you can modify lines 64-65 in `utils/gpt.py` to use the original Anthropic API. Additional modifications may be required. Sorry for the inconvenience.\n\n## Step-By-Step Pipeline (For High-Quality SVG 🎨)\n\n\u003e We have provided some sample generation and intermediate results in the `output/example_generation` folder. You can check them to get a better understanding of the pipeline.\n\n### Stage 1: Template Generation\n\nFirst, paste your Anthropic API key into the `.env` file:\n```shell\nOPENAI_API_KEY=\u003cyour_key\u003e\n```\n\nThen, run the following command to generate SVG templates:\n```shell\ncd 1_template_generation\nbash run.sh\n```\n- The detailed prompts of each target object can be found in `utils/util.py → get_prompt()`.\n- Output files will be saved in `output/example_generation/stage_1` folder.\n- To visualize/edit the SVG results, we recommend using the [SVG](https://marketplace.visualstudio.com/items?itemName=jock.svg) and [SVG Editor](https://marketplace.visualstudio.com/items?itemName=henoc.svgeditor) plugins of VSCode.\n- Since multiple SVG templates are generated, we use [ImageReward](https://github.com/THUDM/ImageReward) or [CLIP](https://github.com/openai/CLIP) to select the best one for the next stage. You can also manually select the best SVG template based on your own preference.\n- Finally, there should be a `target_template.svg` (e.g., `apple_template.svg`) file in the root directory.\n\n\u003e [!TIP]\n\u003e Our visual rectification process can solve common issues in SVG. However, we've observed that in some cases, VLM may actually degrade the quality of the SVG during rectification. 
We recommend double-checking the output before and after rectification to ensure the best results.\n\n### Stage 2: Detail Enhancement\n\n```shell\ncd 2_detail_enhancement\nbash download_models.sh  # download pretrained model weights\nbash run.sh              # detail enhancement\n```\n\nThe above command will:\n- clean SVG templates using picosvg (convert shapes to cubic Bézier curves), output `apple_clean.svg`\n- generate target images using [SDXL](https://civitai.com/models/269232/aam-xl-anime-mix) and [ControlNet](https://huggingface.co/xinsir/controlnet-tile-sdxl-1.0), output `apple_target.png`\n- use [Segment Anything Model (SAM)](https://github.com/facebookresearch/segment-anything) to add new shapes, output `apple_with_new_shape.svg`\n\n\u003e [!TIP]\n\u003e 1. Adjust the `strength` to control the strength of the SDEdit (Image to Image). We recommend `0.75` for mild enhancement and `1.0` for strong enhancement.\n\u003e 2. The default number of generated target images is `4`, and we select the **first one** as the default target image. You can check all generated images to select your preferred one.\n\u003e 3. Adjust `points_per_side` in SAM to control the granularity of the added shapes, and adjust `thresh_iou` to control the threshold that determines whether a shape is a new shape or not.\n\u003e 4. As mentioned in the paper's limitation section, SAM sometimes may not add appropriate shapes. Please check the output and modify if necessary.\n\n\n### Stage 3: SVG Shape Optimization\n```shell\ncd 3_svg_optimization\nbash download_models.sh  # download pretrained SVG VAE model\nbash run.sh              # optimize SVG shapes (GPU consumption: less than 4GB)\n```\n\n\u003e [!TIP]\n\u003e 1. We turn off `enable_path_iou_loss` by default, which can greatly improve time efficiency. To avoid path semantic meaning shifts, you can set it to `True`.\n\u003e 2. We proportionally scale up the loss weights (different from the paper) to ensure faster convergence.\n\u003e 3. 
Results: `apple_optim_latent.svg` and `apple_optim_point.svg`\n\n## Automated Pipeline (For Comparison ⚖️)\nCode coming soon. Alternatively, you can enter each folder and run the `run.sh` script to generate all outputs.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingnobro%2Fchat2svg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkingnobro%2Fchat2svg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingnobro%2Fchat2svg/lists"}