https://github.com/zjunlp/knowledge2data

Spatial Knowledge Graph-Guided Synthesis for Multimodal LLMs

generation kg2data knowledge-graph large-language-models multimodal-large-language-models natural-language-processing spatial-knowledge-graph synthetic-data visual-question-answering


Host: GitHub
URL: https://github.com/zjunlp/knowledge2data
Owner: zjunlp
License: mit
Created: 2024-12-11T14:34:00.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-06-03T12:50:47.000Z (about 1 year ago)
Last Synced: 2025-06-13T23:04:51.985Z (about 1 year ago)
Topics: generation, kg2data, knowledge-graph, large-language-models, multimodal-large-language-models, natural-language-processing, spatial-knowledge-graph, synthetic-data, visual-question-answering
Language: Python
Homepage:
Size: 1.51 MB
Stars: 2
Watchers: 4
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

          


 👉 Knowledge2Data 👈 

Spatial Knowledge Graph-Guided Multimodal Synthesis

[![Awesome](https://awesome.re/badge.svg)](https://github.com/zjunlp/Knowledge2Data) 

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)









Project • Paper • HuggingFace • Overview • Quickstart • Citation





## Table of Contents

- What's New
- Overview
- Quickstart
- Citation

## 🔔News

- **2025-11-01: Our paper has been accepted for publication as a regular paper in IEEE TASLP (Transactions on Audio, Speech and Language Processing).**

- **2025-02-28: We released the paper.**

---

## 🌟Overview







## ⏩Quickstart

### Data

Download the training and test data from HuggingFace: https://huggingface.co/datasets/zjunlp/Knowledge2Data

### Installation

```shell

git clone https://github.com/zjunlp/Knowledge2Data

cd Knowledge2Data

conda create -n skg python==3.9

conda activate skg

pip install -r requirements.txt

```

### Download the models

#### Download the following models from HuggingFace

| 🎯 Model Name                 | 🤗 HuggingFace                                                            |
|-------------------------------|---------------------------------------------------------------------------|
| Diffusers-generation-text-box | [gligen/diffusers-generation-text-box](https://huggingface.co/gligen/diffusers-generation-text-box) |
| Sam-vit-base                  | [facebook/sam-vit-base](https://huggingface.co/facebook/sam-vit-base)      |
| Stable-diffusion-xl-refiner   | [stabilityai/stable-diffusion-xl-refiner-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0)       |
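
One possible way to fetch the three models is the `huggingface-cli download` command. The sketch below only builds and prints the commands so nothing is downloaded by accident. The helper name `build_download_commands` and the directory layout (one subfolder per model, named after the repo) are assumptions made for illustration; check the scripts in `src` for the layout they actually expect.

```python
import os
import shlex

# Repo IDs taken from the model table above.
MODEL_REPOS = [
    "gligen/diffusers-generation-text-box",
    "facebook/sam-vit-base",
    "stabilityai/stable-diffusion-xl-refiner-1.0",
]


def build_download_commands(models_dir, repos=MODEL_REPOS):
    """Return one `huggingface-cli download` command (argv list) per repo.

    ASSUMPTION: each model is stored in <models_dir>/<repo name>.
    This layout is hypothetical and not confirmed by the project.
    """
    commands = []
    for repo_id in repos:
        target = os.path.join(models_dir, repo_id.split("/")[-1])
        commands.append(
            ["huggingface-cli", "download", repo_id, "--local-dir", target]
        )
    return commands


if __name__ == "__main__":
    # Fall back to a local folder when SKG_HF_MODELS is not set yet.
    models_dir = os.environ.get("SKG_HF_MODELS", "./hf_models")
    for argv in build_download_commands(models_dir):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review the target paths before running them in a shell.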

### Export the environment variables.

```shell

cd src

export OPENAI_API_KEY="YOUR_API_KEY"

export SKG_HF_MODELS="LOCAL_HUGGINGFACE_MODELS_DIR"

```
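
Before launching the scripts, it can help to confirm that both variables are set and no longer hold the placeholder values shown above. The following is a minimal sketch, not part of the project; the function name `check_environment` is hypothetical.

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "SKG_HF_MODELS")
# Placeholder strings copied from the export commands above.
PLACEHOLDERS = {"YOUR_API_KEY", "LOCAL_HUGGINGFACE_MODELS_DIR"}


def check_environment(env=None):
    """Return a list of human-readable problems; empty if everything looks fine."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value in PLACEHOLDERS:
            problems.append(f"{name} still holds the placeholder value")
    # The models variable should point at an existing local directory.
    models_dir = env.get("SKG_HF_MODELS", "")
    if models_dir and models_dir not in PLACEHOLDERS and not os.path.isdir(models_dir):
        problems.append(f"SKG_HF_MODELS is not a directory: {models_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result means both variables are present and `SKG_HF_MODELS` points to a directory; it does not verify that the API key is valid or that the models are complete.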

### Generate Spatial KG and multimodal synthetic data.

#### Execute script to generate Spatial KG.

```shell

sh run_skg.sh

```

You can also define your own objects and spatial relationships to form a custom Spatial KG. Save it as a JSON file in the same format as `src/data/skg_demo.json`.
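
The schema of the demo file is not described here, so one way to mirror it is to print its structure (keys and value types) before writing a custom file. The helper below is a generic sketch that makes no assumption about the actual fields; the names `describe_shape` and `describe_file` are hypothetical.

```python
import json


def describe_shape(value, max_depth=3):
    """Return a compact description of a JSON value's structure."""
    if not isinstance(value, (dict, list)):
        # Scalars are reported by type name: int, str, float, bool, NoneType.
        return type(value).__name__
    if max_depth == 0:
        # Nested containers below the depth limit are elided.
        return "..."
    if isinstance(value, dict):
        return {key: describe_shape(item, max_depth - 1) for key, item in value.items()}
    # Describe only the first element; lists are assumed to be homogeneous.
    return [describe_shape(value[0], max_depth - 1)] if value else []


def describe_file(path, max_depth=3):
    """Load a JSON file and describe its structure."""
    with open(path, encoding="utf-8") as handle:
        return describe_shape(json.load(handle), max_depth)
```

For example, running `print(json.dumps(describe_file("src/data/skg_demo.json"), indent=2))` from the repository root shows which keys a custom Spatial KG file would need to provide.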

#### Execute script to generate multimodal synthetic data.

```shell

sh run_data.sh

```

For custom data, only the `--input_file` parameter needs to be changed.

By default, generated data is saved in `src/data` and images in `src/img_generations`.

To generate more data, adjust the parameters `--num_scenes` ([generate_scenes.py](src/generate_scenes.py)) and `--repeats` ([generate_images.py](src/generate_images.py)).

## 🌻Acknowledgement

This project is based on open-source projects including [LLM-groundedDiffusion](https://github.com/TonyLianLong/LLM-groundedDiffusion). Thanks for their great contributions!

## 🚩Citation

Please cite the following paper if you use this project in your work.

```bibtex

@misc{xue2025spatialknowledgegraphguidedmultimodal,

      title={Spatial Knowledge Graph-Guided Multimodal Synthesis}, 

      author={Yida Xue and Zhen Bi and Jinnan Yang and Jungang Lou and Huajun Chen and Ningyu Zhang},

      year={2025},

      eprint={2505.22633},

      archivePrefix={arXiv},

      primaryClass={cs.CL},

      url={https://arxiv.org/abs/2505.22633}, 

}

```

---
