https://github.com/yanx27/clevr3d

CLEVR3D Dataset: Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation
https://github.com/yanx27/clevr3d

point-cloud scene-graph scene-understanding vqa-3d vqa-dataset

Last synced: 5 months ago
JSON representation

CLEVR3D Dataset: Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation

Host: GitHub
URL: https://github.com/yanx27/clevr3d
Owner: yanx27
Created: 2023-05-22T06:36:30.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-02-02T08:20:03.000Z (about 1 year ago)
Last Synced: 2024-07-30T19:57:17.552Z (9 months ago)
Topics: point-cloud, scene-graph, scene-understanding, vqa-3d, vqa-dataset
Language: Python
Homepage:
Size: 5.22 MB
Stars: 11
Watchers: 2
Forks: 1
Open Issues: 4
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# CLEVR3D

**Xu Yan***, **Zhihao Yuan***, Yuhao Du, Yinghong Liao, Yao Guo, Shuguang Cui, and Zhen Li
"Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation
" [[arxiv]](https://arxiv.org/pdf/2112.11691.pdf).

> Our paper is accepted by TVCG (IEEE Transactions on Visualization and Computer Graphics)

![image](img/fig1.png)

If you find our work useful in your research, please consider citing:
```latex
@article{yan2023comprehensive,
title={Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation},
author={Yan, Xu and Yuan, Zhihao and Du, Yuhao and Liao, Yinghong and Guo, Yao and Cui, Shuguang and Li, Zhen},
journal={IEEE Transactions on Visualization \& Computer Graphics},
number={01},
pages={1--13},
year={2023},
publisher={IEEE Computer Society}
}
```

## Installation

### Requirements
- pytorch >= 1.8
- transformers
- [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/)

## Data Preparation
The VQA3D data can be found in `data/CLEVR3D/CLEVR3D-REAL.json`. The data has the following structure:
```
{
"question":[
{
"scan": "f62fd5fd-9a3f-2f44-883a-1e5cf819608e",
"image_index": 0,
"question": "Are there the same number of sofas and wide sinks?",
"answer": "no",
"template_filename": "compare_integer.json",
"question_family_index": 0,
"question_type": "equal_integer"
},
...
]}
```
The scan number is the same as [3RScan](https://github.com/WaldJohannaU/3RScan).
Please download the preprocessed 3RScan data from [Baidu Netdisk](https://pan.baidu.com/s/1q-K79cEeHzUaBJ1ZjkNxvw) (**ifei**). And modify the data path in `lib/config.py`.

## Training
```shell
cd
python main.py --log_dir {LOGNAME} --use_scene_graph --preloading
```

## Evaluation
You cna download our weights from [OneDrive](https://cuhko365-my.sharepoint.com/:u:/g/personal/221019046_link_cuhk_edu_cn/EUZZSwJPTD9Btep3Z2lYa10BqxXJ4ecJydWa_pX5YQk9DQ?e=SkznPm)
```shell
python main.py --test --ckpt_path --use_scene_graph --preloading
```

## Question Generation

The dataset is semi-automatic generated, where an initiating dataset is generated automatically, and some manual modification is applied.

All the files needed for question generation is in the directory of ```data_generation```.

We will generate questions, functional programs, and answers for the scenes. This step takes as input the single JSON file ``` 3dssg_scenes.json``` containing all ground-truth scene information and outputs a JSON file ``` questions.json``` containing questions, answers, and functional programs for the questions.

You can generate initiating questions like this:

```
cd question_generation
python generate_questions.py
```

By default, ``` generate_questions.py``` will generate questions for all scenes in the input file. However, you can generate questions by using other flags like ```--scene_start_idx```.

You can find more details about question generation [here](data_generation/README.md).

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/yanx27/clevr3d

Awesome Lists containing this project

README