https://github.com/yanx27/clevr3d
CLEVR3D Dataset: Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation
point-cloud scene-graph scene-understanding vqa-3d vqa-dataset
- Host: GitHub
- URL: https://github.com/yanx27/clevr3d
- Owner: yanx27
- Created: 2023-05-22T06:36:30.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-02-02T08:20:03.000Z (12 months ago)
- Last Synced: 2024-07-30T19:57:17.552Z (6 months ago)
- Topics: point-cloud, scene-graph, scene-understanding, vqa-3d, vqa-dataset
- Language: Python
- Homepage:
- Size: 5.22 MB
- Stars: 11
- Watchers: 2
- Forks: 1
- Open Issues: 4
Metadata Files:
- Readme: README.md
# CLEVR3D
**Xu Yan***, **Zhihao Yuan***, Yuhao Du, Yinghong Liao, Yao Guo, Shuguang Cui, and Zhen Li
"Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation" [[arxiv]](https://arxiv.org/pdf/2112.11691.pdf).

> Our paper is accepted by TVCG (IEEE Transactions on Visualization and Computer Graphics)

![image](img/fig1.png)
If you find our work useful in your research, please consider citing:
```latex
@article{yan2023comprehensive,
title={Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation},
author={Yan, Xu and Yuan, Zhihao and Du, Yuhao and Liao, Yinghong and Guo, Yao and Cui, Shuguang and Li, Zhen},
journal={IEEE Transactions on Visualization \& Computer Graphics},
number={01},
pages={1--13},
year={2023},
publisher={IEEE Computer Society}
}
```

## Installation

### Requirements
- pytorch >= 1.8
- transformers
- [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/)

## Data Preparation

The VQA3D data can be found in `data/CLEVR3D/CLEVR3D-REAL.json`. The data has the following structure:
```
{
"question":[
{
"scan": "f62fd5fd-9a3f-2f44-883a-1e5cf819608e",
"image_index": 0,
"question": "Are there the same number of sofas and wide sinks?",
"answer": "no",
"template_filename": "compare_integer.json",
"question_family_index": 0,
"question_type": "equal_integer"
},
...
]}
```
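
As a quick sanity check on the structure above, the records can be loaded and summarized with a few lines of Python. This is a minimal sketch, not part of the repository: the helper names `load_questions` and `summarize` are hypothetical, and it assumes only the fields shown in the example record (`scan`, `question_type`) under the top-level `"question"` key.

```python
import json
from collections import Counter, defaultdict


def load_questions(path):
    """Return the list of question records stored under the "question" key."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["question"]


def summarize(questions):
    """Count questions per type and group the records by their scan ID."""
    # Number of questions for each value of "question_type".
    type_counts = Counter(q["question_type"] for q in questions)

    # Map each scan ID to the list of questions asked about that scan.
    by_scan = defaultdict(list)
    for q in questions:
        by_scan[q["scan"]].append(q)

    return type_counts, dict(by_scan)
```

Calling `summarize(load_questions("data/CLEVR3D/CLEVR3D-REAL.json"))` from the repository root would return the per-type counts and a dictionary keyed by scan ID, which is a convenient way to check which scans need to be present before training.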
The scan IDs are the same as those used in [3RScan](https://github.com/WaldJohannaU/3RScan).
Please download the preprocessed 3RScan data from [Baidu Netdisk](https://pan.baidu.com/s/1q-K79cEeHzUaBJ1ZjkNxvw) (**ifei**), then modify the data path in `lib/config.py`.

## Training

```shell
cd
python main.py --log_dir {LOGNAME} --use_scene_graph --preloading
```

## Evaluation

You can download our weights from [OneDrive](https://cuhko365-my.sharepoint.com/:u:/g/personal/221019046_link_cuhk_edu_cn/EUZZSwJPTD9Btep3Z2lYa10BqxXJ4ecJydWa_pX5YQk9DQ?e=SkznPm).
```shell
python main.py --test --ckpt_path --use_scene_graph --preloading
```

## Question Generation

The dataset is generated semi-automatically: an initial set of questions is generated automatically, and some manual modifications are then applied.
All the files needed for question generation are in the `data_generation` directory.

This step generates questions, functional programs, and answers for the scenes. It takes as input a single JSON file, `3dssg_scenes.json`, containing all ground-truth scene information, and outputs a JSON file, `questions.json`, containing the questions, answers, and functional programs.

You can generate the initial questions like this:
```
cd question_generation
python generate_questions.py
```

By default, `generate_questions.py` generates questions for all scenes in the input file. You can change this behavior with other flags such as `--scene_start_idx`.
You can find more details about question generation [here](data_generation/README.md).