
TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning
====================

**TAT-DQA** is a large-scale Document VQA dataset built by extending [TAT-QA](https://github.com/NExTplusplus/TAT-QA). It aims to stimulate progress in QA research over more complex and realistic visually-rich documents that combine tabular and textual content, especially those requiring numerical reasoning.

You can download our TAT-DQA dataset via [TAT-DQA Dataset](https://drive.google.com/drive/folders/1SGpZyRWqycMd_dZim1ygvWhl5KdJYDR2).
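
After downloading, it can help to check what the folder actually contains before writing any loading code. The snippet below is a minimal sketch that only counts files by extension; it assumes nothing about the dataset's file names or schema, and both the helper name `summarize_folder` and the local path `TAT-DQA` are hypothetical placeholders.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(root):
    """Count files under `root` (recursively) by lowercase extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without a suffix are grouped under a readable label.
            counts[path.suffix.lower() or "<no extension>"] += 1
    return dict(counts)


if __name__ == "__main__":
    # "TAT-DQA" is a hypothetical local path; point it at your own download.
    folder = Path("TAT-DQA")
    if folder.is_dir():
        for ext, n in sorted(summarize_folder(folder).items()):
            print(f"{ext}: {n}")
    else:
        print(f"{folder} not found; download the dataset first.")
```

For the actual file layout and annotation format, refer to the website and paper linked below.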

For more information, please refer to our [TAT-DQA Website](https://nextplusplus.github.io/TAT-DQA/) or read our ACM MM 2022 paper [PDF](https://arxiv.org/pdf/2207.11871.pdf).

### Updates

**${\color{red}Jan 2024}$**: We have released the ground truth for the TAT-DQA test set in the [TAT-DQA Dataset](https://drive.google.com/drive/folders/1SGpZyRWqycMd_dZim1ygvWhl5KdJYDR2) folder to facilitate future research on this task!

**${\color{red}May 2023}$**: Source code released! You are welcome to use the [Doc2SoarGraph repo](https://github.com/fengbinzhu/Doc2SoarGraph) to explore the TAT-DQA dataset and start your research!

### Citation

__Please cite our work if you use our dataset or code. Thank you.__
```bibtex
@inproceedings{zhu2022towards,
  title     = {Towards complex document understanding by discrete reasoning},
  author    = {Zhu, Fengbin and Lei, Wenqiang and Feng, Fuli and Wang, Chao and Zhang, Haozhou and Chua, Tat-Seng},
  booktitle = {Proceedings of the 30th ACM International Conference on Multimedia},
  pages     = {4857--4866},
  year      = {2022}
}

@inproceedings{zhu2024doc2soargraph,
  title     = {{D}oc2{S}oar{G}raph: Discrete Reasoning over Visually-Rich Table-Text Documents via Semantic-Oriented Hierarchical Graphs},
  author    = {Zhu, Fengbin and Wang, Chao and Feng, Fuli and Ren, Zifeng and Li, Moxin and Chua, Tat-Seng},
  editor    = {Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen},
  booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
  year      = {2024},
  address   = {Torino, Italia},
  publisher = {ELRA and ICCL},
  url       = {https://aclanthology.org/2024.lrec-main.456},
  pages     = {5119--5131}
}
```

### License

The TAT-DQA dataset is licensed under the [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/) license.

### Any Questions?

If you run into any problems, please create an issue [here](https://github.com/nextplusplus/TAT-DQA/issues) or email Fengbin Zhu at [[email protected]](mailto:[email protected]). Thank you.