TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning
- Host: GitHub
- URL: https://nextplusplus.github.io/TAT-DQA/
- Owner: NExTplusplus
- Created: 2022-08-17T14:28:04.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2024-09-17T04:25:26.000Z (7 months ago)
- Last Synced: 2024-10-12T11:03:40.633Z (6 months ago)
- Topics: document-understanding, question-answering, vqa
- Homepage: https://nextplusplus.github.io/TAT-DQA/
- Size: 1.01 MB
- Stars: 19
- Watchers: 5
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- Awesome-LLM - TAT-DQA - a large-scale Document Visual Question Answering (VQA) dataset designed for complex document understanding, particularly in financial reports. (LLM Leaderboard)
README
TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning
====================

**TAT-DQA** is a large-scale Document VQA dataset constructed by extending [TAT-QA](https://github.com/NExTplusplus/TAT-QA). It aims to stimulate progress in QA research over more complex and realistic visually-rich documents with rich tabular and textual content, especially those requiring numerical reasoning.
You can download our TAT-DQA dataset via [TAT-DQA Dataset](https://drive.google.com/drive/folders/1SGpZyRWqycMd_dZim1ygvWhl5KdJYDR2).
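Once downloaded, a minimal Python sketch like the one below can help inspect a split. Note that the file name `tatdqa_dataset_dev.json` and the record layout (one entry per document, each carrying a `questions` list with `question` and `answer` fields) are assumptions based on the companion TAT-QA format, not something this README specifies; check the actual files in the Drive folder.

```python
import json

# Minimal sketch for inspecting a TAT-DQA split after download.
# NOTE: the file name and field names below are assumptions; verify them
# against the actual JSON files in the Google Drive folder.
with open("tatdqa_dataset_dev.json", encoding="utf-8") as f:
    data = json.load(f)

print(f"{len(data)} records in this split")

# Each record is assumed to pair one document with its questions.
first = data[0]
for qa in first.get("questions", []):
    print(qa.get("question"), "->", qa.get("answer"))
```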
For more information, please refer to our [TAT-DQA Website](https://nextplusplus.github.io/TAT-DQA/) or read our ACM MM 2022 paper [PDF](https://arxiv.org/pdf/2207.11871.pdf).

### Updates
**${\color{red}Jan 2024}$**: We have released the ground truth for the TAT-DQA test set ([TAT-DQA Dataset](https://drive.google.com/drive/folders/1SGpZyRWqycMd_dZim1ygvWhl5KdJYDR2)) to facilitate future research on this task!
**${\color{red}May 2023}$**: Source Code released! You are welcome to use the [Doc2SoarGraph repo](https://github.com/fengbinzhu/Doc2SoarGraph) to explore the TAT-DQA dataset and start your research!
### Citation
__Please kindly cite our work if you use our dataset or codes, thank you.__
```bibtex
@inproceedings{zhu2022towards,
title={Towards complex document understanding by discrete reasoning},
author={Zhu, Fengbin and Lei, Wenqiang and Feng, Fuli and Wang, Chao and Zhang, Haozhou and Chua, Tat-Seng},
booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
pages={4857--4866},
year={2022}
}

@inproceedings{zhu2024doc2soargraph,
title = "{D}oc2{S}oar{G}raph: Discrete Reasoning over Visually-Rich Table-Text Documents via Semantic-Oriented Hierarchical Graphs",
author = "Zhu, Fengbin and
Wang, Chao and
Feng, Fuli and
Ren, Zifeng and
Li, Moxin and
Chua, Tat-Seng",
editor = "Calzolari, Nicoletta and
Kan, Min-Yen and
Hoste, Veronique and
Lenci, Alessandro and
Sakti, Sakriani and
Xue, Nianwen",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
year = "2024",
address = "Torino, Italia",
publisher = "ELRA and ICCL",
url = "https://aclanthology.org/2024.lrec-main.456",
pages = "5119--5131"
}
```

### License
The TAT-DQA dataset is released under the [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/) license.
### Any Questions?
For any issues, please create an issue [here](https://github.com/nextplusplus/TAT-DQA/issues) or kindly email us at:
Fengbin Zhu [[email protected]](mailto:[email protected]). Thank you!