https://github.com/zhaowc-ustc/tabpedia
This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
- Host: GitHub
- URL: https://github.com/zhaowc-ustc/tabpedia
- Owner: zhaowc-ustc
- License: apache-2.0
- Created: 2024-08-23T08:29:04.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2024-08-27T13:12:38.000Z (29 days ago)
- Last Synced: 2024-09-22T06:02:15.889Z (4 days ago)
- Topics: deep-learning, large-vision-language-models, pytorch
- Language: Python
- Homepage:
- Size: 2.29 MB
- Stars: 8
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
📃 Paper | 🤗 ComTQA Dataset | 🤗 TabPedia_v1.0

## News
- [2024/08/27] 🔥 The training code is coming soon.
- [2024/08/27] 🔥 Released the inference code for visual table understanding tasks. Due to company copyright restrictions, we use [InternLM2-7B-chat](https://huggingface.co/internlm/internlm2-chat-7b) as the LLM.

## Installation
- This codebase is tested with CUDA 11.8 on A100-SXM-80G GPUs.
```bash
conda create -n TabPedia python=3.10 -y && conda activate TabPedia
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
pip install packaging && pip install ninja && pip install flash-attn==2.3.6 --no-build-isolation --no-cache-dir
pip install -r requirements.txt
git clone https://github.com/InternLM/xtuner.git -b 9bce7b
cd xtuner
pip install -e '.[all]'
```

## Quick Start
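After installing, you can quickly confirm that the pinned packages landed cleanly. This check is a convenience sketch, not part of the repository; it only reads installed package metadata and works even in a partially set up environment:

```python
from importlib import metadata

# Versions pinned by the installation commands above.
expected = {"torch": "2.0.1", "torchvision": "0.15.2", "flash-attn": "2.3.6"}

for pkg, want in expected.items():
    try:
        have = metadata.version(pkg)
        status = "OK" if have == want else f"version mismatch (found {have})"
    except metadata.PackageNotFoundError:
        status = "not installed"
    print(f"{pkg}=={want}: {status}")
```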
* You need to download the official ViT-L/224 from [🤗 Huggingface](https://huggingface.co/openai/clip-vit-large-patch14/tree/main) and save it into ``./pretrained_pth/CLIP-ViT-Large``.
* You need to download our pretrained model from [🤗 TabPedia_v1.0](https://huggingface.co/Zhaowc/TabPedia_v1.0) and save it into ``./pretrained_pth``.
* Update **CLIP_L_224px_pretrained_pth** and **llm_name_or_path** in ``tools/configs/Internlm2_7b_chat_TabPedia.py`` accordingly.
* Finally, run the evaluation script to generate predictions. The results can be found in ``./results``.
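The config edit in the step above might look like the following excerpt. The variable names come from the README, but the exact file contents and paths are assumptions based on where the download steps place the weights:

```python
# tools/configs/Internlm2_7b_chat_TabPedia.py (hypothetical excerpt)
# Point these at wherever you saved the downloaded weights.
CLIP_L_224px_pretrained_pth = './pretrained_pth/CLIP-ViT-Large'
llm_name_or_path = './pretrained_pth/TabPedia_v1.0'
```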
```bash
bash eval_TabPedia.sh
```

## Citation
If you find this work useful, please consider citing our paper:
```
@article{zhao2024tabpedia,
title={TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy},
author={Zhao, Weichao and Feng, Hao and Liu, Qi and Tang, Jingqun and Wei, Shu and Wu, Binghong and Liao, Lei and Ye, Yongjie and Liu, Hao and Li, Houqiang and others},
journal={arXiv preprint arXiv:2406.01326},
year={2024}
}
```

## Acknowledgement
- [XTuner](https://github.com/InternLM/xtuner): the codebase we built upon.