https://github.com/zhaowc-ustc/tabpedia
This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
- Host: GitHub
- URL: https://github.com/zhaowc-ustc/tabpedia
- Owner: zhaowc-ustc
- License: apache-2.0
- Created: 2024-08-23T08:29:04.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2024-08-27T13:12:38.000Z (29 days ago)
- Last Synced: 2024-09-22T06:02:15.889Z (4 days ago)
- Topics: deep-learning, large-vision-language-models, pytorch
- Language: Python
- Homepage:
- Size: 2.29 MB
- Stars: 8
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
📃 Paper | 🤗 ComTQA Dataset | 🤗 TabPedia_v1.0

## News
- [2024/08/27] 🔥 The training code is coming soon.
- [2024/08/27] 🔥 Released the inference code for visual table understanding tasks. Due to company copyright restrictions, we use [InternLM2-7B-chat](https://huggingface.co/internlm/internlm2-chat-7b) as the LLM.

## Installation
- This codebase is tested with CUDA 11.8 on A100-SXM-80G GPUs.
```bash
conda create -n TabPedia python=3.10 -y && conda activate TabPedia
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
pip install packaging && pip install ninja && pip install flash-attn==2.3.6 --no-build-isolation --no-cache-dir
pip install -r requirements.txt
git clone https://github.com/InternLM/xtuner.git -b 9bce7b
cd xtuner
pip install -e '.[all]'
```

## Quick Start
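After installing, you can quickly confirm that the pinned packages landed cleanly. This check is a convenience sketch, not part of the repository; it only reads installed package metadata and works even in a partially set up environment:

```python
from importlib import metadata

# Versions pinned by the installation commands above.
expected = {"torch": "2.0.1", "torchvision": "0.15.2", "flash-attn": "2.3.6"}

for pkg, want in expected.items():
    try:
        have = metadata.version(pkg)
        status = "OK" if have == want else f"version mismatch (found {have})"
    except metadata.PackageNotFoundError:
        status = "not installed"
    print(f"{pkg}=={want}: {status}")
```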
* You need to download the official ViT-L/224 from [🤗 Huggingface](https://huggingface.co/openai/clip-vit-large-patch14/tree/main) and save it into ``./pretrained_pth/CLIP-ViT-Large``.
* You need to download our pretrained model from [🤗 TabPedia_v1.0](https://huggingface.co/Zhaowc/TabPedia_v1.0) and save it into ``./pretrained_pth``.
* Update **CLIP_L_224px_pretrained_pth** and **llm_name_or_path** in ``tools/configs/Internlm2_7b_chat_TabPedia.py`` accordingly.
* Finally, run the evaluation script to generate predictions. The results can be found in ``./results``.
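The config edit in the step above might look like the following excerpt. The variable names come from the README, but the exact file contents and paths are assumptions based on where the download steps place the weights:

```python
# tools/configs/Internlm2_7b_chat_TabPedia.py (hypothetical excerpt)
# Point these at wherever you saved the downloaded weights.
CLIP_L_224px_pretrained_pth = './pretrained_pth/CLIP-ViT-Large'
llm_name_or_path = './pretrained_pth/TabPedia_v1.0'
```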
```bash
bash eval_TabPedia.sh
```

## Citation
If you find this work useful, please consider citing our paper:
```
@article{zhao2024tabpedia,
title={TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy},
author={Zhao, Weichao and Feng, Hao and Liu, Qi and Tang, Jingqun and Wei, Shu and Wu, Binghong and Liao, Lei and Ye, Yongjie and Liu, Hao and Li, Houqiang and others},
journal={arXiv preprint arXiv:2406.01326},
year={2024}
}
```

## Acknowledgement
- [XTuner](https://github.com/InternLM/xtuner): the codebase we built upon.