https://github.com/tiger-ai-lab/pixelworld
The official code of "PixelWorld: Towards Perceiving Everything as Pixels" [TMLR25]
- Host: GitHub
- URL: https://github.com/tiger-ai-lab/pixelworld
- Owner: TIGER-AI-Lab
- Created: 2025-01-31T15:46:12.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-09-12T22:02:20.000Z (9 months ago)
- Last Synced: 2025-09-13T00:18:32.330Z (9 months ago)
- Topics: llm, vlm
- Language: Python
- Homepage: https://tiger-ai-lab.github.io/PixelWorld/
- Size: 10.3 MB
- Stars: 15
- Watchers: 2
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# PixelWorld
The official code of our TMLR-2025 paper [PixelWorld: Towards Perceiving Everything as Pixels](https://arxiv.org/abs/2501.19339).
Note: the codebase is currently being refactored, so some imports and cross-references between modules may be broken.
## Installation
```bash
pip install -r requirements.txt
```
## Get Started
### From Local Files
```bash
python data.py --dataset WikiSS_QADataset --model GPT4o --mode text --prompt base
```
### From Huggingface Dataset
```bash
python data.py --dataset WikiSS_QADataset --model GPT4o --mode text --prompt base --from_hf
```
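To script several runs, the commands above can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_command` is a hypothetical helper, and it uses only the flags shown in the examples above (`--dataset`, `--model`, `--mode`, `--prompt`, `--from_hf`). It assumes it is run from the repository root, where `data.py` lives.

```python
import sys


def build_command(dataset, model, mode, prompt, from_hf=False):
    """Build the argv list for data.py (hypothetical helper, not in the repo).

    Only the flags that appear in the README examples are used.
    """
    cmd = [
        sys.executable, "data.py",
        "--dataset", dataset,
        "--model", model,
        "--mode", mode,
        "--prompt", prompt,
    ]
    if from_hf:
        # Load the dataset from Huggingface instead of local files.
        cmd.append("--from_hf")
    return cmd


if __name__ == "__main__":
    cmd = build_command("WikiSS_QADataset", "GPT4o", "text", "base", from_hf=True)
    # Print the command for inspection; pass `cmd` to
    # subprocess.run(cmd, check=True) to actually launch the evaluation.
    print(" ".join(cmd))
```

Building the arguments as a list rather than a single shell string avoids quoting issues if a dataset or model name ever contains spaces.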
## Project Site
[PixelWorld](https://tiger-ai-lab.github.io/PixelWorld/)
## Citation
```bibtex
@article{lyu2024pixelworld,
title={PixelWorld: Towards Perceiving Everything as Pixels},
author={Lyu, Zhiheng and Ma, Xueguang and Chen, Wenhu},
year={2025},
eprint={2501.19339},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={http://arxiv.org/abs/2501.19339},
}
```