https://github.com/showlab/gui-narrator
Repository of GUI Action Narrator
- Host: GitHub
- URL: https://github.com/showlab/gui-narrator
- Owner: showlab
- Created: 2024-06-16T15:27:03.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-08T09:13:32.000Z (3 months ago)
- Last Synced: 2025-06-07T16:13:26.583Z (26 days ago)
- Language: JavaScript
- Size: 19.9 MB
- Stars: 10
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## GUI Action Narrator: Where and When Did That Action Take Place?
Qinchen Wu, Difei Gao, Kevin Qinghong Lin, Zhuoyu Wu, Xiangwu Guo, Peiran Li, Weichen Zhang, Hengxu Wang, Mike Zheng Shou
## 🤖: Introduction
We introduce **Act2Cap**, a GUI action dataset, and **GUI Narrator**, a framework for GUI video captioning that uses cursor detection to improve the interpretation of high-resolution screenshots and keyframe extraction in GUI actions.
## 📋 ToDo List
- [x] Model for Cursor detector and Narrator
- [ ] Code of conduct

Our model and test benchmark are available on [Hugging Face](https://huggingface.co/FRank62Wu/ShowUI-Narrator).