[CVPR 2025] EgoLife: Towards Egocentric Life Assistant
https://github.com/evolvinglmms-lab/egolife


# The EgoLife Project

![teaser.png](assets/egolife_teaser.png)

Figure 1. Overview of the EgoLife project. EgoLife is an ambitious egocentric AI project that captures the multimodal daily activities of six participants over a week. Using Meta Aria glasses, synchronized third-person cameras, and mmWave sensors, it provides a rich dataset for long-term video understanding. Building on this dataset, the project enables AI assistants, powered by EgoGPT and EgoRAG, to support memory, habit tracking, event recall, and task management, advancing real-world egocentric AI applications.

## 🚀 News
🤹 2025-02: We provide a [HuggingFace Gradio demo]() and a [self-deployed demo]() for EgoGPT.

🌟 2025-02: The EgoLife videos are released on [HuggingFace](https://huggingface.co/datasets/lmms-lab/EgoLife) and uploaded to [YouTube](https://www.youtube.com/playlist?list=PLlweuFnfdo6F9Fu2Kyhc-kXu3qnaVsYOu) as a video collection.

🌟 2025-02: We release the EgoIT-99K dataset at [HuggingFace](https://huggingface.co/collections/lmms-lab/egolife-67c04574c2a9b64ab312c342).

🌟 2025-02: We release the first version of the [EgoGPT](./EgoGPT/) and [EgoRAG](./EgoRAG/) codebases.

📖 2025-02: Our arXiv submission is currently on hold. For an overview, please visit our [academic page](https://egolife-ai.github.io/blog/).

🎉 2025-02: The paper has been accepted to CVPR 2025. You are invited to visit our [online EgoHouse](https://egolife-ai.github.io/).

## What is in this repo?
### 🧠 EgoGPT: Clip-Level Multimodal Understanding
EgoGPT is an **omni-modal vision-language model** fine-tuned on egocentric datasets. It performs **continuous video captioning**, extracting key events, actions, and context from first-person video and audio streams.

**Key Features:**
- **Dense captioning** for visual and auditory events.
- **Fine-tuned for egocentric scenarios** (optimized for EgoLife data).
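
To illustrate what clip-level continuous captioning might look like, here is a minimal sketch that is not taken from the EgoGPT codebase. It assumes a recording is cut into fixed-length windows and that each window is passed to a captioning function. The names `ClipCaption`, `split_into_clips`, `caption_stream`, and `dummy_captioner`, as well as the 30-second default window, are hypothetical; a real captioner would call the model on the clip's video and audio.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple


@dataclass
class ClipCaption:
    start_s: float  # clip start, in seconds from the beginning of the recording
    end_s: float    # clip end, in seconds
    text: str       # caption produced for this clip


def split_into_clips(duration_s: float, clip_s: float = 30.0) -> Iterator[Tuple[float, float]]:
    """Yield consecutive (start, end) windows that cover [0, duration_s]."""
    if clip_s <= 0:
        raise ValueError("clip_s must be positive")
    start = 0.0
    while start < duration_s:
        end = min(start + clip_s, duration_s)  # the last clip may be shorter
        yield (start, end)
        start = end


def caption_stream(
    duration_s: float,
    captioner: Callable[[float, float], str],
    clip_s: float = 30.0,
) -> List[ClipCaption]:
    """Caption every clip in order and keep the timestamps with each caption."""
    return [
        ClipCaption(start, end, captioner(start, end))
        for start, end in split_into_clips(duration_s, clip_s)
    ]


def dummy_captioner(start_s: float, end_s: float) -> str:
    """Stand-in for a model call (assumption): returns a placeholder string."""
    return f"[placeholder caption for {start_s:.0f}s-{end_s:.0f}s]"
```

Keeping the start and end times next to each caption is what lets a downstream retrieval step answer time-related questions.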

### 📖 EgoRAG: Long-Context Question Answering
EgoRAG is a **retrieval-augmented generation (RAG) module** that enables long-term reasoning and memory reconstruction. It retrieves **relevant past events** and synthesizes contextualized answers to user queries.

**Key Features:**
- **Hierarchical memory bank** (hourly, daily summaries).
- **Time-stamped retrieval** for context-aware Q&A.
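
The hierarchical memory idea can be sketched as a coarse-to-fine lookup: pick the most relevant day, then the most relevant hour within it, then rank the clip captions in that hour. The code below is an illustrative sketch under stated assumptions, not the EgoRAG implementation. The names `Entry`, `MemoryBank`, and `overlap` are hypothetical, summaries are simple concatenations of captions, and relevance is a toy word-overlap score standing in for whatever retriever the real system uses.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Entry:
    day: int   # day index within the recording period
    hour: int  # hour of day, 0-23
    text: str  # clip-level caption


def overlap(query: str, text: str) -> int:
    """Toy relevance score (assumption): number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


class MemoryBank:
    """Clip captions grouped by (day, hour), with summaries built on demand."""

    def __init__(self) -> None:
        self.clips: Dict[Tuple[int, int], List[Entry]] = {}

    def add(self, entry: Entry) -> None:
        self.clips.setdefault((entry.day, entry.hour), []).append(entry)

    def hour_summary(self, day: int, hour: int) -> str:
        # Placeholder summary: concatenate the captions for that hour.
        return " ".join(e.text for e in self.clips.get((day, hour), []))

    def day_summary(self, day: int) -> str:
        # Placeholder summary: concatenate the hourly summaries for that day.
        return " ".join(
            self.hour_summary(d, h) for (d, h) in sorted(self.clips) if d == day
        )

    def retrieve(self, query: str, top_k: int = 3) -> List[Entry]:
        """Coarse-to-fine search: best day, then best hour, then top clips."""
        if not self.clips:
            return []
        days = sorted({d for d, _ in self.clips})
        best_day = max(days, key=lambda d: overlap(query, self.day_summary(d)))
        hours = sorted(h for d, h in self.clips if d == best_day)
        best_hour = max(
            hours, key=lambda h: overlap(query, self.hour_summary(best_day, h))
        )
        ranked = sorted(
            self.clips[(best_day, best_hour)],
            key=lambda e: overlap(query, e.text),
            reverse=True,
        )
        return ranked[:top_k]
```

Each returned `Entry` carries its day and hour, so an answer can cite when the event happened as well as what happened.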

## 📂 Code Structure
```bash
EgoLife/
│── assets/ # General assets used across the project
│── EgoGPT/ # Core module for egocentric omni-modal model
│── EgoRAG/ # Retrieval-augmented generation (RAG) module
│── README.md # Main documentation for the overall project
```
See the [EgoGPT](./EgoGPT/) and [EgoRAG](./EgoRAG/) subprojects for more details.

## 📢 Citation

If you use EgoLife in your research, please cite our work:

```bibtex
@misc{yang2025egolifeegocentriclifeassistant,
title={EgoLife: Towards Egocentric Life Assistant},
author={Jingkang Yang and Shuai Liu and Hongming Guo and Yuhao Dong and Xiamengwei Zhang and Sicheng Zhang and Pengyun Wang and Zitang Zhou and Binzhu Xie and Ziyue Wang and Bei Ouyang and Zhengyu Lin and Marco Cominelli and Zhongang Cai and Yuanhan Zhang and Peiyuan Zhang and Fangzhou Hong and Joerg Widmer and Francesco Gringoli and Lei Yang and Bo Li and Ziwei Liu},
year={2025},
eprint={2503.03803},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.03803},
}
```

## 📝 License
This project is licensed under the S-Lab license. See the [LICENSE](LICENSE) file for details.

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=EvolvingLMMs-Lab/EgoLife&type=Date)](https://star-history.com/#EvolvingLMMs-Lab/EgoLife&Date)