# Event Penguins (CVPR 2024)
This is the official repository for **Low-power, Continuous Remote Behavioral Localization with Event Cameras**, accepted at **CVPR 2024**, by [Friedhelm Hamann](https://friedhelmhamann.github.io/), [Suman Ghosh](https://scholar.google.com/citations?user=4QgBeWAAAAAJ&hl=de&oi=sra), [Ignacio Juarez Martinez](https://www.biology.ox.ac.uk/people/ignacio-nacho-juarez-martinez), [Tom Hart](https://scholar.google.co.uk/citations?user=HxUEZy0AAAAJ&hl=en), [Alex Kacelnik](https://users.ox.ac.uk/~kgroup/people/alexkacelnik.shtml), [Guillermo Gallego](https://sites.google.com/view/guillermogallego).
[Project Page](https://tub-rip.github.io/eventpenguins/) | [Paper](https://arxiv.org/pdf/2312.03799.pdf) | [Video](https://www.youtube.com/watch?v=o79wbZh0gU4&feature=youtu.be) | [Data](https://drive.google.com/drive/folders/1VoKEg6CSITmPH27R19tGzyzbUIrmhRDV?usp=drive_link)
## Table of Contents
1. [Citation](#citation)
2. [Quickstart](#quickstart)
- [Setup](#setup)
- [Preprocessing the data](#preprocessing-the-data)
- [Inference](#inference)
3. [Details](#details)
- [Original Data](#original-data)
- [Pre-processed Data](#pre-processed-data)
- [Annotations](#annotations)
4. [Acknowledgements](#acknowledgements)
5. [In the Press](#in-the-press)
6. [Additional Resources](#additional-resources)
## Citation
If you use this work in your research, please consider citing:
```bibtex
@InProceedings{Hamann24cvpr,
  author    = {Hamann, Friedhelm and Ghosh, Suman and Martinez, Ignacio Juarez and Hart, Tom and Kacelnik, Alex and Gallego, Guillermo},
  title     = {Low-power Continuous Remote Behavioral Localization with Event Cameras},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {18612-18621}
}
```
## Quickstart
### Setup
You can use [Miniconda](https://docs.conda.io/en/latest/miniconda.html) to set up an environment:
```bash
conda create --name eventpenguins python=3.8
conda activate eventpenguins
```
Install PyTorch by choosing a command that matches your CUDA version. You can find the compatible commands on the [PyTorch official website](https://pytorch.org/get-started/locally/) (tested with PyTorch 2.2.2), e.g.:
```bash
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
```
Install other required packages:
```bash
pip install -r requirements.txt
```
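To check that PyTorch was installed correctly (and whether it can see your GPU), you can optionally run a quick sanity check from within the activated environment:
```python
# Optional sanity check for the environment.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```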
### Preprocessing the data
1. Create a folder for the data:
```bash
cd <repository_root>  # replace with the path to this repository
mkdir data
```
2. [Download the data](https://drive.google.com/drive/folders/1VoKEg6CSITmPH27R19tGzyzbUIrmhRDV?usp=drive_link) and save it in the `data` folder you just created.
3. Create the pre-processed dataset with the following command:
```bash
python scripts/preprocess.py --data_root data/EventPenguins --output_dir data --recording_info_path config/annotations/recording_info.csv
```
This crops the events according to the pre-annotated nests and stores the recordings according to the split specified in [the paper](https://arxiv.org/pdf/2312.03799.pdf).
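Conceptually, the cropping step keeps only the events whose pixel coordinates fall inside a nest's ROI and shifts them to ROI-local coordinates. The following is a minimal illustrative sketch of that idea, not the actual logic of `scripts/preprocess.py`; the `[x, y, t, p]` row layout follows the pre-processed data format described below, and the ROI box parameters are hypothetical:
```python
import numpy as np

def crop_events_to_roi(events: np.ndarray, x0: int, y0: int, width: int, height: int) -> np.ndarray:
    """Keep events inside a (hypothetical) ROI box and shift them to ROI-local coordinates.

    `events` is an (N, 4) array with rows [x, y, t, p].
    """
    x, y = events[:, 0], events[:, 1]
    mask = (x >= x0) & (x < x0 + width) & (y >= y0) & (y < y0 + height)
    cropped = events[mask].copy()
    cropped[:, 0] -= x0  # shift x into ROI-local coordinates
    cropped[:, 1] -= y0  # shift y into ROI-local coordinates
    return cropped
```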
### Inference
1. Create a folder for models:
```bash
mkdir models
```
2. Download the pre-trained model weights from [here](https://drive.google.com/drive/folders/1A4JB97u4879oQ88FsgXa_LAU42ffpW1K?usp=sharing) and save them in the `models` folder.
3. Run inference with the following command:
```bash
python scripts/inference.py --config config/exp/inference.yaml --verbose
```
## Details
### Original Data
The [EventPenguins dataset](https://drive.google.com/drive/folders/1VoKEg6CSITmPH27R19tGzyzbUIrmhRDV?usp=drive_link) contains 24 ten-minute recordings, with 16 annotated nests.
An overview of the data can be found in `config/annotations/recording_info.csv`.
Each recording has a `roi_group_id` that links to the locations of the 16 pre-annotated regions of interest in `config/annotations/rois` (a new set of ROIs was annotated whenever the camera was moved).
The dataset is structured as follows:
```
EventPenguins/
├── <date>_<time>/ # (these folders are referred to as "recordings", e.g. 22-01-12_17-26-00)
│ ├── frames/
│ │ ├── 000000000000.png
│ │ ├── 000000000001.png
│ │ └── ...
│ ├── events.h5
│ ├── frame_timestamps.txt # [us]
│ └── metadata.yaml
└── ...
```
Please note that we do not use the grayscale frames in our method but provide them for completeness.
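For a first look at the raw data, the layout above can be traversed with standard tools. A minimal sketch, assuming the `data/EventPenguins` location from the preprocessing step and that `frame_timestamps.txt` contains one timestamp per line (the internal layout of `events.h5` is not covered here):
```python
from pathlib import Path

import numpy as np

data_root = Path("data/EventPenguins")  # assumed location from the preprocessing step

for recording in sorted(p for p in data_root.iterdir() if p.is_dir()):
    # Frame timestamps are given in microseconds.
    timestamps_us = np.loadtxt(recording / "frame_timestamps.txt")
    n_frames = len(list((recording / "frames").glob("*.png")))
    duration_s = (timestamps_us[-1] - timestamps_us[0]) / 1e6
    print(f"{recording.name}: {n_frames} frames spanning {duration_s:.1f} s")
```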
### Pre-processed Data
#### Structure
The processed data is stored in a single HDF5 file named `preprocessed.h5`. The file structure is organized as follows:
- Each ten-minute recording is stored in a group labeled by its timestamp (e.g., `22-01-12_17-26-00`).
- Each group (timestamp) contains multiple subgroups, each corresponding to a specific ROI (nest) identified by an ID (e.g., `N01`).
- Each ROI subgroup contains:
- An `events` dataset, where each event is represented as a row `[x, y, t, p]` indicating the event's x-position, y-position, timestamp (us), and polarity, respectively.
- Attributes `height` and `width` indicating the dimensions of the ROI.
#### Attributes
Each subgroup (ROI) has the following attributes:
- `height`: The height of the ROI in pixels.
- `width`: The width of the ROI in pixels.
Each main group (recording timestamp) has the following attribute:
- `split`: Indicates the data split (e.g., `train`, `test`, `validate`) that the recording belongs to.
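Based on this structure, `preprocessed.h5` can be inspected with `h5py`, for example (a minimal sketch; the file path below assumes the `--output_dir data` used in the preprocessing step):
```python
import h5py

with h5py.File("data/preprocessed.h5", "r") as f:
    for recording_id, recording in f.items():        # e.g. "22-01-12_17-26-00"
        split = recording.attrs["split"]              # "train", "test", or "validate"
        for roi_id, roi in recording.items():         # e.g. "N01"
            events = roi["events"][()]                # (N, 4) rows of [x, y, t, p]
            h, w = roi.attrs["height"], roi.attrs["width"]
            print(f"{recording_id}/{roi_id} [{split}]: {len(events)} events, ROI {w}x{h}")
```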
### Annotations
The annotations are in `config/annotations/annotations.json`.
The structure is very similar to [ActivityNet](https://github.com/activitynet/ActivityNet), with an additional layer to consider different nests.
```json
{
    "version": "VERSION 0.0",
    "database": {
        "<date>_<time>": {
            "annotations": {
                "<roi_id>": [
                    {
                        "label": <label>,
                        "segment": [
                            <t_start>,
                            <t_end>
                        ]
                    },
                    ...
                ]
            }
        }
    }
}
```
* `<date>_<time>` is the identifier for a ten-minute recording (e.g., `22-01-12_17-26-00`)
* `roi_id` is an integer number encoding the nest
* `t_start` and `t_end` are the start and end times of an action in seconds
* The `label` is one of `["ed", "adult_flap", "chick_flap"]`.
`"adult_flap"` and `"chick_flap"` are other types of wing flapping easily confused with the ecstatic display (`ed`).
We provide these labels for completeness, but they are not considered in our method.
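Given this structure, the annotation file can be parsed with a few lines of Python (a minimal sketch based only on the layout shown above):
```python
import json

with open("config/annotations/annotations.json") as f:
    annotations = json.load(f)

for recording_id, recording in annotations["database"].items():
    for roi_id, instances in recording["annotations"].items():
        for instance in instances:
            label = instance["label"]              # "ed", "adult_flap", or "chick_flap"
            t_start, t_end = instance["segment"]   # action start/end in seconds
            print(f"{recording_id} nest {roi_id}: {label} from {t_start:.1f} s to {t_end:.1f} s")
```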
## Acknowledgements
The evaluation for activity detection is largely inspired by [ActivityNet](https://github.com/activitynet/ActivityNet). We thank the authors for their excellent work.
## In the Press
* [TU Berlin Press Release](https://www.tu.berlin/ueber-die-tu-berlin/profil/pressemitteilungen-nachrichten/pinguine-in-ekstase)
* [Tagesspiegel Article (German)](https://www.tagesspiegel.de/wissen/tierbeobachtung-in-der-antarktis-das-ratsel-der-ekstatischen-pinguine-11967704.html)
## Additional Resources
* [Recording Software (CoCapture)](https://github.com/tub-rip/CoCapture)
* [Stereo Co-capture System for Recording and Tracking Fish with Frame- and Event Cameras](https://arxiv.org/pdf/2207.07332.pdf)
* [Homepage (TU Berlin, RIP lab)](https://sites.google.com/view/guillermogallego/research/event-based-vision)
* [Homepage (Science Of Intelligence)](https://www.scienceofintelligence.de/)
* [Class at TU Berlin](https://sites.google.com/view/guillermogallego/teaching/event-based-robot-vision)
* [Survey paper](http://rpg.ifi.uzh.ch/docs/EventVisionSurvey.pdf)
* [List of Resources](https://github.com/uzh-rpg/event-based_vision_resources)