# LTContext

LTContext is an approach for temporal action segmentation. It leverages
sparse attention to capture the long-term context of a video
and windowed attention to model the local information in neighboring frames.

Here is an overview of the architecture:

*(Figure: overview of the LTContext architecture)*

The attention mechanism consists of Windowed Attention and Long-term Context Attention (a minimal sketch of both patterns follows the figure):

*(Figure: Windowed Attention and Long-term Context Attention)*
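To make the two attention patterns concrete, here is a minimal PyTorch sketch. It is an illustration, not the authors' implementation; the window size, tensor shapes, and the exact strided long-term pattern are assumptions:

```python
# Minimal sketch of the two attention patterns on (batch, frames, dim)
# features; padding frames take part in the softmax for simplicity.
import torch
import torch.nn.functional as F


def windowed_attention(x: torch.Tensor, w: int) -> torch.Tensor:
    """Self-attention restricted to non-overlapping windows of w frames."""
    b, t, d = x.shape
    pad = (w - t % w) % w                     # pad so t splits into full windows
    x = F.pad(x, (0, 0, 0, pad))
    xw = x.view(b, -1, w, d)                  # (b, num_windows, w, d)
    attn = torch.softmax(xw @ xw.transpose(-2, -1) / d ** 0.5, dim=-1)
    return (attn @ xw).view(b, -1, d)[:, :t]  # merge windows, drop padding


def long_term_context_attention(x: torch.Tensor, w: int) -> torch.Tensor:
    """Sparse long-range attention: position i of each window attends to
    position i of every other window, i.e. a strided view over the video."""
    b, t, d = x.shape
    pad = (w - t % w) % w
    x = F.pad(x, (0, 0, 0, pad))
    xg = x.view(b, -1, w, d).transpose(1, 2)  # (b, w, num_windows, d)
    attn = torch.softmax(xg @ xg.transpose(-2, -1) / d ** 0.5, dim=-1)
    return (attn @ xg).transpose(1, 2).reshape(b, -1, d)[:, :t]


feats = torch.randn(2, 100, 64)               # 100 frames of 64-dim features
print(windowed_attention(feats, w=16).shape)           # torch.Size([2, 100, 64])
print(long_term_context_attention(feats, w=16).shape)  # torch.Size([2, 100, 64])
```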

## Citation
If you use this code or our model, please cite our [paper](https://arxiv.org/abs/2308.11358):
```bibtex
@inproceedings{ltc2023bahrami,
  author    = {Emad Bahrami and Gianpiero Francesca and Juergen Gall},
  title     = {How Much Temporal Long-Term Context is Needed for Action Segmentation?},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year      = {2023}
}
```

## Installation

To create the [conda](https://docs.conda.io/en/latest/) environment run the following command:
```bash
conda env create --name ltc --file environment.yml
source activate ltc
```

## Dataset Preparation

### Breakfast
The features and annotations of the Breakfast dataset can be downloaded from
[link 1](https://mega.nz/#!O6wXlSTS!wcEoDT4Ctq5HRq_hV-aWeVF1_JB3cacQBQqOLjCIbc8)
or
[link 2](https://zenodo.org/record/3625992#.Xiv9jGhKhPY).

### Assembly101
Follow the instructions at [Assembly101-Download-Scripts](https://github.com/assembly-101/assembly101-download-scripts) to download the TSM features.
The annotations for action segmentation can be downloaded from [assembly101-annotations](https://drive.google.com/drive/folders/1QoT-hIiKUrSHMxYBKHvWpW9Z9aCznJB7).
After downloading the annotations, put the `coarse-annotations` folder inside the `data/assembly101` folder.
We noticed that loading from `numpy` can be faster; you can convert the `.lmdb` features to `numpy` arrays and use the `LTContext_Numpy.yaml` config, e.g. with a conversion script along the lines of the sketch below.
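A hypothetical conversion sketch (not a script from this repo): it assumes one LMDB entry per video whose value is a raw `float32` array, and the paths in the last line are placeholders, so adapt the parsing to the actual layout of the feature database.

```python
# Hypothetical LMDB -> numpy conversion; the key naming, value encoding, and
# paths are assumptions about the TSM feature database, not verified.
import os

import lmdb
import numpy as np


def lmdb_to_numpy(lmdb_path: str, out_dir: str) -> None:
    os.makedirs(out_dir, exist_ok=True)
    env = lmdb.open(lmdb_path, readonly=True, lock=False, readahead=False)
    with env.begin(write=False) as txn:
        for key, value in txn.cursor():
            # Assumed layout: key = video id, value = raw float32 features.
            feats = np.frombuffer(value, dtype=np.float32)
            np.save(os.path.join(out_dir, key.decode() + ".npy"), feats)
    env.close()


lmdb_to_numpy("data/assembly101/TSM_features", "data/assembly101/numpy_features")
```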

## Training and Evaluating the model

Here is an example command for training the model:
```bash
python run_net.py \
--cfg configs/Breakfast/LTContext.yaml \
DATA.PATH_TO_DATA_DIR [path_to_your_dataset] \
OUTPUT_DIR [path_to_logging_dir]
```
For more options, look at `ltc/config/defaults.py`.
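The trailing `KEY VALUE` pairs override entries from the YAML config. As an illustration of how such overrides typically work in SlowFast-style codebases (whether LTContext uses yacs exactly this way is an assumption, so treat the setup here as hypothetical):

```python
# Sketch of SlowFast-style config overrides with yacs; the keys shown are
# taken from the commands in this README, but the CfgNode setup is illustrative.
from yacs.config import CfgNode as CN

cfg = CN()
cfg.OUTPUT_DIR = "./logs"
cfg.DATA = CN()
cfg.DATA.PATH_TO_DATA_DIR = ""

# Equivalent of passing: DATA.PATH_TO_DATA_DIR /data OUTPUT_DIR /logs
cfg.merge_from_list(["DATA.PATH_TO_DATA_DIR", "/data", "OUTPUT_DIR", "/logs"])
print(cfg.DATA.PATH_TO_DATA_DIR)  # /data
```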

For Assembly101, the value of `DATA.PATH_TO_DATA_DIR` should be the path to the folder containing the TSM features.

If you want to evaluate a pretrained model, use the following command:
```bash
python run_net.py \
--cfg configs/Breakfast/LTContext.yaml \
DATA.PATH_TO_DATA_DIR [path_to_your_dataset] \
TRAIN.ENABLE False \
TEST.ENABLE True \
TEST.DATASET 'breakfast' \
TEST.CHECKPOINT_PATH [path_to_trained_model] \
TEST.SAVE_RESULT_PATH [path_to_save_result]
```
Check the [model card](MODEL_CARD.md) to download the pretrained models.

## Acknowledgement
The structure of the code is inspired by [SlowFast](https://github.com/facebookresearch/SlowFast).
The `MultiHeadAttention` implementation is based on [xFormers](https://github.com/facebookresearch/xformers).
We thank the authors of these codebases.

## License

This work is licensed under a [Creative Commons Attribution-NonCommercial 4.0 International License](https://creativecommons.org/licenses/by-nc/4.0/).