{"id":13665185,"url":"https://github.com/csiro-robotics/TCE","last_synced_at":"2025-04-26T08:31:43.447Z","repository":{"id":110049147,"uuid":"258044229","full_name":"csiro-robotics/TCE","owner":"csiro-robotics","description":"This repository contains the code implementation used in the paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE). ","archived":false,"fork":false,"pushed_at":"2021-03-16T04:20:23.000Z","size":24899,"stargazers_count":52,"open_issues_count":0,"forks_count":2,"subscribers_count":9,"default_branch":"master","last_synced_at":"2024-08-03T06:01:48.841Z","etag":null,"topics":["action-recognition","computer-vision","contrastive-learning","contrastive-loss","deep-learning","embeddings","hmdb51","kinetics-datasets","metric-learning","pytorch","representation-learning","self-supervised-learning","tsne-visualisations","ucf-101"],"latest_commit_sha":null,"homepage":"https://csiro-robotics.github.io/TCE-Webpage/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/csiro-robotics.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2020-04-22T23:31:47.000Z","updated_at":"2024-01-04T16:45:06.000Z","dependencies_parsed_at":"2023-04-13T23:03:03.910Z","dependency_job_id":null,"html_url":"https://github.com/csiro-robotics/TCE","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/csiro-robotics%2FTCE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/csiro-robotics%2FTCE/tags","
releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/csiro-robotics%2FTCE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/csiro-robotics%2FTCE/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/csiro-robotics","download_url":"https://codeload.github.com/csiro-robotics/TCE/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224031926,"owners_count":17244361,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["action-recognition","computer-vision","contrastive-learning","contrastive-loss","deep-learning","embeddings","hmdb51","kinetics-datasets","metric-learning","pytorch","representation-learning","self-supervised-learning","tsne-visualisations","ucf-101"],"created_at":"2024-08-02T06:00:25.941Z","updated_at":"2024-11-11T00:30:33.333Z","avatar_url":"https://github.com/csiro-robotics.png","language":"Python","funding_links":[],"categories":["2020"],"sub_categories":["Arxiv (with code or interesting)"],"readme":"# Temporally Coherent Embeddings for Self-Supervised Video Representation Learning\nThis repository contains the code implementation used in the ICPR2020 paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE). 
\\[[arXiv](https://arxiv.org/abs/2004.02753)] \\[[Website](https://csiro-robotics.github.io/TCE-Webpage/)]  Our contributions in this repository are:\n- A PyTorch implementation of the self-supervised training used in the TCE paper\n- A PyTorch implementation of action recognition fine-tuning\n- Pre-trained checkpoints for models trained using the TCE self-supervised training paradigm\n- A PyTorch implementation of t-SNE visualisations of the network output\n\n![Network Architecture](images/TCE.png)\n\nWe benchmark our code on Split 1 of the UCF101 action recognition dataset, providing pre-trained models for our downstream and upstream training.  See [Models](#models) for our provided models and [Getting Started](#getting-started) for instructions on training and evaluation.\n\nIf you find this repo useful for your research, please consider citing the paper:\n ```\n@inproceedings{knights2020tce,\n  title={Temporally Coherent Embeddings for Self-Supervised Video Representation Learning},\n  author={Joshua Knights and Ben Harwood and Daniel Ward and Anthony Vanderkop and Olivia Mackenzie-Ross and Peyman Moghadam},\n  booktitle={25th International Conference on Pattern Recognition (ICPR)},\n  year={2020}\n}\n\n ```\n\n\n## Updates\n- 23/04/2020 : Initial Commit\n- 30/11/2020 : ICPR Update\n\n## Table of Contents\n\n- [Data Preparation](#data-preparation)\n- [Installation](#installation)\n- [Models](#models)\n- [Getting Started](#getting-started)\n- [Acknowledgements](#acknowledgements)\n\n\n## Data Preparation \n\u003ca name=\"data-preparation\"\u003e\u003c/a\u003e\n\n### Kinetics400\nKinetics400 videos can be downloaded and split into frames directly from [Showmax/kinetics-downloader](https://github.com/Showmax/kinetics-downloader)\n\nThe file directory should have the following layout:\n```\n├── kinetics400/train\n    |\n    ├── CLASS_001\n    ├── CLASS_002\n    .\n    .\n    .\n    CLASS_400\n        | \n        ├── VID_001\n        ├── VID_002\n        .\n        
.\n        .\n        ├── VID_###\n            | \n            ├── frame1.jpg\n            ├── frame2.jpg\n            .\n            .\n            .\n            ├── frame###.jpg\n```\nOnce the dataset is downloaded and split into frames, edit the following parameter in config/default.py to point to the frames:\n- DATASET.KINETICS400.FRAMES_PATH = /path/to/kinetics400/train\n\n### UCF101\n\nUCF101 frames and splits can be downloaded directly from [feichtenhofer/twostreamfusion](https://github.com/feichtenhofer/twostreamfusion)\n\n```\nwget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.001\nwget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.002\nwget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.003\n\ncat ucf101_jpegs_256.zip.00* \u003e ucf101_jpegs_256.zip\nunzip ucf101_jpegs_256.zip\n```\nThe file directory should have the following layout:\n\n```\n├── UCF101\n    |\n    ├── v_{_CLASS_001}_g01_c01\n    .   | \n    .   ├── frame000001.jpg\n    .   ├── frame000002.jpg \n    .   .\n    .   .\n    .   ├── frame000###.jpg\n    .\n    ├── v_{_CLASS_101}_g##_c##\n        | \n        ├── frame000001.jpg\n        ├── frame000002.jpg \n        .\n        .\n        ├── frame000###.jpg\n```\n\nOnce the dataset is downloaded and decompressed, edit the following parameters in config/default.py to point to the frames and splits:\n- DATASET.UCF101.FRAMES_PATH = /path/to/UCF101_frames\n- DATASET.UCF101.SPLITS_PATH = /path/to/UCF101_splits\n\n\n\n\n\n## Installation\n\u003ca name=\"installation\"\u003e\u003c/a\u003e\n\nTCE is built using Python == 3.7.1 and PyTorch == 1.7.0\n\nWe use Conda to set up the Python environment for this repository.  
To create the environment, run the following commands from the root directory:\n\n```\nconda env create -f TCE.yaml\nconda activate TCE\n```\n\nOnce this is done, specify in config/default.py a path in which to save assets (such as dataset pickles for faster setup):\n- ASSETS_PATH = /path/to/assets/folder\n\n\n\n## Models\n\u003ca name=\"models\"\u003e\u003c/a\u003e\n\n| Architecture \t| Pre-Training Dataset \t| Link                                                           \t|\n|--------------\t|----------------------\t|----------------------------------------------------------------\t|\n| ResNet-18    \t| Kinetics400          \t| [Link](https://cloudstor.aarnet.edu.au/plus/s/kNQKw5ATTbyamg2) \t|\n| ResNet-50    \t| Kinetics400          \t| [Link](https://cloudstor.aarnet.edu.au/plus/s/HbWxmhcUbfzQIQf) \t|\n\n## Getting Started\n\n### Self-Supervised Training\nWe provide a script for pre-training with the Kinetics400 dataset using TCE, pretrain.py.  To train, run the following command:\n\n```\npython pretrain.py \\\n    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \\\n    TRAIN.PRETRAINING.SAVEDIR /path/to/savedir\n```\n\nIf resuming from a previous pre-training checkpoint, set the flag `TRAIN.PRETRAINING.CHECKPOINT` to the path of the checkpoint to resume from.\n\n### Fine-tuning for action recognition\nWe provide a fine-tuning script for action recognition on the UCF-101 dataset, finetune.py.  
To train, run the following command:\n\n```\npython finetune.py \\\n    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \\\n    TRAIN.FINETUNING.CHECKPOINT \"/path/to/pretrained_checkpoint\" \\\n    TRAIN.FINETUNING.SAVEDIR \"/path/to/savedir\"\n```\n\nIf resuming training from an earlier fine-tuning checkpoint, set the flag `TRAIN.FINETUNING.RESUME` to `True`.\n\n\n\n\n### Visualisation\n\n![vid](images/bowling_tsne_example.gif)\n\nTo demonstrate the ability of our approach to create temporally coherent embeddings, we provide a package to create t-SNE visualisations of our features similar to those found in the paper.  This package can also be applied to other approaches and network architectures.\n\nThe files in this repository used for generating t-SNE visualisations are:\n- `visualise_tsne.py`: a wrapper around t-SNE and our network architecture for end-to-end generation of the t-SNE visualisation\n- `utils/tsne_utils.py`: contains t-SNE functionality for reducing the dimensionality of an array of embedded features for plotting, as well as tools to create an animated visualisation of the embedding's behaviour over time\n\nThe following flags can be used as inputs for `visualise_tsne.py`:\n- `--cfg` : Path to config file\n- `--target` : Path to the video to visualise.  
This can be either a video file (avi, mp4) or a directory of images representing frames\n- `--ckpt` : Path to the model checkpoint whose embedding space is to be visualised\n- `--gif` : Use to visualise the change in the embedding space over time alongside the input video as a gif file\n- `--fps` : Set the framerate of the gif\n- `--save` : Path to save the output t-SNE to\n\nTo visualise the embeddings from TCE, download one of our self-supervised models above and use the following command to visualise our embedding space as a gif:\n\n```\npython visualise_tsne.py \\\n    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \\\n    --target \"/path/to/target/video\" \\\n    --ckpt \"/path/to/TCE_checkpoint\" \\\n    --gif \\\n    --fps 25 \\\n    --save \"/path/to/save/folder/t-SNE.gif\"\n```\n\nAlternatively, to visualise the t-SNE as a PNG image, use the following:\n\n```\npython visualise_tsne.py \\\n    --cfg config/pretrain_kinetics400miningr_finetune_UCF101_resnet18.yaml \\\n    --target \"/path/to/target/video\" \\\n    --ckpt \"/path/to/TCE_checkpoint\" \\\n    --save \"/path/to/save/folder/t-SNE.png\"\n```\n\n\u003ca name=\"acknowledgements\"\u003e\u003c/a\u003e\n\n\n\n## Acknowledgements\nParts of this code base are derived from Yonglong Tian's unsupervised learning algorithm [Contrastive Multiview Coding](https://github.com/HobbitLong/CMC) and Jeffrey Huang's implementation of [action recognition](https://github.com/jeffreyyihuang/two-stream-action-recognition).\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcsiro-robotics%2FTCE","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcsiro-robotics%2FTCE","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcsiro-robotics%2FTCE/lists"}