https://github.com/amazon-science/crossmodal-contrastive-learning

CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021



Host: GitHub
URL: https://github.com/amazon-science/crossmodal-contrastive-learning
Owner: amazon-science
License: apache-2.0
Created: 2021-10-12T17:46:32.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2022-02-07T06:54:45.000Z (over 3 years ago)
Last Synced: 2023-03-11T11:52:16.091Z (about 2 years ago)
Topics: computer-vision, contrastive-learning, multi-modality, natural-language-processing, transformers, video, video-captioning, video-text-retrieval
Language: Python
Homepage:
Size: 766 KB
Stars: 41
Watchers: 3
Forks: 9
Open Issues: 4
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md


# CrossCLR - ICCV 2021



  



This is the official implementation of the paper:

### CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations [[Paper]](https://arxiv.org/abs/2109.14910)

Authors:
[Mohammadreza Zolfaghari](https://mzolfaghari.github.io/),
[Yi Zhu](https://bryanyzhu.github.io/),
[Peter Gehler](http://gehler.io/),
[Thomas Brox](https://lmb.informatik.uni-freiburg.de/people/brox/index.html)

## Update

##### [Dec 2021] CrossCLR-onlyIntraModality released

## Loss Function

The loss function [`CrossCLR`](https://github.com/amazon-research/crossmodal-contrastive-learning) in `loss.py` takes `video features` and `text features` as input and returns the loss.

Usage:

```python
from trainer.loss import CrossCLR_onlyIntraModality

# define loss with a temperature `temp` and weights for negative samples `w`
criterion = CrossCLR_onlyIntraModality(temperature=temp, negative_weight=w)

# features: [bsz, f_dim]
video_features = ...
text_features = ...

# CrossCLR
loss = criterion(video_features, text_features)
...
```
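For readers who want to see the input/output contract in isolation, here is a minimal NumPy sketch of a generic symmetric InfoNCE loss over paired `[bsz, f_dim]` features. It is not the CrossCLR loss from this repository: it is a plain contrastive baseline, and the function names, the default temperature of `0.07`, and the random toy data are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_info_nce(video_features, text_features, temperature=0.07):
    """Generic symmetric InfoNCE over a batch of paired features.

    Both inputs have shape [bsz, f_dim]; row i of each forms a positive pair.
    NOTE: an illustrative baseline, not the CrossCLR loss of this repository.
    """
    v = l2_normalize(np.asarray(video_features, dtype=np.float64))
    t = l2_normalize(np.asarray(text_features, dtype=np.float64))
    logits = v @ t.T / temperature            # [bsz, bsz] scaled cosine similarities
    idx = np.arange(logits.shape[0])          # positives lie on the diagonal
    loss_v2t = -log_softmax(logits)[idx, idx].mean()    # video -> text direction
    loss_t2v = -log_softmax(logits.T)[idx, idx].mean()  # text -> video direction
    return 0.5 * (loss_v2t + loss_t2v)


# Toy data (hypothetical): aligned text features are noisy copies of the video features.
rng = np.random.default_rng(0)
video = rng.normal(size=(8, 16))
text_aligned = video + 0.01 * rng.normal(size=(8, 16))
text_random = rng.normal(size=(8, 16))

loss_aligned = symmetric_info_nce(video, text_aligned)
loss_random = symmetric_info_nce(video, text_random)
print(f"aligned pairs: {loss_aligned:.4f}, random pairs: {loss_random:.4f}")
```

The loss is a single scalar for the whole batch, and well-aligned pairs should score lower than unrelated ones, which is the same calling pattern as `criterion(video_features, text_features)` above.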

## Qualitative samples



  



## Reference

```
@article{crossclr_aws_21,
  author    = {Mohammadreza Zolfaghari and
               Yi Zhu and
               Peter V. Gehler and
               Thomas Brox},
  title     = {CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations},
  url       = {https://arxiv.org/abs/2109.14910},
  eprinttype = {arXiv},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2021},
}
```

## Security

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

## License

This project is licensed under the Apache-2.0 License.
