https://github.com/kyegomez/mc-vit
Implementation of MC-ViT from the paper "Memory Consolidation Enables Long-Context Video Understanding"
- Host: GitHub
- URL: https://github.com/kyegomez/mc-vit
- Owner: kyegomez
- License: MIT
- Created: 2024-02-09T04:10:38.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-04-04T12:57:01.000Z (about 1 year ago)
- Last Synced: 2025-04-17T00:41:35.264Z (about 1 year ago)
- Topics: ai, multi-modal, multi-modal-transformers, multi-modality, open-source, transformer, transformers, vit
- Language: Python
- Homepage: https://discord.gg/GYbXvDGevY
- Size: 2.17 MB
- Stars: 21
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
[Join the Discord](https://discord.gg/qUtxnK2NMf)
# MC-ViT
Implementation of MC-ViT from the paper "Memory Consolidation Enables Long-Context Video Understanding"
## Install
`$ pip install mcvit`
## Usage
```python
import torch
from mcvit.model import MCViT

# Initialize the MC-ViT model
mcvit = MCViT(
    dim=512,
    attn_seq_len=256,
    dim_head=64,
    dropout=0.1,
    chunks=16,
    depth=12,
    cross_attn_heads=8,
)

# Create a random tensor representing a video:
# (batch, channels, frames, height, width)
x = torch.randn(1, 3, 256, 256, 256)

# Pass the video through the model and print the output shape
output = mcvit(x)
print(output.shape)
```
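The paper's core idea is that activations from already-processed video chunks are consolidated into a small set of memory tokens that later chunks cross-attend to, keeping compute bounded as context grows. Below is a minimal sketch of one such consolidation step using k-means over past activations; the function name and signature are hypothetical illustrations, not the repo's actual API.

```python
import torch


def consolidate_memory(activations: torch.Tensor, num_mem: int, iters: int = 10) -> torch.Tensor:
    """Compress past chunk activations (n, d) into `num_mem` memory tokens
    via a few rounds of k-means. A sketch of MC-ViT-style consolidation;
    the library's real implementation may differ."""
    n, _ = activations.shape
    # Initialize centroids from randomly selected activations
    idx = torch.randperm(n)[:num_mem]
    centroids = activations[idx].clone()
    for _ in range(iters):
        # Assign each activation to its nearest centroid
        dists = torch.cdist(activations, centroids)  # (n, num_mem)
        assign = dists.argmin(dim=1)
        # Move each centroid to the mean of its assigned activations
        for k in range(num_mem):
            mask = assign == k
            if mask.any():
                centroids[k] = activations[mask].mean(dim=0)
    return centroids


# Activations from earlier chunks (e.g. 1024 tokens of width 512)
past = torch.randn(1024, 512)
memory = consolidate_memory(past, num_mem=128)
print(memory.shape)  # torch.Size([128, 512])
```

Subsequent chunks would then cross-attend to these 128 memory tokens instead of all 1024 past activations, which is what keeps attention cost roughly constant per chunk.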
# License
MIT