Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with attention-mechanisms

A curated list of projects in awesome lists tagged with attention-mechanisms.

https://github.com/lucidrains/palm-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

artificial-intelligence attention-mechanisms deep-learning human-feedback reinforcement-learning transformers

Last synced: 17 Dec 2024

https://github.com/lucidrains/musiclm-pytorch

Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch

artificial-intelligence attention-mechanisms deep-learning music-synthesis transformers

Last synced: 19 Dec 2024

https://github.com/lucidrains/audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

artificial-intelligence attention-mechanisms audio-synthesis deep-learning transformers

Last synced: 17 Dec 2024

https://github.com/lucidrains/toolformer-pytorch

Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI

api-calling artificial-intelligence attention-mechanisms deep-learning transformers

Last synced: 17 Dec 2024

https://github.com/lucidrains/make-a-video-pytorch

Implementation of Make-A-Video, new SOTA text to video generator from Meta AI, in Pytorch

artificial-intelligence attention-mechanisms axial-convolutions deep-learning text-to-video

Last synced: 20 Dec 2024

https://github.com/lucidrains/muse-maskgit-pytorch

Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch

artificial-intelligence attention-mechanisms deep-learning text-to-image transformers

Last synced: 21 Dec 2024

https://github.com/lucidrains/phenaki-pytorch

Implementation of Phenaki Video, which uses Mask GIT to produce text guided videos of up to 2 minutes in length, in Pytorch

artificial-intelligence attention-mechanisms deep-learning imagination-machine text-to-video transformers

Last synced: 19 Dec 2024

https://github.com/lucidrains/meshgpt-pytorch

Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch

artificial-intelligence attention-mechanisms deep-learning mesh-generation transformers

Last synced: 19 Dec 2024

https://github.com/kyegomez/longnet

Implementation of plug-and-play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

artificial-intelligence attention attention-is-all-you-need attention-mechanisms chatgpt context-length gpt3 gpt4 machine-learning transformer

Last synced: 22 Dec 2024

https://github.com/lucidrains/megabyte-pytorch

Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch

artificial-intelligence attention-mechanisms deep-learning learned-tokenization long-context transformers

Last synced: 19 Dec 2024

https://github.com/lucidrains/itransformer

Unofficial implementation of iTransformer - SOTA Time Series Forecasting using Attention networks, out of Tsinghua / Ant group

artificial-intelligence attention-mechanisms deep-learning time-series-forecasting transformers

Last synced: 16 Dec 2024

https://github.com/lucidrains/bs-roformer

Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs

artificial-intelligence attention-mechanisms deep-learning music-source-separation transformers

Last synced: 20 Dec 2024

https://github.com/landskape-ai/triplet-attention

Official PyTorch Implementation for "Rotate to Attend: Convolutional Triplet Attention Module." [WACV 2021]

arxiv attention-mechanism attention-mechanisms computer-vision convolutional-neural-networks deep-learning detection gradcam imagenet paper triplet-attention

Last synced: 15 Nov 2024

https://github.com/lucidrains/recurrent-memory-transformer-pytorch

Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch

artificial-intelligence attention-mechanisms deep-learning long-context memory recurrence transformers

Last synced: 18 Dec 2024

https://github.com/lucidrains/local-attention

An implementation of local windowed attention for language modeling

artificial-intelligence attention-mechanisms deep-learning

Last synced: 19 Dec 2024
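
The entry above describes local windowed attention, where each position attends only to a fixed-size neighborhood instead of the full sequence. As a rough illustration of the idea (not the repo's actual implementation, which is a batched PyTorch module), here is a minimal causal NumPy sketch:

```python
import numpy as np

def local_attention(q, k, v, window=4):
    """Causal local windowed attention: position i attends only to the
    previous `window` positions (itself included), so cost is O(n * window)
    rather than O(n^2)."""
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo = max(0, i - window + 1)
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[i] = weights @ v[lo:i + 1]
    return out
```

Since position 0 can only attend to itself, its output is exactly `v[0]`; in practice the window is applied in blocks for efficiency.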

https://github.com/lucidrains/q-transformer

Implementation of Q-Transformer, Scalable Offline Reinforcement Learning via Autoregressive Q-Functions, out of Google Deepmind

artificial-intelligence attention-mechanisms deep-learning offline-learning q-learning robotics transformers

Last synced: 20 Dec 2024

https://github.com/lucidrains/medical-chatgpt

Implementation of ChatGPT, but tailored towards primary care medicine, with the reward being able to collect patient histories in a thorough and efficient manner and come up with a reasonable differential diagnosis

artificial-intelligence attention-mechanisms deep-learning medicine transformers

Last synced: 16 Dec 2024

https://github.com/lucidrains/mmdit

Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch

artificial-intelligence attention-mechanisms deep-learning multi-modal-attention

Last synced: 22 Dec 2024

https://github.com/cbaziotis/neat-vision

Neat (Neural Attention) Vision is a visualization tool for the attention mechanisms of deep-learning models for Natural Language Processing (NLP) tasks. (framework-agnostic)

attention attention-mechanism attention-mechanisms attention-scores attention-visualization deep-learning deep-learning-library deep-learning-visualization natural-language-processing nlp self-attention self-attentive-rnn text-visualization visualization vuejs

Last synced: 06 Nov 2024

https://github.com/lucidrains/equiformer-pytorch

Implementation of the Equiformer, SE3/E3 equivariant attention network that reaches new SOTA, and adopted for use by EquiFold for protein folding

artificial-intelligence attention-mechanisms deep-learning equivariance molecules protein-folding transformers

Last synced: 21 Dec 2024

https://github.com/lucidrains/colt5-attention

Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch

artificial-intelligence attention-mechanisms deep-learning efficient-attention routing

Last synced: 21 Dec 2024

https://github.com/lucidrains/simple-hierarchical-transformer

Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT

artificial-intelligence attention-mechanisms deep-learning hierarchical-predictive-coding transformers

Last synced: 15 Dec 2024

https://github.com/lucidrains/mega-pytorch

Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena

artificial-intelligence attention-mechanisms deep-learning exponential-moving-average long-range-arena

Last synced: 16 Dec 2024

https://github.com/lucidrains/recurrent-interface-network-pytorch

Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in Pytorch

artificial-intelligence attention-mechanisms deep-learning denoising-diffusion image-generation latents video-generation

Last synced: 18 Dec 2024

https://github.com/lucidrains/flash-cosine-sim-attention

Implementation of fused cosine similarity attention in the same style as Flash Attention

artificial-intelligence attention-mechanisms deep-learning

Last synced: 16 Dec 2024
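
The repo above fuses cosine-similarity attention into a Flash-Attention-style CUDA kernel; that fusion is beyond a sketch, but the core logit trick is simple: l2-normalize queries and keys so logits are bounded cosine similarities scaled by a fixed temperature. A minimal NumPy illustration (the `scale=10.0` default here is an assumption, not the repo's tuned value):

```python
import numpy as np

def cosine_sim_attention(q, k, v, scale=10.0):
    """Attention on l2-normalized queries/keys: logits are cosine
    similarities times a fixed temperature, so they stay bounded in
    [-scale, scale] regardless of head dimension."""
    qn = q / np.linalg.norm(q, axis=-1, keepdims=True)
    kn = k / np.linalg.norm(k, axis=-1, keepdims=True)
    scores = scale * (qn @ kn.T)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Bounded logits are what make the numerically simpler fused kernel feasible.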

https://github.com/lucidrains/calm-pytorch

Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind

artificial-intelligence attention-mechanisms cross-attention deep-learning transformers

Last synced: 15 Dec 2024

https://github.com/lucidrains/mixture-of-attention

Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts

artificial-intelligence attention-mechanisms deep-learning mixture-of-experts routed-attention

Last synced: 20 Dec 2024

https://github.com/monk1337/various-attention-mechanisms

This repository contains various types of attention mechanisms, such as Bahdanau, soft attention, additive attention, and hierarchical attention, in PyTorch, TensorFlow, and Keras

attention attention-lstm attention-mechanism attention-mechanisms attention-model attention-network bahdanau-attention hierarchical-attention keras luong-attention multi-head-attention pytorch scaled-dot-product-attention self-attention sentence-attention

Last synced: 24 Nov 2024
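
Of the mechanisms the repo above collects, Bahdanau (additive) attention is the classic starting point: scores come from a small feed-forward network over the query and each key rather than a dot product. A minimal NumPy sketch for a single query vector (shapes and parameter names here are illustrative, not the repo's API):

```python
import numpy as np

def additive_attention(query, keys, values, W_q, W_k, v_a):
    """Bahdanau-style additive attention for one decoder query.
    query: (d_q,), keys: (n, d_k), values: (n, d_v),
    W_q: (d_q, h), W_k: (d_k, h), v_a: (h,)."""
    scores = np.tanh(query @ W_q + keys @ W_k) @ v_a   # (n,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                            # softmax over keys
    return weights @ values, weights
```

The learned projections let additive attention compare queries and keys of different dimensions, which dot-product attention cannot do without an extra projection.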

https://github.com/lucidrains/zorro-pytorch

Implementation of Zorro, Masked Multimodal Transformer, in Pytorch

artificial-intelligence attention-mechanisms deep-learning masking multimodal transformers

Last synced: 19 Dec 2024

https://github.com/lucidrains/diffusion-policy

Implementation of Diffusion Policy, Toyota Research's supposed breakthrough in leveraging DDPMs for learning policies for real-world Robotics

artificial-intelligence attention-mechanisms deep-learning denoising-diffusion robotics transformers

Last synced: 17 Dec 2024

https://github.com/lucidrains/taylor-series-linear-attention

Explorations into the recently proposed Taylor Series Linear Attention

artificial-intelligence attention-mechanisms deep-learning linear-attention

Last synced: 20 Dec 2024
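
Linear attention, which the repo above explores via a Taylor-series feature map, replaces softmax with a positive feature map so the computation can be reassociated from O(n²·d) to O(n·d²). The sketch below uses a generic feature map for clarity; the repo's second-order Taylor expansion of exp is more elaborate:

```python
import numpy as np

def linear_attention(q, k, v):
    """Kernelized linear attention: with a positive feature map phi,
    attention becomes phi(Q) @ (phi(K)^T @ V), normalized per row,
    avoiding the n x n score matrix entirely."""
    phi = lambda x: np.maximum(x, 0.0) + 1.0   # simple positive feature map
    qf, kf = phi(q), phi(k)
    kv = kf.T @ v                    # (d, d_v), shared across all queries
    z = qf @ kf.sum(axis=0)          # per-query normalizer, strictly positive
    return (qf @ kv) / z[:, None]
```

Because `phi(K)^T @ V` is computed once and reused for every query, cost grows linearly in sequence length.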

https://github.com/lucidrains/transframer-pytorch

Implementation of Transframer, Deepmind's U-net + Transformer architecture for up to 30 seconds video generation, in Pytorch

artificial-intelligence attention-mechanisms deep-learning transformers unet video-generation

Last synced: 22 Oct 2024

https://github.com/lucidrains/complex-valued-transformer

Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture"

artificial-intelligence attention-mechanisms complex-networks deep-learning transformers

Last synced: 01 Nov 2024

https://github.com/lucidrains/kalman-filtering-attention

Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction"

attention-mechanisms kalman-filtering

Last synced: 10 Dec 2024

https://github.com/lucidrains/equiformer-diffusion

Implementation of Denoising Diffusion for protein design, but using the new Equiformer (successor to SE3 Transformers) with some additional improvements

artificial-intelligence attention-mechanisms deep-learning denoising-diffusion equivariant-networks protein-design

Last synced: 10 Dec 2024

https://github.com/lucidrains/flash-genomics-model

My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other hierarchical methods)

artificial-intelligence attention-mechanisms deep-learning genomics long-context transformers

Last synced: 22 Oct 2024

https://github.com/kyegomez/jamba

PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"

ai artificial-neural-networks attention-is-all-you-need attention-mechanism attention-mechanisms gpt ml ssm transformers

Last synced: 09 Nov 2024

https://github.com/lucidrains/coordinate-descent-attention

Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk

artificial-intelligence attention-mechanisms deep-learning

Last synced: 22 Oct 2024

https://github.com/lucidrains/autoregressive-linear-attention-cuda

CUDA implementation of autoregressive linear attention, with all the latest research findings

artificial-intelligence attention-mechanisms cuda deep-learning linear-attention

Last synced: 22 Oct 2024

https://github.com/lucidrains/pause-transformer

Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount of time on any token

adaptive-computation artificial-intelligence attention-mechanisms deep-learning transformers

Last synced: 02 Nov 2024

https://github.com/kyegomez/sparseattention

Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with Sparse Transformers"

artificial-intelligence attention-is-all-you-need attention-mechanism attention-mechanisms machine-learning sparse-attn sparse-matrix

Last synced: 09 Nov 2024
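
The Sparse Transformers paper referenced above restricts each position to a local band plus a strided subset of earlier positions. A small NumPy sketch of one such causal mask (the paper's "fixed" and blocked kernels differ in detail; this shows only the local-plus-strided idea):

```python
import numpy as np

def strided_sparse_mask(n, stride):
    """Causal sparse-attention mask: position i may attend to the previous
    `stride` positions (local) and to every stride-th earlier position
    (strided), giving O(n * sqrt(n)) entries when stride ~ sqrt(n)."""
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1):
            local = (i - j) < stride
            strided = (i - j) % stride == 0
            mask[i, j] = local or strided
    return mask
```

In practice the mask is applied by setting disallowed logits to -inf before the softmax, or by only materializing the allowed blocks.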

https://github.com/kyegomez/mambaformer

Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks"

ai attention attention-is-all-you-need attention-mechanisms mamba ml ssms transformer

Last synced: 16 Nov 2024

https://github.com/kyegomez/palm2-vadapter

Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter"

ai attention attention-is-all-you-need attention-mechanisms deeplearning ml models multi-modal neural-nets transformers

Last synced: 09 Nov 2024

https://github.com/kyegomez/kosmosg

My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models"

attention-is-all-you-need attention-mechanism attention-mechanisms computer-vision multimodal multimodal-learning

Last synced: 09 Nov 2024

https://github.com/kyegomez/mobilevlm

Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices"

ai artificial-intelligence attention attention-is-all-you-need attention-mechanisms machine-learning ml

Last synced: 09 Nov 2024

https://github.com/kyegomez/mgqa

The open-source implementation of multi-grouped query attention from the paper "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints"

artificial-intelligence attentio attention attention-is-all-you-need attention-lstm attention-mechanism attention-mechanisms gpt4 multimodal multiqueryattention

Last synced: 09 Nov 2024
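
Grouped-query attention, as in the GQA paper above, keeps many query heads but shares each key/value head across a group of them, shrinking the KV cache. A minimal NumPy sketch of the grouping logic (the real implementation batches this; shapes here are assumptions for illustration):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_heads=8, n_kv_heads=2):
    """q: (n_heads, n, d); k, v: (n_kv_heads, n, d).
    Each contiguous group of n_heads // n_kv_heads query heads shares
    one K/V head, so the KV cache is n_kv_heads / n_heads the usual size."""
    group = n_heads // n_kv_heads
    d = q.shape[-1]
    outs = []
    for h in range(n_heads):
        kv = h // group                          # which shared K/V head
        scores = q[h] @ k[kv].T / np.sqrt(d)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        outs.append(weights @ v[kv])
    return np.stack(outs)
```

With `n_kv_heads=1` this degenerates to multi-query attention; with `n_kv_heads=n_heads` it is standard multi-head attention.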

https://github.com/kyegomez/shallowff

Zeta implementation of "Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers"

artificial-intelligence attention attention-is-all-you-need attention-mechanism attention-mechanisms feedforward transformer transformer-encoder transformer-models transformers-models

Last synced: 09 Nov 2024

https://github.com/kyegomez/hedgehog

Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"

ai attention attention-is-all-you-need attention-mechanisms feedforward ffns ml mlps multi-modal neural-nets open-source opensource-ai softmax

Last synced: 09 Nov 2024