An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with multi-head-attention

A curated list of projects in awesome lists tagged with multi-head-attention.

https://github.com/poloclub/dodrio

Exploring attention weights in transformer-based models with linguistic knowledge.

attention-mechanism deep-learning interactive-visualizations multi-head-attention nlp transformer visualization

Last synced: 13 May 2025
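The data a tool like Dodrio visualizes is the per-layer, per-head attention matrix. Below is a minimal, hedged sketch (not part of Dodrio) of extracting those weights from a Hugging Face transformers model; the bert-base-uncased checkpoint and the `output_attentions` flag are illustrative choices for this example only.

```python
# Minimal sketch: pull per-head attention weights from a Hugging Face model,
# the kind of data attention-visualization tools operate on.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, shape (batch, heads, seq_len, seq_len)
layer0 = outputs.attentions[0]
print(layer0.shape)          # e.g. torch.Size([1, 12, 12, 12])
print(layer0[0, 0].sum(-1))  # each row of an attention map sums to 1
```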

https://github.com/monk1337/various-attention-mechanisms

This repository contains various types of attention mechanisms, such as Bahdanau, soft attention, additive attention, and hierarchical attention, implemented in PyTorch, TensorFlow, and Keras.

attention attention-lstm attention-mechanism attention-mechanisms attention-model attention-network bahdanau-attention hierarchical-attention keras luong-attention multi-head-attention pytorch scaled-dot-product-attention self-attention sentence-attention

Last synced: 10 Apr 2025
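As a reference point for the mechanisms listed above, here is a minimal PyTorch sketch of scaled dot-product attention, the building block behind multi-head attention; tensor shapes and names are illustrative and not taken from the repository.

```python
# Minimal scaled dot-product attention in PyTorch.
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)   # (batch, heads, seq, seq)
    return weights @ v, weights

q = k = v = torch.randn(2, 8, 16, 64)
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)  # torch.Size([2, 8, 16, 64]) torch.Size([2, 8, 16, 16])
```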

https://github.com/bruce-lee-ly/decoding_attention

Decoding Attention is specially optimized for MHA, MQA, GQA, and MLA, using CUDA cores for the decoding stage of LLM inference.

cuda cuda-core decoding-attention flash-attention flashinfer flashmla gpu gqa inference large-language-model llm mha mla mqa multi-head-attention nvidia

Last synced: 05 May 2025
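MHA, GQA, and MQA differ mainly in how many key/value heads serve the query heads during decoding. The sketch below is an illustrative PyTorch version of that grouping for a single decode step; it is not the repository's CUDA implementation, and MLA is omitted.

```python
# Illustrative sketch: MHA, GQA and MQA differ only in the number of KV heads.
import torch

def grouped_query_attention(q, k, v):
    # q: (batch, n_q_heads, 1, head_dim)   -- single decode step
    # k, v: (batch, n_kv_heads, seq_len, head_dim), with n_q_heads % n_kv_heads == 0
    group = q.size(1) // k.size(1)
    k = k.repeat_interleave(group, dim=1)   # broadcast KV heads to query heads
    v = v.repeat_interleave(group, dim=1)
    scores = q @ k.transpose(-2, -1) / k.size(-1) ** 0.5
    return torch.softmax(scores, dim=-1) @ v

b, head_dim, seq_len = 1, 128, 256
q = torch.randn(b, 32, 1, head_dim)
for n_kv in (32, 8, 1):                      # MHA, GQA, MQA
    k = v = torch.randn(b, n_kv, seq_len, head_dim)
    print(n_kv, grouped_query_attention(q, k, v).shape)
```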

https://github.com/bruce-lee-ly/flash_attention_inference

Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.

cuda cutlass flash-attention flash-attention-2 gpu inference large-language-model llm mha multi-head-attention nvidia tensor-core

Last synced: 13 Apr 2025
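For comparison, PyTorch itself can dispatch to a FlashAttention-backed kernel through torch.nn.functional.scaled_dot_product_attention. The sketch below is unrelated to this repository's C++ interface and assumes a recent PyTorch (2.3+); the flash backend additionally requires a CUDA GPU and half precision.

```python
# Hedged sketch: request the FlashAttention backend via PyTorch's SDPA API.
import torch
from torch.nn.attention import sdpa_kernel, SDPBackend

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
q = k = v = torch.randn(1, 16, 1024, 64, device=device, dtype=dtype)

if device == "cuda":
    # Restrict dispatch to the FlashAttention backend.
    with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
        out = torch.nn.functional.scaled_dot_product_attention(q, k, v, is_causal=True)
else:
    # CPU fallback uses the math backend.
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 16, 1024, 64])
```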

https://github.com/shreyas-kowshik/nlp4if

Code for the runner-up entry in the English subtask of the Shared Task on Fighting the COVID-19 Infodemic, NLP4IF workshop, NAACL'21.

deep-learning multi-head-attention multi-task-learning naacl2021 natural-language-processing

Last synced: 20 Nov 2024

https://github.com/liaoyanqing666/transformer_pytorch

A complete implementation of the original Transformer.

beginner multi-head-attention positional-encoding python pytorch transformer

Last synced: 28 Jan 2025
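One piece that from-scratch Transformer implementations like this typically include is the sinusoidal positional encoding from "Attention Is All You Need". Below is a minimal sketch of it, independent of this repository's code.

```python
# Minimal sketch of sinusoidal positional encoding.
import math
import torch

def sinusoidal_positional_encoding(seq_len, d_model):
    position = torch.arange(seq_len).unsqueeze(1)                    # (seq_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)                     # even dims: sine
    pe[:, 1::2] = torch.cos(position * div_term)                     # odd dims: cosine
    return pe                                                        # (seq_len, d_model)

pe = sinusoidal_positional_encoding(seq_len=128, d_model=512)
print(pe.shape)  # torch.Size([128, 512])
```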

https://github.com/junfanz1/minigpt-and-deepseek-mla-multi-head-latent-attention

An efficient and scalable attention module designed to reduce memory usage and improve inference speed in large language models: a Multi-Head Latent Attention (MLA) module implemented as a drop-in replacement for traditional multi-head attention (MHA).

attention-mechanism deepseek llm mla multi-head-attention pytorch

Last synced: 13 Apr 2025
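The core idea behind MLA is to cache a low-rank latent in place of the full per-head keys and values. The sketch below is a simplified illustration of that compression only; the module name, dimensions, and the omission of DeepSeek's decoupled RoPE branch are assumptions made for brevity, not the repository's implementation.

```python
# Simplified sketch of the MLA idea: cache a small latent, up-project to K/V.
import torch
import torch.nn as nn

class SimplifiedMLA(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_latent=64):
        super().__init__()
        self.n_heads, self.head_dim = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.kv_down = nn.Linear(d_model, d_latent)   # only this latent would be cached
        self.k_up = nn.Linear(d_latent, d_model)
        self.v_up = nn.Linear(d_latent, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        b, t, _ = x.shape
        split = lambda z: z.view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
        c_kv = self.kv_down(x)                        # (b, t, d_latent): compressed KV
        q, k, v = split(self.q_proj(x)), split(self.k_up(c_kv)), split(self.v_up(c_kv))
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        out = torch.softmax(scores, dim=-1) @ v
        return self.out_proj(out.transpose(1, 2).reshape(b, t, -1))

x = torch.randn(2, 16, 512)
print(SimplifiedMLA()(x).shape)  # torch.Size([2, 16, 512])
```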