Projects in Awesome Lists tagged with grouped-query-attention
A curated list of projects in awesome lists tagged with grouped-query-attention.
https://github.com/reshalfahsi/image-captioning-mobilenet-llama3
Image Captioning With MobileNet-LLaMA 3
cnn flickr8k-dataset grouped-query-attention image-captioning image-text kv-cache llama3 mobilenetv3 nlp pytorch pytorch-lightning rms-norm rotary-position-embedding transformer
Last synced: 12 Apr 2025
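
The rms-norm tag above refers to RMS normalization, used in LLaMA-style decoders in place of LayerNorm: it rescales by the root-mean-square of the features without mean-centering. A minimal PyTorch sketch (illustrative only; the class and parameter names are not taken from the repo):

```python
import torch
from torch import nn

class RMSNorm(nn.Module):
    # Normalize by the root-mean-square of the last dimension; unlike
    # LayerNorm there is no mean subtraction. A learned gain rescales
    # each channel. Names here are illustrative, not the repo's code.
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight
```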
https://github.com/mydarapy/smollm-experiments-with-grouped-query-attention
(Unofficial) implementation of Hugging Face's SmolLM, a blazingly fast and small language model, built in PyTorch around grouped-query attention (GQA); a sketch of GQA follows below.
attention grouped-query-attention huggingface huggingface-smol-lm llm ml-efficiency smol smol-lm transformer
Last synced: 13 Mar 2025
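
Since this repo centers on a PyTorch implementation of GQA, here is a minimal sketch of the idea, assuming LLaMA-style projections; all names and shapes are illustrative rather than the repo's actual code. Each group of query heads shares a single key/value head, which shrinks the KV projections and, during generation, the KV cache.

```python
import torch
import torch.nn.functional as F
from torch import nn

class GroupedQueryAttention(nn.Module):
    # Hypothetical minimal GQA layer: n_q_heads query heads share
    # n_kv_heads key/value heads (n_q_heads must divide evenly).
    def __init__(self, d_model: int, n_q_heads: int, n_kv_heads: int):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.n_q, self.n_kv = n_q_heads, n_kv_heads
        self.head_dim = d_model // n_q_heads
        self.wq = nn.Linear(d_model, n_q_heads * self.head_dim, bias=False)
        self.wk = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.wv = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.wo = nn.Linear(n_q_heads * self.head_dim, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.wq(x).view(b, t, self.n_q, self.head_dim).transpose(1, 2)
        k = self.wk(x).view(b, t, self.n_kv, self.head_dim).transpose(1, 2)
        v = self.wv(x).view(b, t, self.n_kv, self.head_dim).transpose(1, 2)
        # Repeat each KV head so every query head in its group can attend.
        rep = self.n_q // self.n_kv
        k = k.repeat_interleave(rep, dim=1)
        v = v.repeat_interleave(rep, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.wo(out.transpose(1, 2).reshape(b, t, -1))
```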
https://github.com/estnafinema0/russian-jokes-generator
Transformer models for humorous text generation, fine-tuned on a Russian jokes dataset with ALiBi, RoPE, GQA, and SwiGLU, plus a custom byte-level BPE tokenizer. A RoPE sketch follows below.
alibi bpe-tokenizer grouped-query-attention nlp pytorch rotary-position-embedding swiglu transformer-models
Last synced: 11 Mar 2025
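
Of the techniques this repo combines, rotary position embedding (RoPE) is the least self-explanatory: it encodes position by rotating consecutive channel pairs of the queries and keys through position-dependent angles, so relative offsets show up in the dot products. A hedged sketch with illustrative names, not the repo's code:

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    # x: (batch, seq_len, n_heads, head_dim), head_dim even.
    # Rotate each consecutive channel pair by an angle that grows with
    # position and decays with channel index (the usual RoPE schedule).
    b, t, h, d = x.shape
    inv_freq = base ** (-torch.arange(0, d, 2, dtype=torch.float32) / d)
    angles = torch.arange(t, dtype=torch.float32)[:, None] * inv_freq  # (t, d/2)
    cos = angles.cos()[None, :, None, :]  # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```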
https://github.com/andrewhsugithub/min-llama
My LLaMA 3 implementation.
grouped-query-attention kv-cache llama3 llm nlp rope swiglu transformers
Last synced: 26 Mar 2025
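
The swiglu tag above refers to the LLaMA-family gated feed-forward block, usually written W2(silu(W1 x) * W3 x). A minimal sketch under that common formulation; module and dimension names are assumptions, not this repo's API:

```python
import torch
import torch.nn.functional as F
from torch import nn

class SwiGLU(nn.Module):
    # Gated FFN: one projection is passed through SiLU and gates the
    # other, then the product is projected back to the model dimension.
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_ff, bias=False)  # gated branch
        self.w3 = nn.Linear(d_model, d_ff, bias=False)  # linear branch
        self.w2 = nn.Linear(d_ff, d_model, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))
```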
https://github.com/lucadellalib/llama3
A single-file implementation of LLaMA 3, with support for jitting, KV caching and prompting
grouped-query-attention large-language-models llama3 llm python pytorch rotary-positional-embedding transformers
Last synced: 28 Feb 2025
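
KV caching, advertised by this and other LLaMA 3 implementations above, stores each layer's past key/value tensors so a decoding step only projects the newest token and attends over everything cached so far. A minimal illustrative sketch, not this repo's actual interface:

```python
import torch

class KVCache:
    # Hypothetical per-layer cache: append the new step's K/V along the
    # sequence dimension and return the full tensors for attention.
    def __init__(self):
        self.k = None  # (batch, n_kv_heads, cached_len, head_dim)
        self.v = None

    def update(self, k_new: torch.Tensor, v_new: torch.Tensor):
        self.k = k_new if self.k is None else torch.cat([self.k, k_new], dim=2)
        self.v = v_new if self.v is None else torch.cat([self.v, v_new], dim=2)
        return self.k, self.v
```

With GQA the cached tensors carry only n_kv_heads heads, which is exactly why the two techniques are so often paired: the cache shrinks by the query-to-KV head ratio.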
https://github.com/lukasdrews97/dumblellm
Decoder-only LLM trained on the Harry Potter books.
byte-pair-encoding flash-attention grouped-query-attention large-language-model rotary-position-embedding transformer
Last synced: 05 Apr 2025
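
The flash-attention tag above typically means routing attention through a fused kernel. One common way to request it in recent PyTorch (>= 2.3) is the SDPA backend selector; this is a general sketch and an assumption about tooling, not how this particular repo wires it up:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Illustrative shapes: (batch, heads, seq_len, head_dim) on a CUDA GPU,
# half precision, as the FlashAttention backend requires.
q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Restrict scaled_dot_product_attention to the FlashAttention kernel;
# PyTorch raises an error if that backend is unavailable here.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```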