Projects in Awesome Lists tagged with grouped-query-attention
A curated list of projects in awesome lists tagged with grouped-query-attention.
https://github.com/reshalfahsi/image-captioning-mobilenet-llama3
Image Captioning With MobileNet-LLaMA 3
cnn flickr8k-dataset grouped-query-attention image-captioning image-text kv-cache llama3 mobilenetv3 nlp pytorch pytorch-lightning rms-norm rotary-position-embedding transformer
Last synced: 12 Apr 2025
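
The rms-norm tag above refers to RMS normalization, used in LLaMA-style decoders in place of LayerNorm: it rescales by the root-mean-square of the features without mean-centering. A minimal PyTorch sketch (illustrative only; the class and parameter names are not taken from the repo):

```python
import torch
from torch import nn

class RMSNorm(nn.Module):
    # Normalize by the root-mean-square of the last dimension; unlike
    # LayerNorm there is no mean subtraction. A learned gain rescales
    # each channel. Names here are illustrative, not the repo's code.
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight
```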
https://github.com/mydarapy/smollm-experiments-with-grouped-query-attention
(Unofficial) implementation of Hugging Face's SmolLM, a blazingly fast and small language model, built in PyTorch around grouped-query attention (GQA); a sketch of GQA follows below.
attention grouped-query-attention huggingface huggingface-smol-lm llm ml-efficiency smol smol-lm transformer
Last synced: 13 Mar 2025
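
Since this repo centers on a PyTorch implementation of GQA, here is a minimal sketch of the idea, assuming LLaMA-style projections; all names and shapes are illustrative rather than the repo's actual code. Each group of query heads shares a single key/value head, which shrinks the KV projections and, during generation, the KV cache.

```python
import torch
import torch.nn.functional as F
from torch import nn

class GroupedQueryAttention(nn.Module):
    # Hypothetical minimal GQA layer: n_q_heads query heads share
    # n_kv_heads key/value heads (n_q_heads must divide evenly).
    def __init__(self, d_model: int, n_q_heads: int, n_kv_heads: int):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.n_q, self.n_kv = n_q_heads, n_kv_heads
        self.head_dim = d_model // n_q_heads
        self.wq = nn.Linear(d_model, n_q_heads * self.head_dim, bias=False)
        self.wk = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.wv = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.wo = nn.Linear(n_q_heads * self.head_dim, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.wq(x).view(b, t, self.n_q, self.head_dim).transpose(1, 2)
        k = self.wk(x).view(b, t, self.n_kv, self.head_dim).transpose(1, 2)
        v = self.wv(x).view(b, t, self.n_kv, self.head_dim).transpose(1, 2)
        # Repeat each KV head so every query head in its group can attend.
        rep = self.n_q // self.n_kv
        k = k.repeat_interleave(rep, dim=1)
        v = v.repeat_interleave(rep, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.wo(out.transpose(1, 2).reshape(b, t, -1))
```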
https://github.com/estnafinema0/russian-jokes-generator
Transformer models for humorous text generation, fine-tuned on a Russian jokes dataset with ALiBi, RoPE, GQA, and SwiGLU, plus a custom byte-level BPE tokenizer. A RoPE sketch follows below.
alibi bpe-tokenizer grouped-query-attention nlp pytorch rotary-position-embedding swiglu transformer-models
Last synced: 11 Mar 2025
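
Of the techniques this repo combines, rotary position embedding (RoPE) is the least self-explanatory: it encodes position by rotating consecutive channel pairs of the queries and keys through position-dependent angles, so relative offsets show up in the dot products. A hedged sketch with illustrative names, not the repo's code:

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    # x: (batch, seq_len, n_heads, head_dim), head_dim even.
    # Rotate each consecutive channel pair by an angle that grows with
    # position and decays with channel index (the usual RoPE schedule).
    b, t, h, d = x.shape
    inv_freq = base ** (-torch.arange(0, d, 2, dtype=torch.float32) / d)
    angles = torch.arange(t, dtype=torch.float32)[:, None] * inv_freq  # (t, d/2)
    cos = angles.cos()[None, :, None, :]  # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```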
https://github.com/andrewhsugithub/min-llama
My LLaMA 3 implementation.
grouped-query-attention kv-cache llama3 llm nlp rope swiglu transformers
Last synced: 26 Mar 2025
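
The swiglu tag above refers to the LLaMA-family gated feed-forward block, usually written W2(silu(W1 x) * W3 x). A minimal sketch under that common formulation; module and dimension names are assumptions, not this repo's API:

```python
import torch
import torch.nn.functional as F
from torch import nn

class SwiGLU(nn.Module):
    # Gated FFN: one projection is passed through SiLU and gates the
    # other, then the product is projected back to the model dimension.
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_ff, bias=False)  # gated branch
        self.w3 = nn.Linear(d_model, d_ff, bias=False)  # linear branch
        self.w2 = nn.Linear(d_ff, d_model, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))
```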
https://github.com/lucadellalib/llama3
A single-file implementation of LLaMA 3, with support for jitting, KV caching and prompting
grouped-query-attention large-language-models llama3 llm python pytorch rotary-positional-embedding transformers
Last synced: 28 Feb 2025
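
KV caching, advertised by this and other LLaMA 3 implementations above, stores each layer's past key/value tensors so a decoding step only projects the newest token and attends over everything cached so far. A minimal illustrative sketch, not this repo's actual interface:

```python
import torch

class KVCache:
    # Hypothetical per-layer cache: append the new step's K/V along the
    # sequence dimension and return the full tensors for attention.
    def __init__(self):
        self.k = None  # (batch, n_kv_heads, cached_len, head_dim)
        self.v = None

    def update(self, k_new: torch.Tensor, v_new: torch.Tensor):
        self.k = k_new if self.k is None else torch.cat([self.k, k_new], dim=2)
        self.v = v_new if self.v is None else torch.cat([self.v, v_new], dim=2)
        return self.k, self.v
```

With GQA the cached tensors carry only n_kv_heads heads, which is exactly why the two techniques are so often paired: the cache shrinks by the query-to-KV head ratio.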
https://github.com/lukasdrews97/dumblellm
Decoder-only LLM trained on the Harry Potter books.
byte-pair-encoding flash-attention grouped-query-attention large-language-model rotary-position-embedding transformer
Last synced: 05 Apr 2025
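
The flash-attention tag above typically means routing attention through a fused kernel. One common way to request it in recent PyTorch (>= 2.3) is the SDPA backend selector; this is a general sketch and an assumption about tooling, not how this particular repo wires it up:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Illustrative shapes: (batch, heads, seq_len, head_dim) on a CUDA GPU,
# half precision, as the FlashAttention backend requires.
q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Restrict scaled_dot_product_attention to the FlashAttention kernel;
# PyTorch raises an error if that backend is unavailable here.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```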