Projects in Awesome Lists tagged with tokenization-free
A curated list of projects in awesome lists tagged with tokenization-free.
https://github.com/contrebande-labs/charred
CHARacter-awaRE Diffusion: Multilingual Character-Aware Encoders for Font-Aware Diffusers That Can Actually Spell
bert canine character-aware controlnet diffusion diffusion-models fonts stable-diffusion tokenization-free transformers transformers-models typography unicode utf-16 utf-8
Last synced: 27 Jun 2025
https://github.com/shjwudp/megabyte
A PyTorch implementation of MEGABYTE, a multi-scale transformer architecture that is tokenization-free and uses sub-quadratic attention. Paper: https://arxiv.org/abs/2305.07185
deep-learning language-model sub-quadratic-attention tokenization-free
Last synced: 15 Jul 2025
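As a rough illustration of the multi-scale idea in the MEGABYTE entry above, and not the code from that repository, the sketch below splits a byte sequence into fixed-size patches. In the MEGABYTE design, a global model works across patches while a smaller local model handles the bytes inside each patch. The function name `to_patches`, the patch size of 4, and the padding value 0 are assumptions chosen for this example.

```python
def to_patches(data: bytes, patch_size: int = 4, pad_id: int = 0) -> list[list[int]]:
    """Split raw bytes into fixed-size patches, padding the last one.

    Hypothetical helper for illustration only; patch_size and pad_id
    are assumed values, not settings taken from any listed project.
    """
    ids = list(data)  # each byte becomes an integer in 0..255
    remainder = len(ids) % patch_size
    if remainder:
        # Pad so the sequence length is a multiple of patch_size.
        ids += [pad_id] * (patch_size - remainder)
    return [ids[i:i + patch_size] for i in range(0, len(ids), patch_size)]


patches = to_patches(b"tokenizer-free")
for patch in patches:
    print(patch)
```

With T bytes and patch size P, the global model sees roughly T/P patch positions instead of T byte positions, which is where the reduced attention cost comes from.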
https://github.com/zjysteven/awesome-byte-llm
A curated list of papers and resources on byte-based large language models (LLMs) — models that operate directly on raw bytes.
awesome-list byte-llms deep-learning deep-neural-networks foundation-models large-language-models machine-learning multimodal-large-language-models research-paper tokenization-free transformers
Last synced: 09 Feb 2026
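To illustrate what "operate directly on raw bytes" means for the projects above, here is a minimal sketch that is not taken from any of them: text is encoded as UTF-8 and each byte value (0 to 255) is used directly as an input ID, so no learned vocabulary is required. The helper names `text_to_byte_ids` and `byte_ids_to_text` are hypothetical.

```python
def text_to_byte_ids(text: str) -> list[int]:
    """Encode text as UTF-8 and return one integer ID per byte."""
    return list(text.encode("utf-8"))


def byte_ids_to_text(ids: list[int]) -> str:
    """Invert text_to_byte_ids by decoding the bytes as UTF-8."""
    return bytes(ids).decode("utf-8")


ids = text_to_byte_ids("héllo")
print(ids)                    # [104, 195, 169, 108, 108, 111]
print(byte_ids_to_text(ids))  # héllo
```

The non-ASCII character "é" becomes two IDs (195, 169), which shows the trade-off: the vocabulary stays fixed at 256 values, but sequences are longer than with subword tokens.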