Projects in Awesome Lists tagged with moe
A curated list of projects in awesome lists tagged with moe.
https://github.com/hiyouga/llama-factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
agent ai chatglm fine-tuning gpt instruction-tuning language-model large-language-models llama llama3 llm lora mistral moe peft qlora quantization qwen rlhf transformers
Last synced: 02 Jan 2026
https://github.com/sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
cuda deepseek deepseek-llm deepseek-r1 deepseek-r1-zero deepseek-v3 inference llama llama3 llama3-1 llava llm llm-serving moe pytorch transformer vlm
Last synced: 12 May 2025
https://github.com/czy0729/Bangumi
:electron: An unofficial UI-first app client for https://bgm.tv on Android and iOS, built with React Native. An ad-free, hobby-driven, non-commercial third-party bgm.tv client for tracking ACG (anime, comics, games), similar to Douban. Redesigned for mobile, it includes many enhanced features that are hard to achieve on the web version and offers extensive customization options. Currently supports iOS / Android / WSA, mobile / basic tablet layouts, light / dark themes, and the mobile web.
android android-app bangumi design expo ios ios-app mobx moe react react-native
Last synced: 13 Apr 2025
https://github.com/pku-yuangroup/moe-llava
Mixture-of-Experts for Large Vision-Language Models
large-vision-language-model mixture-of-experts moe multi-modal
Last synced: 14 May 2025
https://github.com/moonshotai/moba
MoBA: Mixture of Block Attention for Long-Context LLMs
flash-attention llm llm-serving llm-training moe pytorch transformer
Last synced: 14 May 2025
https://github.com/pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
continual-pre-training expert-partition llama llm mixture-of-experts moe
Last synced: 09 May 2025
https://github.com/microsoft/tutel
Tutel MoE: Optimized Mixture-of-Experts Library, Support DeepSeek FP8/FP4
deepseek llm mixture-of-experts moe pytorch
Last synced: 15 May 2025
https://github.com/sail-sg/adan
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
adan artificial-intelligence bert-model convnext cuda-programming deep-learning diffusion dreamfusion fairseq gpt2 llm-training llms mae moe optimizer pytorch resnet timm transformer-xl vit
Last synced: 07 Jul 2025
https://github.com/open-compass/mixtralkit
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
Last synced: 12 Apr 2025
https://github.com/ymcui/chinese-mixtral
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
32k 64k large-language-models llm mixtral mixture-of-experts moe nlp
Last synced: 04 Apr 2025
https://github.com/mindspore-courses/step_into_llm
MindSpore online courses: Step into LLM
bert chatglm chatglm2 chatgpt codegeex gpt gpt2 instruction-tuning large-language-models llama llama2 llm mindspore moe natural-language-processing nlp parallel-computing peft prompt-tuning rlhf
Last synced: 15 May 2025
https://github.com/inferflow/inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
baichuan2 bloom deepseek falcon gemma internlm llama2 llamacpp llm-inference m2m100 minicpm mistral mixtral mixture-of-experts model-quantization moe multi-gpu-inference phi-2 qwen
Last synced: 07 Apr 2025
https://github.com/skyworkai/moh
MoH: Multi-Head Attention as Mixture-of-Head Attention
attention dit llms mixture-of-experts moe transformer vit
Last synced: 04 Apr 2025
https://github.com/skyworkai/moe-plus-plus
[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
large-language-models llms mixture-of-experts moe
Last synced: 21 Jun 2025
https://github.com/libgdx/gdx-pay
A libGDX cross-platform API for in-app purchasing.
android gdx-pay iap in-app-purchase ios java libgdx moe multi-os-engine robovm
Last synced: 01 Apr 2025
https://github.com/kyegomez/switchtransformers
Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"
ai gpt4 llama mixture-model mixture-of-experts mixture-of-models ml moe multi-modal
Last synced: 09 Oct 2025
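For context on the technique this repository implements, here is a minimal top-1 ("switch") routing sketch in PyTorch; the module, dimensions, and expert definitions are illustrative assumptions, not the repository's API.

```python
# Minimal top-1 ("switch") routing sketch in PyTorch.
# Illustrative only: names and shapes are assumptions, not the repo's API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # produces routing logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model) -- flatten batch and sequence beforehand
        probs = F.softmax(self.router(x), dim=-1)   # (tokens, num_experts)
        gate, expert_idx = probs.max(dim=-1)        # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = expert_idx == e
            if mask.any():
                # scale each expert's output by its gate probability
                out[mask] = gate[mask].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
print(SwitchMoE(d_model=64, d_ff=128, num_experts=4)(tokens).shape)  # torch.Size([16, 64])
```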
https://github.com/shalldie/chuncai
A lovely page wizard, responsible for selling moe (acting cute).
Last synced: 16 Mar 2025
https://github.com/kyegomez/moe-mamba
Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Zeta
ai ml moe multi-modal-fusion multi-modality swarms
Last synced: 07 May 2025
https://github.com/simplifine-llm/Simplifine
🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨
ai cloud fine-tuning fine-tuning-llm finetuning-llms gpt instruction-tuning large-language-models llama llama3 llm llm-training lora mistral moe open-source peft phi qwen
Last synced: 29 Jul 2025
https://github.com/opensparsellms/llama-moe-v2
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
attention fine-tuning instruction-tuning llama llama3 mixture-of-experts moe sft sparsity
Last synced: 11 Aug 2025
https://github.com/xrsrke/pipegoose
Large-scale 4D-parallelism pre-training for 🤗 transformers with Mixture of Experts *(still a work in progress)*
3d-parallelism data-parallelism distributed-optimizers huggingface-transformers large-scale-language-modeling megatron megatron-lm mixture-of-experts model-parallelism moe pipeline-parallelism sequence-parallelism tensor-parallelism transformers zero-1
Last synced: 10 Jul 2025
https://github.com/LINs-lab/DynMoE
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
adaptive-computation language-model mixture-of-experts moe multimodal-large-language-models
Last synced: 11 May 2025
https://github.com/vita-group/random-moe-as-dropout
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang
dropout moe self-slimmable transformer
Last synced: 23 Aug 2025
https://github.com/facebookresearch/adatt
An open-source PyTorch library for the paper "AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations"
Last synced: 08 Apr 2025
https://github.com/cocowy1/smoe-stereo
[ICCV 2025] 🌟🌟🌟 Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts
Last synced: 23 Jul 2025
https://github.com/harry-chen/infmoe
Inference framework for MoE layers based on TensorRT with Python binding
Last synced: 27 Aug 2025
https://github.com/kyegomez/limoe
Implementation of the "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts"
ai artificial-intelligence machine-learning mixture-of-experts ml moe pytorch swarms tensorflow
Last synced: 14 Jul 2025
https://github.com/james-oldfield/mumoe
[NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
Last synced: 12 Apr 2025
https://github.com/haxpor/blockbunny
A libGDX-based game for Android, iOS, and PC, following the tutorial by ForeignGuyMike on his YouTube channel. Read more in README.md.
controller game kotlin kt libgdx mobile moe pc platformer
Last synced: 03 Jul 2025
https://github.com/moefe/moeui
A UI component library built with Vue.js (Moe is Justice!!!)
Last synced: 03 May 2025
https://github.com/moebits/moepictures
Moepictures is an anime image board organized by tags.
anime art booru cute image-board kawaii moe
Last synced: 26 Aug 2025
https://github.com/kyegomez/mhmoe
Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch
ai artificial-intelligence attention chicken machine-learning ml moe transformers
Last synced: 07 May 2025
https://github.com/kravetsone/enkanetwork
A Node.js wrapper for the enka.network API, written in TypeScript, providing localization, caching, and convenience features.
enka enkanetwork genshinapi genshinimpact gensinimpactapi moe nodejs shinshin typescript
Last synced: 11 Apr 2025
https://github.com/1834423612/moe-counter-php
A cute ("moe") web visitor counter, PHP + MySQL version
badge counter moe php vistor-counter
Last synced: 06 Apr 2025
https://github.com/opensparsellms/clip-moe
CLIP-MoE: Mixture of Experts for CLIP
clip lvlm mixture-of-experts moe openai-clip
Last synced: 15 Aug 2025
https://github.com/fuwn/mayu
⭐ Moe-Counter Compatible Website Hit Counter Written in Gleam
counter functional gleam moe moe-counter website
Last synced: 23 Apr 2025
https://github.com/sefinek/moecounter.js
Effective and efficient moe counters for your projects, designed to display a wide range of statistics for your website and more!
anime anime-counter counter cute cute-cat moe moecounter neko
Last synced: 15 Sep 2025
https://github.com/kingdido999/yandere-girl-bot
A telegram bot for yande.re girls.
bot girls javascript moe telegram-bot
Last synced: 13 Apr 2025
https://github.com/agora-lab-ai/hydranet
HydraNet is a state-of-the-art transformer architecture that combines Multi-Query Attention (MQA), Mixture of Experts (MoE), and continuous learning capabilities.
agora agoralabs attention attn lfms liquid-models moe transformers
Last synced: 14 Apr 2025
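As a rough illustration of the Multi-Query Attention component mentioned in the HydraNet description (not HydraNet's actual code), a single shared key/value head can be sketched as follows; all names and shapes are assumptions.

```python
# Minimal Multi-Query Attention (MQA) sketch: many query heads, one shared K/V head.
# Illustrative assumption, not taken from the HydraNet repository.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiQueryAttention(nn.Module):
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0
        self.h, self.dk = num_heads, d_model // num_heads
        self.q_proj = nn.Linear(d_model, d_model)   # one projection per query head
        self.k_proj = nn.Linear(d_model, self.dk)   # single shared key head
        self.v_proj = nn.Linear(d_model, self.dk)   # single shared value head
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.h, self.dk).transpose(1, 2)  # (b, h, t, dk)
        k = self.k_proj(x).unsqueeze(1)                                  # (b, 1, t, dk)
        v = self.v_proj(x).unsqueeze(1)                                  # (b, 1, t, dk)
        attn = F.softmax(q @ k.transpose(-2, -1) / self.dk ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)  # shared K/V broadcast over heads
        return self.out_proj(out)

x = torch.randn(2, 8, 64)
print(MultiQueryAttention(d_model=64, num_heads=4)(x).shape)  # torch.Size([2, 8, 64])
```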
https://github.com/louisbrulenaudet/mergekit
Tools for merging pretrained large language models and creating Mixture of Experts (MoE) models from open-source models.
dare-ties huggingface large-language-models leaderboard llm merge-llm mergekit mixture-of-experts moe slerp ties transformer
Last synced: 14 Jul 2025
https://github.com/scale-snu/layered-prefill
Layered prefill changes the scheduling axis from tokens to layers and removes redundant MoE weight reloads while keeping decode stall-free. The result is lower TTFT, lower end-to-end latency, and lower energy per token without hurting TBT stability.
inference llm llm-infernece llm-serving moe vllm
Last synced: 22 Nov 2025
https://github.com/cvyl/short.moe
Short.moe is a free URL shortener service that allows you to easily shorten long URLs into shorter, more manageable links.
moe public-service shortener shortener-url url-shortener
Last synced: 11 Apr 2025
https://github.com/calpa/atom-kancolle
Notifications using fleet girls' voices.
atom atom-package kancolle moe
Last synced: 11 Jul 2025
https://github.com/louisbrulenaudet/mergekit-assistant
Mergekit Assistant is a cutting-edge toolkit designed for the seamless merging of pre-trained language models. It supports an array of models, offers various merging methods, and optimizes for low-resource environments with both CPU and GPU compatibility.
ai chat dare-ties genai hugging-chat hugging-face llm merge mergekit moe slerp ties
Last synced: 29 Aug 2025
https://github.com/nanowell/ai-mix-of-experts-softwareengineering-automation
This collaborative framework is designed to harness the power of a Mixture of Experts (MoE) to automate a wide range of software engineering tasks, thereby enhancing code quality and expediting development processes.
ai automation gpt mistral mixture-of-experts moe
Last synced: 31 Jul 2025
https://github.com/naidezhujimo/yinghub-v2-a-sparse-moe-language-model
YingHub-v2 is an advanced language model built on the Sparse Mixture of Experts (MoE) architecture. It leverages dynamic routing and expert load balancing, and incorporates state-of-the-art training and optimization strategies.
Last synced: 28 Mar 2025
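The expert load balancing mentioned above is commonly implemented with a Switch-style auxiliary loss; the sketch below shows that generic recipe under assumed tensor shapes and is not taken from the YingHub repository.

```python
# Generic Switch-style load-balancing auxiliary loss for a sparse MoE router.
# This is a common recipe, not code from the YingHub repository.
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits: torch.Tensor, expert_idx: torch.Tensor) -> torch.Tensor:
    """router_logits: (tokens, num_experts); expert_idx: (tokens,) chosen expert per token."""
    num_experts = router_logits.size(-1)
    probs = F.softmax(router_logits, dim=-1)
    # fraction of tokens dispatched to each expert
    dispatch = F.one_hot(expert_idx, num_experts).float().mean(dim=0)
    # mean routing probability assigned to each expert
    importance = probs.mean(dim=0)
    # both vectors are pushed toward the uniform 1/num_experts distribution
    return num_experts * torch.sum(dispatch * importance)

logits = torch.randn(32, 8)
print(load_balancing_loss(logits, logits.argmax(dim=-1)).item())
```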
https://github.com/the-swarm-corporation/clustermoe
A novel neural network architecture that extends Mixture of Experts (MoE) with hierarchical expert clustering, dynamic tree-based routing, and advanced reliability tracking for improved scalability, specialization, and robustness.
ai attention llms moe pytorch pytorch-models transformers
Last synced: 27 Jul 2025
https://github.com/thc1006/youngfly
Open-source materials from the 2022 (ROC year 111) Ministry of Education Youth Development Administration "Young Fly" Global Action Plan team, project "語您童行 Tai-Gi" (Taiwanese Hokkien with children) | open data
moe open-source opensource pdf pptx pptx-files taigi youngfly youngflyaction
Last synced: 03 Oct 2025
https://github.com/naidezhujimo/sparse-moe-language-model-v1
This repository contains an implementation of a Sparse Mixture of Experts (MoE) Language Model using PyTorch. The model is designed to handle large-scale text generation tasks efficiently by leveraging multiple expert networks and a routing mechanism to dynamically select the most relevant experts for each input.
Last synced: 25 Oct 2025
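The dynamic expert selection described above typically boils down to top-k gating; the generic sketch below illustrates that step with assumed names and hyper-parameters, not this repository's code.

```python
# Generic top-k gating sketch for a sparse MoE layer (illustrative, not this repo's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKGate(nn.Module):
    def __init__(self, d_model: int, num_experts: int, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor):
        # x: (tokens, d_model)
        logits = self.router(x)
        topk_logits, topk_idx = logits.topk(self.k, dim=-1)  # keep the k highest-scoring experts
        gates = F.softmax(topk_logits, dim=-1)                # renormalize over the selected experts
        return gates, topk_idx                                # (tokens, k) weights and expert indices

gate = TopKGate(d_model=64, num_experts=8, k=2)
weights, experts = gate(torch.randn(16, 64))
print(weights.shape, experts.shape)  # torch.Size([16, 2]) torch.Size([16, 2])
```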
https://github.com/rabiloo/llm-finetuning
Sample for Fine-Tuning LLMs & VLMs
fine-tuning grpo large-language-models llama-factory llama3 llm lora moe perf qlora qwen rlhf transformers verl
Last synced: 03 Apr 2025
https://github.com/cheapnightbot/moe
The only Discord bot you need (I guess...) ~ moe moe kyun ♡ (⸝⸝> ᴗ•⸝⸝)
discord discord-bot discord-py moe python3
Last synced: 09 Apr 2025
https://github.com/amdjadouxx/mini_moe_demo
An experiment with the MoE concept and a clear visualization of what's going on
artificial-neural-networks ia mixture-of-experts moe
Last synced: 29 Aug 2025
https://github.com/lethanhvinh0604/Guess-Moe-Number
A simple game written in JavaScript
Last synced: 23 Aug 2025
https://github.com/kyokenn/moe-theme-flavour-matorico.el
Emacs - Moe theme (dark) - Matorico flavour
Last synced: 03 Sep 2025
https://github.com/fareedkhan-dev/train-llama4
Building LLaMA 4 MoE from Scratch
llama4 llm meta moe openai python transformer
Last synced: 03 Aug 2025
https://github.com/vinsmokesomya/mixture-of-idiotic-experts
🧠✍️🎭 Mixture of Idiotic Experts: A PyTorch-based Sparse Mixture of Experts (MoE) model for generating Shakespeare-like text, character by character. Inspired by Andrej Karpathy's makemore.
character-level-lm deep-learning generative-ai mixture-of-experts moe nlp python pytorch
Last synced: 07 Jul 2025
https://github.com/nimblehq/nimbl3-moe
A demo application using Multi-OS Engine (MOE)
android cross ios moe multi-os-engine platform
Last synced: 11 Nov 2025
https://github.com/simosebak/japanese-vocabulary-tracker
📚 Track and manage your Japanese vocabulary effortlessly across JLPT levels with this intuitive web-based application.
anki furigana ichi ichimoe japanese japanese-language japanese-vocabulary-tracker javascript jlpt-n3 jlpt-n5 jmdict language-learning language-learning-tool manga moe progress-tracking visual-novel vocabulary-learning
Last synced: 10 Oct 2025
https://github.com/voyager466920/koraptor
🚀 A 150M-parameter language model with a latent MoE architecture, built from scratch on a SINGLE GPU.
mixture-of-experts moe small-language-model
Last synced: 19 Oct 2025