Projects in Awesome Lists tagged with moe

https://github.com/hiyouga/llama-factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

agent ai chatglm fine-tuning gpt instruction-tuning language-model large-language-models llama llama3 llm lora mistral moe peft qlora quantization qwen rlhf transformers

Last synced: 02 Jan 2026

https://github.com/hiyouga/LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

agent ai chatglm fine-tuning gpt instruction-tuning language-model large-language-models llama llama3 llm lora mistral moe peft qlora quantization qwen rlhf transformers

Last synced: 14 Mar 2025

https://github.com/sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.

cuda deepseek deepseek-llm deepseek-r1 deepseek-r1-zero deepseek-v3 inference llama llama3 llama3-1 llava llm llm-serving moe pytorch transformer vlm

Last synced: 12 May 2025

https://github.com/czy0729/Bangumi

:electron: An unofficial https://bgm.tv ui first app client for Android and iOS, built with React Native. 一个无广告、以爱好为驱动、不以盈利为目的、专门做 ACG 的类似豆瓣的追番记录，bgm.tv 第三方客户端。为移动端重新设计，内置大量加强的网页端难以实现的功能，且提供了相当的自定义选项。目前已适配 iOS / Android / WSA、mobile / 简单 pad、light / dark theme、移动端网页。

android android-app bangumi design expo ios ios-app mobx moe react react-native

Last synced: 13 Apr 2025

https://github.com/czy0729/bangumi

:electron: An unofficial https://bgm.tv ui first app client for Android and iOS, built with React Native. 一个无广告、以爱好为驱动、不以盈利为目的、专门做 ACG 的类似豆瓣的追番记录，bgm.tv 第三方客户端。为移动端重新设计，内置大量加强的网页端难以实现的功能，且提供了相当的自定义选项。目前已适配 iOS / Android / WSA、mobile / 简单 pad、light / dark theme、移动端网页。

android android-app bangumi design expo ios ios-app mobx moe react react-native

Last synced: 13 May 2025

https://github.com/pku-yuangroup/moe-llava

Mixture-of-Experts for Large Vision-Language Models

large-vision-language-model mixture-of-experts moe multi-modal

Last synced: 14 May 2025

https://github.com/PKU-YuanGroup/MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

large-vision-language-model mixture-of-experts moe multi-modal

Last synced: 16 Mar 2025

https://github.com/moonshotai/moba

MoBA: Mixture of Block Attention for Long-Context LLMs

flash-attention llm llm-serving llm-training moe pytorch transformer

Last synced: 14 May 2025

https://github.com/MoonshotAI/MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

flash-attention llm llm-serving llm-training moe pytorch transformer

Last synced: 31 Mar 2025

https://github.com/pjlab-sys4nlp/llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

continual-pre-training expert-partition llama llm mixture-of-experts moe

Last synced: 09 May 2025

https://github.com/microsoft/tutel

Tutel MoE: Optimized Mixture-of-Experts Library, Support DeepSeek FP8/FP4

deepseek llm mixture-of-experts moe pytorch

Last synced: 15 May 2025

https://github.com/sail-sg/adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

adan artificial-intelligence bert-model convnext cuda-programming deep-learning diffusion dreamfusion fairseq gpt2 llm-training llms mae moe optimizer pytorch resnet timm transformer-xl vit

Last synced: 07 Jul 2025

https://github.com/microsoft/Tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation

deepseek deepseek-r1 llm mixture-of-experts moe pytorch

Last synced: 29 Mar 2025

https://github.com/open-compass/mixtralkit

A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI

llm mistral moe

Last synced: 12 Apr 2025

https://github.com/open-compass/MixtralKit

A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI

llm mistral moe

Last synced: 12 Apr 2025

https://github.com/sail-sg/Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

adan artificial-intelligence bert-model convnext cuda-programming deep-learning diffusion dreamfusion fairseq gpt2 llm-training llms mae moe optimizer pytorch resnet timm transformer-xl vit

Last synced: 05 Apr 2025

https://github.com/ymcui/chinese-mixtral

中文Mixtral混合专家大模型（Chinese Mixtral MoE LLMs）

32k 64k large-language-models llm mixtral mixture-of-experts moe nlp

Last synced: 04 Apr 2025

https://github.com/ymcui/Chinese-Mixtral

中文Mixtral混合专家大模型（Chinese Mixtral MoE LLMs）

32k 64k large-language-models llm mixtral mixture-of-experts moe nlp

Last synced: 24 Mar 2025

https://github.com/mindspore-courses/step_into_llm

MindSpore online courses: Step into LLM

bert chatglm chatglm2 chatgpt codegeex gpt gpt2 instruction-tuning large-language-models llama llama2 llm mindspore moe natural-language-processing nlp parallel-computing peft prompt-tuning rlhf

Last synced: 15 May 2025

https://github.com/kokororin/pixiv.moe

😘 A pinterest-style layout site, shows illusts on pixiv.net order by popularity.

comic comics illust illusts lovelive moe pixiv react redux typescript webapp website

Last synced: 05 Apr 2025

https://github.com/inferflow/inferflow

Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).

baichuan2 bloom deepseek falcon gemma internlm llama2 llamacpp llm-inference m2m100 minicpm mistral mixtral mixture-of-experts model-quantization moe multi-gpu-inference phi-2 qwen

Last synced: 07 Apr 2025

https://github.com/skyworkai/moh

MoH: Multi-Head Attention as Mixture-of-Head Attention

attention dit llms mixture-of-experts moe transformer vit

Last synced: 04 Apr 2025

https://github.com/skyworkai/moe-plus-plus

[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts

large-language-models llms mixture-of-experts moe

Last synced: 21 Jun 2025

https://github.com/libgdx/gdx-pay

A libGDX cross-platform API for InApp purchasing.

android gdx-pay iap in-app-purchase ios java libgdx moe multi-os-engine robovm

Last synced: 01 Apr 2025

https://github.com/kyegomez/switchtransformers

Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"

ai gpt4 llama mixture-model mixture-of-experts mixture-of-models ml moe multi-modal

Last synced: 09 Oct 2025

https://github.com/shalldie/chuncai

A lovely Page Wizard, is responsible for selling moe.

chuncai moe

Last synced: 16 Mar 2025

https://github.com/kyegomez/moe-mamba

Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Zeta

ai ml moe multi-modal-fusion multi-modality swarms

Last synced: 07 May 2025

https://github.com/simplifine-llm/Simplifine

🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨

ai cloud fine-tuning fine-tuning-llm finetuning-llms gpt instruction-tuning large-language-models llama llama3 llm llm-training lora mistral moe open-source peft phi qwen

Last synced: 29 Jul 2025

https://github.com/Simplifine-gamedev/Simplifine

🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨

ai cloud fine-tuning fine-tuning-llm finetuning-llms gpt instruction-tuning large-language-models llama llama3 llm llm-training lora mistral moe open-source peft phi qwen

Last synced: 31 Oct 2025

https://github.com/opensparsellms/llama-moe-v2

🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

attention fine-tuning instruction-tuning llama llama3 mixture-of-experts moe sft sparsity

Last synced: 11 Aug 2025

https://github.com/xrsrke/pipegoose

Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*

3d-parallelism data-parallelism distributed-optimizers huggingface-transformers large-scale-language-modeling megatron megatron-lm mixture-of-experts model-parallelism moe pipeline-parallelism sequence-parallelism tensor-parallelism transformers zero-1

Last synced: 10 Jul 2025

https://github.com/LINs-lab/DynMoE

[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

adaptive-computation language-model mixture-of-experts moe multimodal-large-language-models

Last synced: 11 May 2025

https://github.com/vita-group/random-moe-as-dropout

[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang

dropout moe self-slimmable transformer

Last synced: 23 Aug 2025

https://github.com/facebookresearch/adatt

pytorch open-source library for the paper "AdaTT Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations"

moe mtl multitask-learning

Last synced: 08 Apr 2025

https://github.com/cocowy1/smoe-stereo

[ICCV 2025] 🌟🌟🌟 Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts

depth iccv moe

Last synced: 23 Jul 2025

https://github.com/harry-chen/infmoe

Inference framework for MoE layers based on TensorRT with Python binding

inference moe tensorrt

Last synced: 27 Aug 2025

https://github.com/kyegomez/limoe

Implementation of the "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts"

ai artificial-intelligence machine-learning mixture-of-experts ml moe pytorch swarms tensorflow

Last synced: 14 Jul 2025

https://github.com/james-oldfield/mumoe

[NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization

mixture-of-experts moe

Last synced: 12 Apr 2025

https://github.com/haxpor/blockbunny

Libgdx-based game for Android, iOS, and PC following the tutorial by ForeignGuyMike on youtube channel. Read more on README.md

controller game kotlin kt libgdx mobile moe pc platformer

Last synced: 03 Jul 2025

https://github.com/moefe/moeui

UI components Library with Vue.js (Moe is Justice!!!)

moe sass vue

Last synced: 03 May 2025

https://github.com/moebits/moepictures

Moepictures is an anime image board organized by tags.

anime art booru cute image-board kawaii moe

Last synced: 26 Aug 2025

https://github.com/Moebits/Moepictures

Moepictures is an anime image board organized by tags.

anime art booru cute image-board kawaii moe

Last synced: 08 Apr 2025

https://github.com/kyegomez/mhmoe

Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch

ai artificial-intelligence attention chicken machine-learning ml moe transformers

Last synced: 07 May 2025

https://github.com/kravetsone/enkanetwork

Node JS enka.network API wrapper written on TypeScript which provides localization, caching and convenience.

enka enkanetwork genshinapi genshinimpact gensinimpactapi moe nodejs shinshin typescript

Last synced: 11 Apr 2025

https://github.com/1834423612/moe-counter-php

萌萌的网页访客计数器 PHP + Mysql版

badge counter moe php vistor-counter

Last synced: 06 Apr 2025

https://github.com/opensparsellms/clip-moe

CLIP-MoE: Mixture of Experts for CLIP

clip lvlm mixture-of-experts moe openai-clip

Last synced: 15 Aug 2025

https://github.com/tobychui/weather-pet-display

A simple weather display with a cute interactive desktop pet (❛◡❛✿)

anime arduino cute display diy esp8266 maker moe pet uart-hmi weather

Last synced: 24 Apr 2025

https://github.com/fuwn/mayu

⭐ Moe-Counter Compatible Website Hit Counter Written in Gleam

counter functional gleam moe moe-counter website

Last synced: 23 Apr 2025

https://github.com/sefinek/moecounter.js

The most effective and efficient moecounters for your projects, designed to display a wide range of statistics for your website and more!

anime anime-counter counter cute cute-cat moe moecounter neko

Last synced: 15 Sep 2025

https://github.com/kingdido999/yandere-girl-bot

A telegram bot for yande.re girls.

bot girls javascript moe telegram-bot

Last synced: 13 Apr 2025

https://github.com/agora-lab-ai/hydranet

HydraNet is a state-of-the-art transformer architecture that combines Multi-Query Attention (MQA), Mixture of Experts (MoE), and continuous learning capabilities.

agora agoralabs attention attn lfms liquid-models moe transformers

Last synced: 14 Apr 2025

https://github.com/louisbrulenaudet/mergekit

Tools for merging pretrained Large Language Models and create Mixture of Experts (MoE) from open-source models.

dare-ties huggingface large-language-models leaderboard llm merge-llm mergekit mixture-of-experts moe slerp ties transformer

Last synced: 14 Jul 2025

https://github.com/scale-snu/layered-prefill

Layered prefill changes the scheduling axis from tokens to layers and removes redundant MoE weight reloads while keeping decode stall free. The result is lower TTFT, lower end-to-end latency, and lower energy per token without hurting TBT stability.

inference llm llm-infernece llm-serving moe vllm

Last synced: 22 Nov 2025

https://github.com/cvyl/short.moe

Short.moe is a free URL shortener service that allows you to easily shorten long URLs into shorter, more manageable links.

moe public-service shortener shortener-url url-shortener

Last synced: 11 Apr 2025

https://github.com/calpa/atom-kancolle

Notification using fleet girls' voice.

atom atom-package kancolle moe

Last synced: 11 Jul 2025

https://github.com/louisbrulenaudet/mergekit-assistant

Mergekit Assistant is a cutting-edge toolkit designed for the seamless merging of pre-trained language models. It supports an array of models, offers various merging methods, and optimizes for low-resource environments with both CPU and GPU compatibility.

ai chat dare-ties genai hugging-chat hugging-face llm merge mergekit moe slerp ties

Last synced: 29 Aug 2025

https://github.com/nanowell/ai-mix-of-experts-softwareengineering-automation

This collaborative framework is designed to harness the power of a Mixture of Experts (MoE) to automate a wide range of software engineering tasks, thereby enhancing code quality and expediting development processes.

ai automation gpt mistral mixture-of-experts moe

Last synced: 31 Jul 2025

https://github.com/naidezhujimo/yinghub-v2-a-sparse-moe-language-model

YingHub-v2 is an advanced language model built upon the Sparse Mixture of Experts (MoE) architecture. It leverages dynamic routing mechanisms, expert load balancing.incorporating state-of-the-art training and optimization strategies.

llm moe nlp pytorch

Last synced: 28 Mar 2025

https://github.com/the-swarm-corporation/clustermoe

A novel neural network architecture that extends Mixture of Experts (MoE) with hierarchical expert clustering, dynamic tree-based routing, and advanced reliability tracking for improved scalability, specialization, and robustness.

ai attention llms moe pytorch pytorch-models transformers

Last synced: 27 Jul 2025

https://github.com/thc1006/youngfly

111年教育部青年署 Young飛全球行動計畫團隊 -語您童行Tai-Gi | 開源資料

moe open-source opensource pdf pptx pptx-files taigi youngfly youngflyaction

Last synced: 03 Oct 2025

https://github.com/naidezhujimo/sparse-moe-language-model-v1

This repository contains an implementation of a Sparse Mixture of Experts (MoE) Language Model using PyTorch. The model is designed to handle large-scale text generation tasks efficiently by leveraging multiple expert networks and a routing mechanism to dynamically select the most relevant experts for each input.

moe nlp pytorch transformer

Last synced: 25 Oct 2025

https://github.com/rabiloo/llm-finetuning

Sample for Fine-Tuning LLMs & VLMs

fine-tuning grpo large-language-models llama-factory llama3 llm lora moe perf qlora qwen rlhf transformers verl

Last synced: 03 Apr 2025

https://github.com/cheapnightbot/moe

Only Discord Bot You Need (i guess...) ~ 萌え萌えキュン ♡ (⸝⸝> ᴗ•⸝⸝)

discord discord-bot discord-py moe python3

Last synced: 09 Apr 2025

https://github.com/amdjadouxx/mini_moe_demo

experimentation of the MOE concept and clear visualisation of what'ss going on

artificial-neural-networks ia mixture-of-experts moe

Last synced: 29 Aug 2025

https://github.com/lethanhvinh0604/Guess-Moe-Number

Game đơn giản với JavaScript

css game html javascript moe

Last synced: 23 Aug 2025

https://github.com/kyokenn/moe-theme-flavour-matorico.el

Emacs - Moe theme (dark) - Matorico flavour

emacs emacs-theme moe

Last synced: 03 Sep 2025

https://github.com/fareedkhan-dev/train-llama4

Building LLaMA 4 MoE from Scratch

llama4 llm meta moe openai python transformer

Last synced: 03 Aug 2025

https://github.com/vinsmokesomya/mixture-of-idiotic-experts

🧠✍️🎭 Mixture of Idiotic Experts: A PyTorch-based Sparse Mixture of Experts (MoE) model for generating Shakespeare-like text, character by character. Inspired by Andrej Karpathy's makemore.

character-level-lm deep-learning generative-ai mixture-of-experts moe nlp python pytorch