Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).

Updated Jun 18, 2026
Python

NVIDIA / TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

cuda pytorch moe blackwell llm-serving

Updated Jun 18, 2026
Python

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

gpu cuda jit pytorch nvidia moe attention llm-inference large-large-models distributed-inference

Updated Jun 18, 2026
Python

czy0729 / Bangumi

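An unofficial, UI-first https://bgm.tv app client for Android and iOS, built with React Native. An ad-free, hobby-driven, non-profit third-party bgm.tv client for tracking anime, similar to Douban but dedicated to ACG. Redesigned for mobile, it builds in many enhanced features that are hard to implement on the web version and offers extensive customization options. Currently supports iOS and Android.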

react android ios design react-native mobx ios-app moe bangumi android-app expo

Updated Jun 17, 2026
TypeScript

zai-org / GLM-4.5

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

agent moe glm llm reasoning-language-models

Updated Feb 1, 2026
Python

PKU-YuanGroup / MoE-LLaVA

【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models

moe multi-modal mixture-of-experts large-vision-language-model

Updated Jul 15, 2025
Python

MoonshotAI / MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

pytorch transformer moe llm llm-serving llm-training flash-attention

Updated Apr 3, 2025
Python

uccl-project / uccl

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

ai networking hpc amd gpu collective cuda p2p nvidia broadcom moe rdma allreduce llm kvcache

Updated Jun 15, 2026
C++

davidmrau / mixture-of-experts

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

pytorch moe re-implementation mixture-of-experts sparsely-gated-mixture-of-experts

Updated Apr 19, 2024
Python

pjlab-sys4nlp / llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

moe llama mixture-of-experts llm continual-pre-training expert-partition

Updated Dec 6, 2024
Python

microsoft / Tutel

Tutel MoE: Optimized Mixture-of-Experts Library, Support GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4

pytorch moe mixture-of-experts llm deepseek

Updated Jun 18, 2026
C

NVIDIA / cudnn-frontend

cuDNN Frontend is NVIDIA's modern, open-source entry point to the cuDNN library and a growing collection of high-performance open-source kernels.

Updated Jun 17, 2026
Python

sail-sg / Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

Updated Jun 8, 2025
Python

ScienceOne-AI / DeepSeek-671B-SFT-Guide

An open-source solution for full-parameter fine-tuning of the full-size DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as practical experience and conclusions gathered along the way.

python moe sft llm deepseek-r1

Updated Mar 13, 2025
Python

open-compass / MixtralKit

A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI

moe mistral llm

Updated Dec 15, 2023
Python

SharpAI / SwiftLM

⚡ Native MLX Swift LLM inference server for Apple Silicon. OpenAI-compatible API, SSD streaming for 100B+ MoE models, TurboQuant KV cache compression, macOS + iOS iPhone app.

swift ios metal inference moe mlx on-device-ai openai-api llm apple-sili

Updated May 19, 2026
Swift

ymcui / Chinese-Mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

nlp moe 64k mixture-of-experts 32k large-language-models llm mixtral

Updated Apr 19, 2026
Python

moe

Here are 418 public repositories matching this topic...

vllm-project / vllm

hiyouga / LlamaFactory

sgl-project / sglang

modelscope / ms-swift
