    A Library for comprehensive benchmarking Mixture of Experts in Large Language Models

By Awais · February 12, 2026 · 2 Mins Read

    [Submitted on 1 Nov 2024 (v1), last revised 10 Feb 2026 (this version, v4)]

LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models, by Nam V. Nguyen and 4 other authors


Abstract: Mixture of experts (MoE) architectures have become a cornerstone for scaling up large language models and are a key component in models such as GPT-OSS, DeepSeek-V3, Llama-4, and Gemini-2.5. However, systematic research on MoE remains severely constrained by the prohibitive computational costs of training and evaluation, leaving large-scale studies out of reach for most researchers. We introduce LibMoE, a unified framework for reproducible, efficient, and extensible MoE research that supports both pretraining and sparse-upcycling regimes. Beyond unified implementations, the framework provides transparent analytical tools for probing routing and expert dynamics. Leveraging this foundation, we conduct a comprehensive analysis along three dimensions: (i) routing dynamics, covering expert selection patterns, routing stability and optimality, and how routing entropy reveals task specialization and expert diversity; (ii) the effect of lightweight initialization on load balancing, demonstrating how subtle changes in router initialization shape early expert utilization; and (iii) training regime differences, revealing how sparse upcycling and full pretraining exhibit distinct routing patterns and stability profiles. By lowering the barrier to entry and standardizing evaluation, along with our comprehensive analysis, LibMoE broadens access to MoE research and establishes a reliable benchmark to guide future innovations. GitHub: this https URL.
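The routing quantities the abstract analyzes (expert selection, routing entropy, expert load) are standard MoE diagnostics. As a rough illustration, here is a minimal sketch of a softmax top-k router with entropy and load measurements. This is a generic textbook formulation, not LibMoE's actual API; the names `TopKRouter`, `routing_entropy`, and `expert_load` are illustrative assumptions.

```python
# Generic sketch of top-k MoE routing diagnostics (NOT LibMoE's API).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Standard softmax top-k gate: each token is dispatched to k experts."""
    def __init__(self, d_model: int, n_experts: int, k: int = 2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        self.k = k

    def forward(self, x: torch.Tensor):
        # x: (tokens, d_model) -> routing probs: (tokens, n_experts)
        probs = F.softmax(self.gate(x), dim=-1)
        topk_vals, topk_idx = probs.topk(self.k, dim=-1)
        return probs, topk_idx

def routing_entropy(probs: torch.Tensor) -> torch.Tensor:
    """Mean per-token entropy of the routing distribution.
    Low entropy suggests confident, specialized routing; high entropy
    suggests diffuse routing across many experts."""
    return -(probs * probs.clamp_min(1e-9).log()).sum(-1).mean()

def expert_load(topk_idx: torch.Tensor, n_experts: int) -> torch.Tensor:
    """Fraction of top-k assignments received by each expert,
    a simple load-balancing diagnostic (uniform = balanced)."""
    counts = torch.bincount(topk_idx.flatten(), minlength=n_experts)
    return counts.float() / counts.sum()

if __name__ == "__main__":
    torch.manual_seed(0)
    router = TopKRouter(d_model=64, n_experts=8, k=2)
    x = torch.randn(1024, 64)  # a batch of token representations
    probs, topk_idx = router(x)
    print("routing entropy:", routing_entropy(probs).item())
    print("expert load:", expert_load(topk_idx, 8))
```

Tracking these two statistics over training steps is one simple way to observe the routing-stability and load-balancing effects the paper studies, e.g., how router initialization shapes early expert utilization.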

    Submission history

From: Nam Nguyen
[v1] Fri, 1 Nov 2024 14:04:36 UTC (927 KB)
[v2] Fri, 31 Oct 2025 08:05:58 UTC (1,034 KB)
[v3] Thu, 5 Feb 2026 10:16:56 UTC (1,025 KB)
[v4] Tue, 10 Feb 2026 18:09:04 UTC (1,025 KB)
