Meta has released MobileLLM-R1, a family of lightweight edge reasoning models now available on Hugging Face. The release includes models ranging from 140M to 950M parameters, with a focus on efficient mathematical, coding, and scientific reasoning at sub-billion scale.
Unlike general-purpose chat models, MobileLLM-R1 is designed for edge deployment, aiming to deliver state-of-the-art reasoning accuracy while remaining computationally efficient.
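For readers who want to try the release, the checkpoints should load through the standard Hugging Face transformers API. Below is a minimal inference sketch; the repo id facebook/MobileLLM-R1-950M is assumed from the release naming and should be verified against the actual Hugging Face listing.

```python
# Minimal inference sketch with Hugging Face transformers.
# The repo id below is an assumption based on the release naming; check the
# actual Hugging Face listing before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/MobileLLM-R1-950M"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "Solve step by step: what is 17 * 23?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```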


What architecture powers MobileLLM-R1?
The largest model, MobileLLM-R1-950M, integrates several architectural optimizations:
- 22 Transformer layers with 24 attention heads and 6 grouped KV heads.
- Embedding dimension: 1536; hidden dimension: 6144.
- Grouped-Query Attention (GQA) reduces compute and memory (see the sketch after this list).
- Block-wise weight sharing cuts parameter count without heavy latency penalties.
- SwiGLU activations improve small-model representation.
- Context length: 4K for base models, 32K for post-trained models.
- 128K vocabulary with shared input/output embeddings.
The emphasis is on reducing compute and memory requirements, making the model suitable for deployment on constrained devices.
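To make the GQA point concrete, here is a minimal PyTorch sketch (illustrative, not code from the release) of how 6 KV heads can serve 24 query heads: each K/V head is shared by a group of 4 query heads, which shrinks the K/V tensors, and hence the KV cache, by 4× relative to standard multi-head attention. The dimensions are taken from the list above; everything else, including the omission of a causal mask, is for brevity.

```python
import torch

# Dimensions quoted above for MobileLLM-R1-950M:
# 24 query heads, 6 KV heads, embedding dim 1536 -> head dim 64.
n_q_heads, n_kv_heads, d_model = 24, 6, 1536
head_dim = d_model // n_q_heads  # 64
batch, seq_len = 1, 128

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)  # 4x smaller than q along heads
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Grouped-Query Attention: each KV head is shared by 24 / 6 = 4 query heads,
# so K and V are repeated along the head axis before standard attention.
group = n_q_heads // n_kv_heads
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)

attn = torch.softmax(q @ k.transpose(-2, -1) / head_dim**0.5, dim=-1)
out = attn @ v
print(out.shape)  # (1, 24, 128, 64)
```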
How efficient is the training?
MobileLLM-R1 is notable for its data efficiency:
- Trained on ~4.2T tokens in total.
- By comparison, Qwen3’s 0.6B model was trained on 36T tokens.
- This means MobileLLM-R1 uses only ≈11.7% of that data to match or surpass Qwen3’s accuracy (a quick check follows this list).
- Post-training applies supervised fine-tuning on math, coding, and reasoning datasets.
This efficiency translates directly into lower training costs and resource demands.
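The data-efficiency figures follow directly from the two token counts quoted above; a quick check using only those numbers:

```python
# Token budgets quoted above, in trillions of training tokens.
mobilellm_r1_tokens = 4.2
qwen3_0_6b_tokens = 36.0

print(f"Fraction of Qwen3-0.6B's data: {mobilellm_r1_tokens / qwen3_0_6b_tokens:.1%}")  # ~11.7%
print(f"Token reduction factor: {qwen3_0_6b_tokens / mobilellm_r1_tokens:.1f}x")        # ~8.6x
```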
How does it perform against other open models?
On benchmarks, MobileLLM-R1-950M shows significant gains:
- MATH (MATH500 dataset): ~5× higher accuracy than OLMo-1.24B and ~2× higher accuracy than SmolLM2-1.7B.
- Reasoning and coding (GSM8K, AIME, LiveCodeBench): matches or surpasses Qwen3-0.6B, despite using far fewer training tokens.
The model delivers results typically associated with larger architectures while maintaining a smaller footprint.
Where does MobileLLM-R1 fall short?
The model’s focus creates limitations:
- Strong in math, code, and structured reasoning.
- Weaker in general conversation, commonsense, and creative tasks compared to larger LLMs.
- Distributed under the FAIR NC (non-commercial) license, which restricts use in production settings.
- Longer contexts (32K) raise KV-cache and memory demands at inference (a rough estimate follows below).
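As a back-of-the-envelope illustration of that last point, the KV cache grows linearly with context length. Using the 950M configuration quoted earlier (22 layers, 6 KV heads, head dim 64) and assuming fp16 cache entries, a sketch:

```python
# Rough KV-cache size for MobileLLM-R1-950M's configuration.
# Layer/head counts come from this article; fp16 storage is an assumption.
layers, kv_heads, head_dim, bytes_per_value = 22, 6, 64, 2

def kv_cache_bytes(context_len: int) -> int:
    # Two cached tensors (K and V) per layer, each context_len x kv_heads x head_dim.
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_value

for ctx in (4_096, 32_768):
    print(f"{ctx:>6} tokens -> {kv_cache_bytes(ctx) / 2**20:.0f} MiB")
# ~132 MiB per sequence at 4K context vs ~1056 MiB at 32K.
```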
How does MobileLLM-R1 compare to Qwen3, SmolLM2, and OLMo?
Performance snapshot (post-trained models):
| Model | Params | Train tokens (T) | MATH500 | GSM8K | AIME’24 | AIME’25 | LiveCodeBench |
|---|---|---|---|---|---|---|---|
| MobileLLM-R1-950M | 0.949B | 4.2 | 74.0 | 67.5 | 15.5 | 16.3 | 19.9 |
| Qwen3-0.6B | 0.596B | 36.0 | 73.0 | 79.2 | 11.3 | 17.0 | 14.9 |
| SmolLM2-1.7B-Instruct | 1.71B | ~11.0 | 19.2 | 41.8 | 0.3 | 0.1 | 4.4 |
| OLMo-2-1B-Instruct | 1.48B | ~3.95 | 19.2 | 69.7 | 0.6 | 0.1 | 0.0 |
Key observations:
- R1-950M matches Qwen3-0.6B on MATH500 (74.0 vs 73.0) while requiring ~8.6× fewer training tokens.
- Performance gaps versus SmolLM2 and OLMo are substantial across reasoning tasks.
- Qwen3 maintains an edge on GSM8K, but the difference is small compared to the training-efficiency advantage.
Summary
Meta’s MobileLLM-R1 underscores a trend toward smaller, domain-optimized models that deliver competitive reasoning without massive training budgets. By achieving 2×–5× performance gains over larger open models while training on a fraction of the data, it demonstrates that efficiency, not just scale, will define the next phase of LLM deployment, especially for math, coding, and scientific use cases on edge devices.
Check out the Model on Hugging Face. Feel free to check out our GitHub Page for Tutorials, Codes and Notebooks. Also, feel free to follow us on Twitter and don’t forget to join our 100k+ ML SubReddit and Subscribe to our Newsletter.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.