Xiaomi introduces MiMo-7B, an open-source language model that, with only 7 billion parameters, outperforms larger models in complex mathematical reasoning and programming tasks.
The Xiaomi LLM-Core team has developed MiMo-7B, a model designed to solve complex reasoning problems. Its compact size contrasts with the current trend toward ever-larger models and suggests that an efficient architecture trained on well-selected data can achieve strong results with fewer resources.
MiMo-7B's training is divided into two phases: pretraining and post-training. Pretraining used 25 trillion tokens, with an emphasis on content rich in logical and mathematical structure, such as technical texts and academic books. The team implemented a three-stage data mixing system to increase the density of reasoning patterns.
In the post-training phase, the model was fine-tuned using reinforcement learning techniques with 130,000 mathematics and programming problems. A reward scheme based on test difficulty was implemented to improve training quality.
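The article does not spell out how the difficulty-based reward is computed. As a rough illustration only, the sketch below assumes each test case carries a difficulty weight and that the reward is the weighted fraction of tests passed. The function name `difficulty_weighted_reward` and the weights are hypothetical and are not taken from Xiaomi's code.

```python
from typing import Sequence


def difficulty_weighted_reward(
    passed: Sequence[bool], difficulty: Sequence[float]
) -> float:
    """Return the difficulty-weighted fraction of tests passed, in [0, 1].

    Assumption (not from the source): every test has a positive weight that
    grows with its difficulty, and harder tests contribute more reward.
    """
    if len(passed) != len(difficulty):
        raise ValueError("passed and difficulty must have the same length")
    total = sum(difficulty)
    if total <= 0:
        return 0.0
    # Only the weights of the tests that passed count toward the reward.
    earned = sum(w for ok, w in zip(passed, difficulty) if ok)
    return earned / total


# A solution that passes the two easy tests but fails the hard one
# still gets partial credit instead of a flat zero.
reward = difficulty_weighted_reward([True, True, False], [1.0, 2.0, 5.0])
print(reward)
```

Compared with an all-or-nothing reward, a scheme of this kind gives the policy a denser training signal on hard problems, where passing every test at once is rare.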
In evaluations, MiMo-7B outperformed OpenAI's o1-mini in code generation, scoring 57.8% on LiveCodeBench v5 and 49.3% on v6. In mathematical reasoning, it reached 55.4% on AIME 2025, surpassing larger commercial models by more than 4 points.
The model also demonstrates competence in long context comprehension and general language tasks. This combination of specialization and versatility suggests potential applications in education and software development.
Xiaomi has published the model's checkpoints on GitHub as open source, making it easy for researchers and developers to experiment with the technology.
This development points to an alternative path for building AI models, in which efficient design and training can compensate for a smaller parameter count and deliver significant advances without relying exclusively on large-scale models.