Meta launches Llama 4 Scout and Llama 4 Maverick, its first multimodal AI models built on a mixture-of-experts architecture. The models outperform GPT-4o and Gemini 2.0 Flash on various benchmarks, and Meta has also previewed Llama 4 Behemoth, a model with nearly two trillion total parameters.
Meta has announced the launch of Llama 4, a new generation of artificial intelligence models that marks the beginning of a new era for the Llama ecosystem. The first two available models, Llama 4 Scout and Llama 4 Maverick, are the first open-source multimodal models with a mixture-of-experts (MoE) architecture and offer unprecedented capabilities in text and image comprehension.
Llama 4 Scout, with 17 billion active parameters and 16 experts, positions itself as the best multimodal model in its class, outperforming Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a wide range of benchmarks. Its most notable feature is a context window of 10 million tokens, the largest in the industry, allowing it to process and reason about extensive documents, complete codebases, or multiple information sources.
Meanwhile, Llama 4 Maverick, also with 17 billion active parameters but with 128 experts, outperforms GPT-4o and Gemini 2.0 Flash in multiple benchmark evaluations. It achieves results comparable to DeepSeek v3 in reasoning and programming with less than half the active parameters. On the LMArena platform, Maverick's experimental chat version has achieved an Elo score of 1417.
Meta has also revealed information about Llama 4 Behemoth, a model with 288 billion active parameters, 16 experts, and nearly two trillion total parameters, which has served as a "teacher" for the smaller models. According to the company, Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on various STEM (science, technology, engineering, and mathematics) benchmarks, though it is still in the training phase.
The MoE architecture used in these models allows for greater computational efficiency, as each token activates only a fraction of the total parameters. For example, Llama 4 Maverick has 400 billion total parameters but uses only 17 billion of them for any given token during inference, which significantly reduces serving costs and latency.
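To make the idea of sparse activation concrete, here is a minimal Python sketch of a mixture-of-experts layer. It is an illustration only, not Meta's implementation: the function names (`route_top_k`, `moe_layer`), the toy experts, and the choice of top-1 routing are all assumptions made for this example. Only the figures of 128 experts, 17 billion active parameters, and 400 billion total parameters come from the announcement.

```python
import math
import random


def softmax(scores):
    """Turn raw router scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k=1):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(token, experts, router_scores, k=1):
    """Run only the selected experts and mix their outputs by router weight."""
    chosen = route_top_k(router_scores, k)
    output = sum(weight * experts[i](token) for i, weight in chosen)
    return output, [i for i, _ in chosen]


# Hypothetical toy setup: 128 "experts" that each just scale their input.
NUM_EXPERTS = 128
experts = [lambda x, scale=i + 1: x * scale for i in range(NUM_EXPERTS)]

# Stand-in for a learned router: random scores for one token.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

output, used = moe_layer(1.0, experts, scores, k=1)  # top-1 is an assumption

# Share of Maverick's parameters active per token, from the article's figures.
active_fraction = 17e9 / 400e9

print(f"experts run for this token: {len(used)} of {NUM_EXPERTS}")
print(f"active share of parameters per token: {active_fraction:.2%}")
```

The remaining 127 experts are never evaluated for this token, which is where the savings in compute come from, even though all of their weights still have to be stored.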
A key innovation in these models is their native multimodality, incorporating early fusion to integrate text and image tokens into a unified structure. This has enabled joint pre-training with large amounts of unlabeled text, image, and video data, improving visual comprehension and cross-modal reasoning capabilities.
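As a rough illustration of what early fusion means in practice, the Python sketch below maps text tokens and image patches into vectors of the same width and joins them into one sequence that a single shared model could then process. Everything here is hypothetical: the function names (`embed_text`, `embed_patches`, `early_fusion`), the tiny dimensions, and the ordering of image before text are assumptions for the example and do not describe Llama 4's actual design.

```python
D_MODEL = 4  # toy embedding width, chosen only for this example


def embed_text(token_ids, table):
    """Look up one embedding vector per text token."""
    return [table[t] for t in token_ids]


def embed_patches(patches, projection):
    """Project each flattened image patch into the same D_MODEL-wide space."""
    return [
        [sum(w * p for w, p in zip(row, patch)) for row in projection]
        for patch in patches
    ]


def early_fusion(text_vecs, image_vecs):
    """Join both modalities into one tagged sequence for a shared backbone."""
    return [("image", v) for v in image_vecs] + [("text", v) for v in text_vecs]


# Hypothetical embedding table for a three-token vocabulary.
table = {0: [0.0] * D_MODEL, 1: [1.0] * D_MODEL, 2: [2.0] * D_MODEL}

# Hypothetical projection: D_MODEL rows, one weight per pixel of a 2-pixel patch.
projection = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
patches = [[2.0, 4.0], [1.0, 3.0]]

text_vecs = embed_text([1, 2, 0], table)
image_vecs = embed_patches(patches, projection)
sequence = early_fusion(text_vecs, image_vecs)

for modality, vector in sequence:
    print(modality, vector)
```

The point of the sketch is that once both modalities live in the same vector space and the same sequence, the model can attend across text and image positions from its first layer onward.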
The new models are already available for download at llama.com and Hugging Face, allowing developers and companies to incorporate these advanced capabilities into their applications. Additionally, Meta has integrated Llama 4 into Meta AI, available on WhatsApp, Messenger, Instagram Direct, and the Meta.AI website.