Meta launches Llama 4 Scout and Llama 4 Maverick, its first multimodal AI models with a mixture-of-experts architecture, claiming superior performance to GPT-4o and Gemini on various benchmarks, and previews Llama 4 Behemoth, a roughly two-trillion-parameter model still in training.
Meta has announced the launch of Llama 4, a new generation of artificial intelligence models that marks the beginning of a new era for the Llama ecosystem. The first two available models, Llama 4 Scout and Llama 4 Maverick, are the first open-source multimodal models with a mixture-of-experts (MoE) architecture and offer unprecedented capabilities in text and image comprehension.
Llama 4 Scout, with 17 billion active parameters and 16 experts, is positioned as the best multimodal model in its class, outperforming Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a wide range of benchmarks. Its most notable feature is a context window of 10 million tokens, the largest in the industry, which lets it process and reason over lengthy documents, entire codebases, or multiple information sources at once.
Meanwhile, Llama 4 Maverick, also with 17 billion active parameters but with 128 experts, outperforms GPT-4o and Gemini 2.0 Flash in multiple benchmark evaluations and achieves results comparable to DeepSeek v3 on reasoning and coding with less than half the active parameters. On the LMArena platform, an experimental chat version of Maverick has achieved an Elo score of 1417.
Meta has also revealed information about Llama 4 Behemoth, a model with 288 billion active parameters with 16 experts and nearly two trillion total parameters, which has served as a "teacher" for the smaller models. According to the company, Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro in various science, technology, engineering, and mathematics benchmarks, though it is still in the training phase.
The MoE architecture used in these models allows for greater computational efficiency, as each token activates only a fraction of the total parameters. For example, Llama 4 Maverick has 400 billion total parameters, but only uses 17 billion during inference, significantly reducing service costs and latency.
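The idea of activating only a few experts per token can be illustrated with a toy sketch. This is not Meta's implementation: the experts, the random router, and the `moe_forward` helper are all hypothetical stand-ins (a real MoE uses a learned gating network instead of random scores), but the sketch shows why only a fraction of the total parameters participates in each forward pass.

```python
# Toy mixture-of-experts routing sketch (illustrative, not Meta's code).
import random

random.seed(0)

NUM_EXPERTS = 16   # Scout reportedly uses 16 experts
TOP_K = 2          # only a small subset is active per token

# Hypothetical experts: each just scales its input by a fixed weight.
experts = [lambda x, w=w: x * w for w in range(1, NUM_EXPERTS + 1)]

def route(token_embedding):
    """Pick the top-k experts for this token.

    A real router is a learned linear gate over the token embedding;
    here we draw random scores purely for illustration.
    """
    scores = sorted(((random.random(), i) for i in range(NUM_EXPERTS)),
                    reverse=True)
    return [i for _, i in scores[:TOP_K]]

def moe_forward(token_embedding):
    """Run the token through only its selected experts and average."""
    active = route(token_embedding)
    outputs = [experts[i](token_embedding) for i in active]
    return sum(outputs) / len(outputs), active

value, active_experts = moe_forward(2.0)
print(f"active experts: {active_experts} of {NUM_EXPERTS}")
```

Because only `TOP_K` of the `NUM_EXPERTS` expert networks run per token, compute per token scales with the active parameters (17B for Maverick) rather than the total (400B), which is where the cost and latency savings come from.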
A key innovation in these models is their native multimodality, incorporating early fusion to integrate text and image tokens into a unified structure. This has enabled joint pre-training with large amounts of unlabeled text, image, and video data, improving visual comprehension and cross-modal reasoning capabilities.
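Early fusion can be sketched in a few lines. The `fuse` function and the marker tokens below are hypothetical placeholders, not Meta's actual tokenization: the point is only that image-patch tokens and text tokens end up in one unified sequence, so a single transformer backbone attends across both modalities from the first layer onward.

```python
# Minimal early-fusion sketch (illustrative placeholders, not Meta's code).

def fuse(text_tokens, image_patch_tokens):
    """Merge image patches and text into one unified token sequence.

    A real model projects image patches into the same embedding space as
    text tokens; here we just tag the image span and concatenate.
    """
    return ["<image_start>"] + image_patch_tokens + ["<image_end>"] + text_tokens

text = ["Describe", "this", "picture"]
patches = [f"<patch_{i}>" for i in range(4)]
sequence = fuse(text, patches)
print(sequence)
```

This contrasts with "late fusion" designs, where separate vision and language towers are trained apart and only combined near the output; early fusion lets joint pre-training on mixed text, image, and video data shape the whole network.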
The new models are already available for download at llama.com and Hugging Face, allowing developers and companies to incorporate these advanced capabilities into their applications. Additionally, Meta has integrated Llama 4 into Meta AI, available on WhatsApp, Messenger, Instagram Direct, and the Meta.AI website.