IBM has introduced Granite 4.0, a family of language models designed for enterprise environments that combines Transformer and Mamba-2 architectures. The company claims the models reduce memory consumption by up to 70%, and they are the first open source models to obtain ISO 42001 certification.
IBM has announced the launch of Granite 4.0, a family of large language models built on a hybrid architecture designed to reduce computational resource consumption in enterprise environments. The new models interleave Mamba-2 layers with Transformer attention layers in a 9:1 ratio, a configuration that, according to IBM, allows long contexts to be processed with lower RAM usage. The Tiny and Small models also include mixture-of-experts (MoE) blocks with shared experts that improve parameter efficiency.
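The 9:1 interleaving can be pictured as a simple layer schedule. The sketch below is illustrative only (function name, block count and layout convention are assumptions, not IBM's implementation): it lays out a stack in which every tenth block is an attention block and the rest are Mamba-2 blocks.

```python
def build_block_schedule(num_blocks: int, mamba_per_attention: int = 9) -> list[str]:
    """Schematic layer layout: 9 Mamba-2 blocks for every 1 attention block.

    This is a hypothetical sketch of a 9:1 hybrid stack, not IBM's actual code.
    """
    return [
        "attention" if (i + 1) % (mamba_per_attention + 1) == 0 else "mamba2"
        for i in range(num_blocks)
    ]

# Example: a 40-block stack contains 36 Mamba-2 blocks and 4 attention blocks.
schedule = build_block_schedule(40)
print(schedule.count("mamba2"), schedule.count("attention"))  # 36 4
```

Keeping attention blocks sparse is what limits the growth of the key-value cache, since only those few blocks need to store per-token state.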
The company has introduced three initial variants: Micro, Tiny and Small. Each is available in Base and Instruct versions targeting different enterprise use cases and deployment scenarios. IBM plans to release additional models, both larger (Medium) and smaller (Nano), before the end of 2025.
One of the standout aspects of this generation is that the Granite family has obtained ISO 42001 certification, making it the first family of open source language models to achieve this accreditation. The ISO 42001 standard evaluates artificial intelligence management systems on aspects such as data privacy, explainability and accountability.
Granite 4.0 models were trained on a corpus of 22 trillion tokens drawn from curated enterprise sources. The hybrid architecture keeps memory requirements essentially constant regardless of context length, whereas in conventional Transformer models the key-value cache grows linearly with context length and the attention computation grows quadratically. This makes it practical to process long documents or extended conversations without a proportional increase in required resources.
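A back-of-envelope sketch shows why this matters. The dimensions below are illustrative placeholders, not Granite's actual configuration: a pure-attention model must cache keys and values for every past token in every layer, while a Mamba-2 recurrent state has a fixed size no matter how long the context gets.

```python
def kv_cache_mib(context_len: int, n_layers: int = 32, n_kv_heads: int = 8,
                 head_dim: int = 128, bytes_per: int = 2) -> float:
    """Attention KV cache: keys + values for every past token, every layer.

    Illustrative dimensions only; not Granite's real architecture.
    """
    return 2 * context_len * n_layers * n_kv_heads * head_dim * bytes_per / 2**20

def mamba_state_mib(n_layers: int = 32, d_state: int = 128,
                    d_inner: int = 4096, bytes_per: int = 2) -> float:
    """Mamba-2 recurrent state: fixed size, independent of context length."""
    return n_layers * d_state * d_inner * bytes_per / 2**20

for ctx in (4_096, 32_768, 131_072):
    print(ctx, round(kv_cache_mib(ctx)), round(mamba_state_mib()))
# 4096 512 32
# 32768 4096 32
# 131072 16384 32
```

With these toy numbers, growing the context 32x grows the KV cache from 512 MiB to 16 GiB, while the recurrent state stays at 32 MiB, which is the intuition behind IBM's claim of near-constant memory on long contexts.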
In terms of performance, Granite 4.0-H-Small achieves competitive results on benchmarks such as IFEval, which evaluates instruction-following capability, and the Berkeley Function Calling Leaderboard v3, which measures accuracy in function-call execution. IBM has worked with companies such as EY and Lockheed Martin to validate the models' performance in real use cases.
The company also offers unlimited indemnification for intellectual property claims related to content generated by Granite models when used in watsonx.ai.
The models are available on IBM watsonx.ai and through platforms such as Hugging Face, Ollama, NVIDIA NIM and Replicate. IBM has established collaborations with hardware manufacturers such as Qualcomm and AMD to optimize performance across different device types, from servers to mobile equipment.