Kuaishou Technology has introduced Kling AI 3.0, a set of four new video and image generation models that bring improved visual consistency, longer video duration, and native audio in multiple languages and accents.
The series comprises Video 3.0, Video 3.0 Omni, Image 3.0, and Image 3.0 Omni, with a focus on narrative control and visual coherence.
Video 3.0 incorporates native audio generation in English, Chinese, Japanese, Korean, and Spanish, along with various accents and dialects. This enables complex dialogue scenes between multiple characters, each speaking a different language. The maximum video duration extends to 15 seconds, enough for elaborate sequences with multiple narrative twists and cinematic transitions.
Among the standout improvements is visual element consistency. Creators can upload reference videos and multiple images to ensure characters, objects, and scenarios maintain coherence across frames. The model understands multi-scene and multi-shot instructions, dynamically adjusting camera angles according to creative direction.
The system also improves text preservation in images, maintaining signage, subtitles, and brand elements with high precision. This capability proves useful in e-commerce advertising, where logos on clothing remain sharp throughout the entire video.
Video 3.0 Omni expands reference capabilities by allowing the AI to extract visual traits and voice characteristics from a character to replicate them in new scenes. It incorporates a multi-shot storyboard feature where users specify duration, framing, perspective, and camera movements for each shot.
The image models, Image 3.0 and Image 3.0 Omni, support 2K and 4K output for professional use cases, preserving textures, lighting, and material qualities.
Since its launch in June 2024, Kling AI has attracted more than 60 million creators globally and has produced more than 600 million videos. The new models are available in early access to Ultra subscribers and will soon open to the general public.