Amazon has introduced an artificial intelligence model that lets developers build autonomous agents capable of interacting with web pages and carrying out complex tasks without constant supervision, with the aim of improving the automation of digital processes.
Amazon has introduced Nova Act, an artificial intelligence model designed specifically to execute actions within web browsers. The company has released a preview version of the Nova Act SDK through nova.amazon.com, so developers can experiment with the technology and build agents that complete complex tasks in digital environments.
Unlike traditional language models, which are limited to answering queries or generating text, Nova Act is focused on taking practical action. This approach shifts the concept of "AI agents" toward systems that can operate autonomously in a variety of digital environments.
The SDK lets developers break complex workflows down into reliable atomic commands, such as a search or a checkout step, and add more detailed instructions where necessary. It also supports integration with APIs and direct browser manipulation through Playwright, which further improves the reliability of the resulting agents.
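As a rough illustration of the decomposition idea described above, the sketch below splits one shopping workflow into small natural-language steps and runs them in order, stopping at the first failure. It does not use the Nova Act SDK: `StubAgent`, its `act` method, `run_workflow`, and the step texts are hypothetical stand-ins, chosen so the example runs without a browser or an API key.

```python
from dataclasses import dataclass, field


@dataclass
class StepResult:
    instruction: str
    ok: bool


@dataclass
class StubAgent:
    """Hypothetical stand-in for a browser agent; it only records instructions."""
    fail_on: set = field(default_factory=set)   # instructions that should fail
    log: list = field(default_factory=list)     # every instruction received

    def act(self, instruction: str) -> StepResult:
        # A real agent would drive a browser here; the stub just logs the call.
        self.log.append(instruction)
        return StepResult(instruction, ok=instruction not in self.fail_on)


def run_workflow(agent, steps):
    """Run atomic steps in order and stop at the first one that fails."""
    results = []
    for step in steps:
        result = agent.act(step)
        results.append(result)
        if not result.ok:
            break
    return results


# Hypothetical atomic steps for a single checkout workflow.
CHECKOUT_STEPS = [
    "search for a coffee maker",
    "select the first result",
    "add the item to the cart",
    "proceed to checkout",
]

if __name__ == "__main__":
    for r in run_workflow(StubAgent(), CHECKOUT_STEPS):
        print("ok  " if r.ok else "FAIL", r.instruction)
```

Keeping each step small makes a failure easy to localize: the returned list ends at the step that broke, so a retry or a more detailed instruction can target that step alone.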
According to Amazon's internal tests, Nova Act has achieved accuracy above 90% on capabilities that are usually problematic for other models, such as date selection or interaction with dropdown menus. On benchmarks such as ScreenSpot and GroundUI Web, which measure web interaction capability, Nova Act has shown competitive performance against models like Anthropic's Claude 3.7 Sonnet and OpenAI's CUA.
One of the most notable features of Nova Act is its focus on reliability, allowing agents to function without continuous supervision. Developers can activate "headless" mode, convert their agents into APIs, or schedule them to run asynchronously as needed.
Amazon is already using Nova Act in Alexa+ to navigate the internet autonomously and complete tasks when integrated services do not provide all the necessary APIs. The company has indicated that this launch is only the first step in its vision to develop key capabilities for creating useful agents at scale.