H Company launches Surfer 2, an agent designed to execute tasks in desktop, web and mobile environments. The system achieves the best recorded results in four benchmark tests that evaluate control and navigation capabilities across digital platforms.
H Company has announced Surfer 2, a computer-use agent that operates across multiple digital platforms through visual perception and touch interaction. The system achieves record results in four benchmark tests that evaluate the ability of artificial intelligence agents to control computers, navigate the web and manage mobile devices.
Surfer 2's architecture separates strategic planning from tactical execution through a configurable orchestrator module that breaks down complex tasks into subtasks assigned to specialized sub-agents. Each sub-agent reports results to the orchestrator, which determines the next step or replans the strategy in case of failure. The system can operate with or without this module depending on task complexity, and includes components dedicated to visual perception, task validation and failure recovery to ensure consistency across different environments.
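The planning and execution loop described above can be illustrated with a minimal sketch. It is not H Company's implementation: the names `Subtask`, `Result`, `run_task`, the `planner` callback, the `use_orchestrator` flag and the `max_replans` limit are all hypothetical, chosen only to show an orchestrator that delegates subtasks, collects results and replans after a failure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Subtask:
    agent: str        # hypothetical name of the sub-agent that handles this step
    instruction: str  # what the sub-agent is asked to do


@dataclass
class Result:
    ok: bool
    detail: str


History = List[Tuple[Subtask, Result]]
SubAgent = Callable[[str], Result]
Planner = Callable[[str, History], List[Subtask]]


def run_task(goal: str, planner: Planner, agents: Dict[str, SubAgent],
             use_orchestrator: bool = True,
             max_replans: int = 2) -> Tuple[bool, History]:
    """Run a goal either directly or through a plan/execute/replan loop."""
    history: History = []
    if not use_orchestrator:
        # Simple tasks: skip planning and hand the whole goal to one agent.
        step = Subtask("default", goal)
        result = agents["default"](goal)
        history.append((step, result))
        return result.ok, history

    replans = 0
    plan = planner(goal, history)
    while plan:
        step = plan.pop(0)
        result = agents[step.agent](step.instruction)
        history.append((step, result))  # sub-agent reports back
        if not result.ok:
            if replans >= max_replans:
                return False, history
            replans += 1
            plan = planner(goal, history)  # replan using what happened so far
    return True, history


def demo() -> Tuple[bool, History]:
    calls = {"n": 0}

    def flaky_web_agent(instruction: str) -> Result:
        calls["n"] += 1
        if calls["n"] == 1:  # simulate one transient failure
            return Result(False, "element not found")
        return Result(True, f"done: {instruction}")

    def planner(goal: str, history: History) -> List[Subtask]:
        # Toy planner: re-issue every step that has not yet succeeded.
        done = {s.instruction for s, r in history if r.ok}
        steps = ["open the store page", "add the item to the cart"]
        return [Subtask("web", s) for s in steps if s not in done]

    return run_task("buy an item", planner, {"web": flaky_web_agent})


if __name__ == "__main__":
    ok, history = demo()
    for step, result in history:
        print(step.agent, "|", step.instruction, "|", result.ok, "|", result.detail)
    print("success:", ok)
```

In this toy run the first subtask fails once, the planner is asked again, and the task then completes. Passing `use_orchestrator=False` mirrors the article's point that the orchestrator can be left out for simpler tasks.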
In OSWorld, a test measuring the ability to control an Ubuntu desktop environment, Surfer 2 achieves 60.1% success on the first attempt in the category that allows only visual perception and interaction. With ten attempts, the system reaches 77%, surpassing the human baseline of 72.4%. In WebArena, which evaluates agents in simulated web environments including e-commerce, social forums and content management platforms, it achieves 69.6% success.
In WebVoyager, a test of information retrieval on live websites, Surfer 2 achieves 97.1% accuracy, improving on the previous record of 93.9%. In AndroidWorld, which measures the ability to control Android devices and use 20 real applications, it achieves 87.1% success through vision and touch interaction, also surpassing the human baseline of 80%.
H Company states that Surfer 2's results come from combining external foundation models with its own agent training methods and infrastructure. The company acknowledges that running Surfer 2 is costly and says it is now working on Holo2, its next proprietary model, designed to deliver similar performance at lower cost. It also plans to publish a comprehensive technical report on Surfer 2's performance and evaluations soon.