ElevenLabs introduces Scribe v2, a transcription model that improves accuracy on long-form audio and adds automatic entity detection, multilingual support, and features aimed at enterprise workflows.
ElevenLabs has announced Scribe v2, its new transcription model for batch audio processing, subtitle generation, and transcription at scale. According to the company, the model is more stable and accurate than the previous version and better handles long recordings, pauses, changes in tone, and prolonged silences.
Scribe v2 is optimized for long and complex recordings, maintaining accuracy across different speakers, accents, and presentation styles. According to company data, the model achieves the lowest word error rate recorded on industry-standard benchmarks.
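For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name and the lowercase-and-split normalization are assumptions made for illustration, and the article does not say how ElevenLabs or the benchmarks it cites normalize text.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A lower value is better, and a perfect transcript scores 0.0.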
Among the highlighted features is keyterm prompting, which lets users supply up to 100 specific words or phrases. The model uses context to decide when to transcribe these terms, which is useful for technical vocabulary, brand names, and specialized language.
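A client preparing such a list might clean it and check the 100-term limit before sending a request. The following is a minimal sketch under that assumption. `prepare_keyterms` and `MAX_KEYTERMS` are hypothetical names, not part of any ElevenLabs SDK, and the article does not describe how the API itself treats duplicates or oversized lists.

```python
MAX_KEYTERMS = 100  # limit reported in the article


def prepare_keyterms(terms):
    """Trim whitespace, drop empty and duplicate terms, enforce the limit.

    Hypothetical client-side helper; duplicates are compared
    case-insensitively and the first spelling seen is kept.
    """
    seen = set()
    cleaned = []
    for term in terms:
        stripped = term.strip()
        key = stripped.casefold()
        if stripped and key not in seen:
            seen.add(key)
            cleaned.append(stripped)
    if len(cleaned) > MAX_KEYTERMS:
        raise ValueError(
            f"{len(cleaned)} keyterms supplied; the limit is {MAX_KEYTERMS}"
        )
    return cleaned


if __name__ == "__main__":
    print(prepare_keyterms([" ElevenLabs ", "elevenlabs", "", "Scribe v2"]))
```

Failing early on the client keeps an oversized list from reaching the service at all. Silently truncating the list would be an alternative, at the cost of dropping terms without the user noticing.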
The model includes native entity detection for structured audio analysis. Users can choose from up to 56 categories, including personally identifiable information, health data, and payment information. Scribe v2 automatically detects these instances and records their exact timestamps.
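One way the timestamps could be used downstream is to compute which stretches of audio to mute or bleep. The sketch below assumes each detection arrives as a category, the matched text, and start and end times in seconds. The `DetectedEntity` type, its field names, and the category labels are all hypothetical, since the article does not describe the actual response format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedEntity:
    """Hypothetical shape of one detection; not the real API schema."""
    category: str
    text: str
    start: float  # seconds from the beginning of the audio
    end: float


def redaction_intervals(entities, categories):
    """Return merged (start, end) spans covering the chosen categories."""
    spans = sorted(
        (e.start, e.end) for e in entities if e.category in categories
    )
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous span: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    found = [
        DetectedEntity("pii", "Jane Doe", 1.0, 2.0),
        DetectedEntity("payment", "4111 1111", 1.5, 3.0),
        DetectedEntity("health", "asthma", 5.0, 6.0),
    ]
    print(redaction_intervals(found, {"pii", "payment"}))
```

Merging overlapping spans means each stretch of audio is redacted once, even when two detections cover the same words.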
The system handles multilingual workflows automatically, processing files that contain multiple languages and detecting each one without manual segmentation. The model supports more than 90 languages.
The release also includes features aimed at enterprise use cases: intelligent speaker identification, word-level timestamps, dynamic audio tagging that detects non-verbal events, and compliance with SOC 2, ISO 27001, PCI DSS L1, HIPAA, and GDPR. It also offers data residency in the European Union and India.
Scribe v2 is available in ElevenLabs Studio and through the platform's API, allowing developers and enterprises to automate complex audio processes.