ElevenLabs updates its transcription model with Scribe v2

09/01/2026

ElevenLabs introduces Scribe v2, a transcription model that improves accuracy on long-form audio and offers automatic entity detection, multilingual support, and features designed for enterprise workflows.

ElevenLabs has announced the launch of Scribe v2, its new transcription model designed to process audio in batches, generate subtitles, and create transcriptions at scale. The model is more stable and more accurate than the previous version, with better handling of long recordings, pauses, changes in tone, and prolonged silences.

Scribe v2 is optimized for long and complex recordings, maintaining accuracy across different speakers, accents, and presentation styles. According to the company, the model achieves the lowest word error rate reported on industry-standard benchmarks.
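
For readers unfamiliar with the metric, word error rate (WER) is conventionally the number of word substitutions, deletions, and insertions needed to turn the model's transcript into a reference transcript, divided by the number of words in the reference. The sketch below is a generic illustration of that definition, not ElevenLabs' evaluation code; the function name and the simple lowercase, whitespace-based tokenization are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumption: lowercase and split on whitespace; real benchmarks
    # usually apply a more careful text normalization first.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

A lower value is better: one wrong word in a four-word reference gives a WER of 0.25.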

One of the highlighted features is keyterm prompting, which lets users supply up to 100 specific words or phrases. The model uses context to decide when to transcribe these terms, which is useful for technical vocabulary, brand names, and specialized language.
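
As a rough illustration of working within that limit, the sketch below cleans a keyterm list before it is sent anywhere. It is not the ElevenLabs SDK: the function name `prepare_keyterms` and the choice to de-duplicate case-insensitively are assumptions, and only the limit of 100 comes from the announcement.

```python
MAX_KEYTERMS = 100  # limit stated in the announcement


def prepare_keyterms(terms):
    """Normalize whitespace, drop blanks and duplicates, enforce the limit.

    Hypothetical helper for illustration; how the list is actually passed
    to the API is not covered here.
    """
    seen = set()
    cleaned = []
    for term in terms:
        normalized = " ".join(term.split())  # collapse runs of whitespace
        if not normalized:
            continue
        key = normalized.casefold()  # treat "Scribe" and "scribe" as one term
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(normalized)
    if len(cleaned) > MAX_KEYTERMS:
        raise ValueError(
            f"{len(cleaned)} keyterms supplied, but at most {MAX_KEYTERMS} are allowed"
        )
    return cleaned
```

Checking the list locally surfaces an oversized or messy glossary before a long transcription job is submitted.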

The model includes native entity detection for structured audio analysis. Users can select up to 56 categories, such as personally identifiable information, health data, and payment information. Scribe v2 automatically detects each occurrence and records its exact timestamps.
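
One way timestamped detections might be used downstream is to build a list of time ranges to mute or redact. The sketch below assumes a simple record shape (`category`, `start`, `end` in seconds) and made-up category names; the actual response format and category identifiers are not described in the announcement.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    # Hypothetical record shape, assumed for illustration only.
    category: str
    start: float  # seconds
    end: float    # seconds


def redaction_spans(entities, categories):
    """Return merged (start, end) ranges covering the chosen categories."""
    spans = sorted((e.start, e.end) for e in entities if e.category in categories)
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


detections = [
    Entity("pii", 1.0, 2.0),
    Entity("payment", 1.5, 3.0),
    Entity("health", 5.0, 6.0),
    Entity("pii", 10.0, 11.0),
]
spans = redaction_spans(detections, {"pii", "payment"})
```

Merging overlapping ranges first means the audio is cut once per sensitive passage rather than once per detected entity.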

The system supports multilingual workflows automatically, processing files containing multiple languages and detecting each one without manual segmentation. The model offers support for more than 90 languages.

The release also includes features aimed at enterprise use cases: intelligent speaker identification, word-level timestamps, dynamic audio tagging that detects non-verbal events, and compliance with SOC 2, ISO 27001, PCI DSS L1, HIPAA, and GDPR standards. It also offers data residency in the European Union and India.
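
Word-level timestamps are what make subtitle generation straightforward. The sketch below groups words into SubRip (SRT) cues; it assumes each word arrives as a `(text, start, end)` tuple with times in seconds, which is an illustrative shape rather than the documented Scribe v2 output, and the fixed words-per-cue rule is a deliberate simplification.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Build SRT text from (text, start, end) tuples.

    Assumption: a fixed number of words per cue; production subtitles
    would also split on pauses, punctuation, and line length.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = chunk[0][1]   # start time of the first word in the cue
        end = chunk[-1][2]    # end time of the last word in the cue
        text = " ".join(word[0] for word in chunk)
        cues.append(
            f"{len(cues) + 1}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    return "\n".join(cues)


sample = [("Hello", 0.0, 0.4), ("world", 0.5, 0.9), ("again", 1.0, 1.5)]
srt_text = words_to_srt(sample, max_words=2)
```

Each cue takes its start time from its first word and its end time from its last, so the subtitles stay aligned with the audio without further timing work.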

Scribe v2 is available in ElevenLabs Studio and through the platform's API, allowing developers and enterprises to automate complex audio processes.

Key points

  • Scribe v2 achieves the lowest word error rate on industry-standard benchmarks, according to the company
  • Keyterm prompting accepts up to 100 specific words or phrases for contextual transcription
  • Automatic entity detection covers up to 56 categories, including personal, health, and payment data
  • Audio containing multiple languages is transcribed automatically, without manual segmentation
  • More than 90 languages are supported
  • Includes intelligent speaker identification and word-level timestamps
  • Complies with SOC 2, ISO 27001, PCI DSS L1, HIPAA, and GDPR standards
  • Available in ElevenLabs Studio and through the API for enterprise automation
