Introducing Scribe v2 — news

ElevenLabs Launches Scribe v2, Claims Most Accurate Transcription Model Across 90+ Languages

NEW YORK — ElevenLabs on Thursday introduced Scribe v2, positioning it as the most accurate transcription model the company has ever released and expanding its speech-to-text capabilities with support for more than 90 languages.

The launch, announced via the company’s official blog, targets high-volume use cases including batch transcription, subtitles and captioning at scale. ElevenLabs also unveiled a separate real-time variant, Scribe v2 Realtime, which promises live transcription delivered in under 150 milliseconds. The dual release underscores ElevenLabs’ push into both offline and live speech-to-text markets, areas where established offerings such as OpenAI’s Whisper, Google Cloud Speech-to-Text and AssemblyAI are already widely used.

According to ElevenLabs, Scribe v2 represents a significant leap in transcription accuracy compared to its predecessor. The model was built specifically for enterprise-grade workloads where precision and broad language coverage are critical. While the company has not yet published independent third-party benchmark numbers, its official announcement repeatedly describes Scribe v2 as “the most accurate transcription model ever released.”

Technical Capabilities and Use Cases

Scribe v2 supports more than 90 languages, a notable expansion that should appeal to global media companies, content platforms and localization teams. The model is optimized for batch processing, making it suitable for transcribing large archives of audio and video content or generating subtitles for on-demand streaming services.

The companion Scribe v2 Realtime variant focuses on low-latency scenarios. ElevenLabs claims it is “the most accurate low-latency Speech to Text model,” capable of returning results in under 150 milliseconds. That latency matters most for voice agents, AI meeting assistants, live captioning and real-time note-taking applications.

The company has positioned both models as production-ready tools rather than research prototypes. Early community reactions on Reddit’s r/ElevenLabs forum suggest developers are already exploring integration for podcast platforms, customer support transcription and multilingual content workflows.

Competitive Context

ElevenLabs has primarily been known for its high-quality text-to-speech and voice cloning technology. The aggressive expansion into speech-to-text with Scribe v2 signals the company’s intention to become a full-stack audio AI provider. This move places it in direct competition with specialized transcription companies as well as hyperscalers offering similar services through cloud APIs.

The emphasis on both accuracy and language coverage addresses two persistent pain points in the speech recognition industry: error rates in non-English languages and inconsistent performance across diverse accents and dialects. By claiming superiority in accuracy while supporting 90+ languages, ElevenLabs is challenging competitors to match both metrics simultaneously.

Impact on Developers and Enterprises

For developers, the launch offers new options for building multilingual applications without stitching together multiple transcription providers. The combination of high-accuracy batch processing and real-time transcription in under 150 milliseconds could simplify the architecture of products ranging from automated subtitling tools to conversational AI agents.

Enterprise users in media, education and legal sectors stand to benefit from improved transcription quality, potentially reducing the need for costly human post-editing. However, as with any new model release, real-world performance will need to be validated across varied audio conditions, background noise levels and technical terminology domains.
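
One common way to run that kind of validation is to compare a model's output against a human-checked reference transcript and compute the word error rate (WER). The sketch below is illustrative only: it is not ElevenLabs' evaluation method, it calls no ElevenLabs API, and the function name and sample sentences are assumptions made for this example. It uses simple lowercasing and whitespace splitting, so punctuation is not normalized.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance divided by reference length.

    Illustrative helper, not part of any vendor SDK. Words are compared
    after lowercasing and splitting on whitespace.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # word missing from hypothesis
                curr[j - 1] + 1,                  # extra word in hypothesis
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sample sentences, not real model output.
    reference = "the quick brown fox"
    hypothesis = "the quick fox"
    print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

Averaging this score over recordings grouped by language, noise level or subject matter gives a rough picture of where a transcription model holds up and where human post-editing is still needed.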

What’s Next

ElevenLabs has not yet disclosed specific pricing details, API rate limits or exact model size for Scribe v2. The company is expected to publish additional technical documentation, benchmark comparisons and integration guides in the coming weeks.

Developers can currently access the new models through ElevenLabs’ existing platform. The company has indicated that Scribe v2 and Scribe v2 Realtime are immediately available for use in batch and live transcription workflows.

As the audio AI sector continues to heat up, the dual release of Scribe v2 and Scribe v2 Realtime positions ElevenLabs as a stronger contender in the speech-to-text arena. The coming months will reveal how the models perform against established benchmarks and whether the claimed accuracy gains hold up under independent scrutiny across the more than 90 languages they support.
