iToverDose/Startups · 20 MAY 2026 · 12:00

Corti’s clinical-grade speech model outperforms OpenAI in medical accuracy

Corti’s new Symphony for Speech-to-Text model slashes word error rates by 93% in medical contexts, proving specialized AI can outperform general-purpose alternatives in high-stakes healthcare settings.

VentureBeat · 2 min read

Healthcare AI developer Corti has unveiled Symphony for Speech-to-Text, a clinical-grade speech recognition model designed to address the unique challenges of medical transcription. The model achieves a 1.4% word error rate (WER) on English medical terminology—dramatically lower than general-purpose alternatives like OpenAI’s Whisper (17.4%) and ElevenLabs (18.1%), according to data in a newly published research paper.
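For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below computes it with a word-level edit distance. The function name and the sample sentences are illustrative assumptions, not material from Corti's paper, and a real evaluation would normally apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences, invented for illustration only.
reference = "patient has hyperthyroidism and takes 50 mg daily"
hypothesis = "patient has hypothyroidism and takes 15 mg daily"
print(word_error_rate(reference, hypothesis))  # 0.25 (2 wrong words out of 8)
```

Two wrong words in an eight-word sentence already give a 25% WER, and here both errors change the clinical meaning.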

Andreas Cleve, co-founder and CEO of Corti, emphasized the model’s importance in an exclusive interview. “Our focus is on building AI scribes that physicians, medical professionals, and patients can trust across the entire healthcare system,” he said. The launch underscores a growing trend: in highly regulated industries like healthcare, domain-specific AI models often deliver superior performance compared to generalist foundation models.

Breaking through the limits of general-purpose speech models

The healthcare industry’s reliance on accurate transcription has evolved beyond simple dictation. As AI systems transition into the “agentic era,” in which autonomous agents assist in real-time clinical decision-making, the quality of input data directly impacts patient safety. General-purpose speech models frequently misinterpret medical terminology and dosage instructions and struggle with background noise in settings such as emergency rooms, creating downstream risks for AI-driven workflows.

Corti’s Symphony for Speech-to-Text addresses these gaps by providing structured, clinically usable output. Its architecture reduces the compounding errors that occur when general models hallucinate transcriptions, such as mistaking “hyperthyroidism” for “hypothyroidism” or misreading critical medication details. The model’s entity recall rate for clinical terms (dosages, measurements, dates) reaches 98.3%, compared to just 44.3% for leading general-purpose baselines. That gap of 54 percentage points could determine whether an AI tool enhances efficiency or introduces liability.
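Entity recall can be read as the share of reference clinical entities that survive transcription intact. The following is a minimal sketch under that assumption. How Corti's paper actually extracts and matches entities is not described here, so the token-based matching, the function names, and the sample strings are all hypothetical. The last lines show why the difference between the two reported rates is a gap in percentage points rather than a relative percentage.

```python
def tokenize(text: str) -> list[str]:
    # Simplistic normalization (assumption): lowercase and split on whitespace.
    # Punctuation attached to a word would prevent a match.
    return text.lower().split()


def contains_phrase(tokens: list[str], phrase: list[str]) -> bool:
    """True if `phrase` occurs as a contiguous run of whole tokens."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def entity_recall(reference_entities: list[str], transcript: str) -> float:
    """Fraction of reference entities found verbatim in the transcript."""
    if not reference_entities:
        raise ValueError("need at least one reference entity")
    tokens = tokenize(transcript)
    found = sum(contains_phrase(tokens, tokenize(e)) for e in reference_entities)
    return found / len(reference_entities)


# Hypothetical entities and transcripts, invented for illustration only.
entities = ["50 mg", "twice daily", "12 March"]
accurate = "take 50 mg twice daily starting 12 March"
misheard = "take 150 mg twice daily starting 12 March"

print(entity_recall(entities, accurate))  # 1.0
print(entity_recall(entities, misheard))  # 2 of 3 entities found

# Recall figures quoted in the article, in percent.
symphony, baseline = 98.3, 44.3
print(round(symphony - baseline, 1))  # 54.0 percentage points
print(round(symphony / baseline, 2))  # 2.22 times the baseline recall
```

Matching on whole tokens rather than substrings keeps "150 mg" from being counted as a correct "50 mg", which is the kind of dosage error the metric is meant to catch.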

Outperforming legacy systems and reshaping clinical workflows

While modern general-purpose models struggle with medical contexts, legacy transcription systems like Dragon Medical One have long dominated clinician dictation. However, these systems were optimized for deliberate dictation rather than the ambient, real-time, or multi-party conversations required in today’s AI-driven healthcare environments.

In evaluations of real-world English medical dictation, Corti’s model achieved a 4.6% WER, surpassing Dragon’s 5.7% (a 19% relative improvement). It also demonstrated slightly higher medical term recall (93.5% versus 92.9%). By offering an API endpoint, Corti enables third-party developers, EHR vendors, and virtual care platforms to integrate high-accuracy speech recognition into their tools—without being constrained by outdated infrastructure.
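The relative improvement quoted above is a relative error reduction: the drop in WER divided by the baseline WER. A short sketch of that arithmetic, using only figures reported in this article (the function name is an assumption made for illustration):

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which the error rate falls relative to the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Dictation benchmark: Dragon Medical One at 5.7% WER, Corti at 4.6% WER.
dragon = relative_reduction(5.7, 4.6)
print(f"{dragon:.1%}")  # 19.3%

# Medical terminology benchmark: Whisper at 17.4% WER, Corti at 1.4% WER.
whisper = relative_reduction(17.4, 1.4)
print(f"{whisper:.1%}")  # 92.0%
```

The absolute difference on the dictation benchmark is only 1.1 percentage points, so the relative figure and the absolute figure are best read together.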

“Our goal is to empower developers to build atop our models,” Cleve stated. “By diffusing this technology widely, we can make it as helpful as possible to patients, doctors, and healthcare professionals.”

Expanding beyond English to global healthcare needs

The demand for precise clinical speech recognition extends far beyond English-speaking hospitals. Healthcare systems worldwide, particularly those requiring multilingual support, have historically lacked access to clinical NLP models tailored to their linguistic and operational demands.

Early adopters are already testing Corti’s models in complex international markets, including Switzerland, where care delivery often spans multiple languages within a single interaction. This expansion highlights how specialized AI can bridge gaps left by general-purpose solutions, ensuring accurate transcription and documentation in diverse clinical settings.

As healthcare AI continues to evolve, models like Symphony for Speech-to-Text represent a critical step toward safer, more reliable automation. For medical professionals and patients alike, the stakes couldn’t be higher—and the difference between a 1.4% WER and a 17.4% WER could define the future of clinical AI.

AI summary

Corti's new Symphony speech recognition model takes the lead in medical terminology accuracy and is revolutionizing the healthcare sector.
