Hume AI launches TADA, a fast open-source voice system that eliminates hallucinations

10/03/2026

Hume AI has released TADA, an open-source text-to-speech system that synchronizes text and audio to eliminate content errors and runs five times faster than comparable systems.

Hume AI has released TADA (Text-Acoustic Dual Alignment), a voice generation system that addresses one of the most common problems in current large language model-based systems: the mismatch between how text and audio are represented.

Conventional text-to-speech systems generate between 12.5 and 75 acoustic frames per second of audio, compared with just 2 to 3 text tokens per second. This gap forces models to handle very long sequences, which slows down processing and increases the risk of the system skipping words or inserting non-existent content — a flaw known as hallucination.
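The scale of that imbalance is easy to see with the article's own figures. A minimal back-of-the-envelope sketch (the utterance length is a made-up example, not from the article):

```python
# Illustrative arithmetic using the rates quoted in the article;
# this is not Hume AI's implementation, just the sequence-length math.
seconds = 30  # a hypothetical 30-second utterance

# Conventional LLM-based TTS: acoustic frames dominate the sequence.
frames_low = 12.5 * seconds   # low end of the quoted range
frames_high = 75 * seconds    # high end of the quoted range
text_tokens = 3 * seconds     # ~3 text tokens per second

print(f"audio frames: {frames_low:.0f}-{frames_high:.0f}")  # 375-2250
print(f"text tokens:  {text_tokens}")                        # 90
print(f"imbalance:    {frames_high / text_tokens:.0f}x")     # 25x
```

At the high end, the model must attend over 25 times more acoustic positions than text positions, which is what inflates both latency and the opportunity for misalignment.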

TADA resolves this imbalance with a tokenization scheme that assigns exactly one continuous acoustic vector per text token. As a result, text and audio are processed in parallel and at the same rate, without compressing the audio or adding extra intermediate layers.
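The 1:1 pairing can be pictured as zipping the two streams together. The sketch below is purely illustrative — the function name, vector contents, and dimension are placeholders, not TADA's actual representation or API:

```python
# Illustrative sketch of a 1:1 text/audio alignment; the acoustic
# vectors here are random placeholders, not TADA's real encodings.
import random

def make_acoustic_vector(dim=8):
    # Stand-in for one continuous acoustic vector per token.
    return [random.random() for _ in range(dim)]

text_tokens = ["Hello", ",", " world", "!"]

# Exactly one acoustic vector per text token, so both streams
# advance at the same rate and can be processed in parallel.
aligned = [(tok, make_acoustic_vector()) for tok in text_tokens]

assert len(aligned) == len(text_tokens)
for tok, vec in aligned:
    print(tok, len(vec))
```

Because the two sequences have identical length by construction, there is no many-to-one mapping for the model to get wrong — the property the article credits with eliminating skipped or invented words.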

In terms of speed, the system achieves a real-time factor of 0.09 — more than five times faster than comparable LLM-based text-to-speech systems. In tests with over 1,000 samples from the LibriTTS-R dataset, the model produced zero hallucinations. In human evaluations on expressive, long-form speech, it scored 4.18 out of 5 for speaker similarity and 3.78 out of 5 for naturalness, ranking second overall.
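Real-time factor is simply synthesis time divided by the duration of the audio produced, so the reported 0.09 translates directly into wall-clock savings. A quick sanity check (the 60-second clip is a made-up example):

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    # RTF < 1 means the system generates audio faster than playback.
    return synthesis_seconds / audio_seconds

# At the reported RTF of 0.09, a 60-second clip takes ~5.4 s to synthesize.
rtf = 0.09
audio_seconds = 60
print(f"synthesis time: {rtf * audio_seconds:.1f} s")  # 5.4 s

# A system five times slower would sit around RTF 0.45.
print(f"5x slower RTF: {rtf * 5:.2f}")
```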

The model's compact size allows it to run on mobile devices without relying on cloud services. In terms of context management, it can handle up to 700 seconds of audio within a 2,048-token context window, compared to around 70 seconds for conventional systems under the same conditions.
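The context figures imply the per-second token cost directly: dividing the 2,048-token window by each reported duration (my arithmetic, not a figure from the announcement):

```python
# Deriving tokens consumed per second of audio from the article's numbers.
context_tokens = 2048

# TADA: ~700 s fits in the window -> roughly 3 tokens per second.
tada_seconds = 700
print(f"TADA: {context_tokens / tada_seconds:.1f} tokens/s")  # 2.9

# Conventional systems: ~70 s -> roughly ten times the token cost.
conventional_seconds = 70
print(f"conventional: {context_tokens / conventional_seconds:.1f} tokens/s")  # 29.3
```

The ~3 tokens per second figure matches the text-token rate cited earlier, consistent with the claim that TADA spends one position per text token rather than many per acoustic frame.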

Hume AI is releasing two versions: a one-billion-parameter model for English and a three-billion-parameter multilingual model supporting eight languages. Both are available on Hugging Face under an open-source license. The researchers themselves acknowledge limitations still to be resolved, including potential speaker drift during very long generations and reduced text quality when generating text and speech simultaneously.

Key points

  • TADA is a new open-source text-to-speech system developed by Hume AI.
  • It synchronizes text and audio in a 1:1 ratio, eliminating the mismatch in current systems.
  • It is more than five times faster than comparable LLM-based TTS systems.
  • In tests with over 1,000 samples, it produced zero hallucinations.
  • It is lightweight enough to run on mobile devices without a cloud connection.
  • It can handle up to 700 seconds of audio versus 70 seconds in conventional systems.
  • Available in two versions: 1B parameters in English and 3B multilingual across eight languages.
  • Still has limitations in very long generations and when combining text and speech simultaneously.
