Hume AI Releases TADA: Open-Source Mobile TTS with 5x Speed and No Hallucinations

Hume AI has open-sourced its latest speech generation model, TADA (Text-Acoustic Dual Alignment). This text-to-speech (TTS) system is built on a large language model foundation and uses a dual-alignment architecture that ties text to acoustics. The approach improves generation efficiency and reliability and broadens the range of practical applications.
According to Hume AI, TADA enforces a strict 1:1 synchronization between text tokens and acoustic representations. The company says this design eliminates the token-level content hallucination, such as skipped or inserted words, that is common in traditional LLM-based TTS systems. In evaluations on more than 1,000 test samples, the model produced zero instances of content hallucination.
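The 1:1 constraint can be pictured with a toy data structure. The sketch below is not Hume AI's implementation: `AlignedStep`, `align_one_to_one`, and `spoken_text` are hypothetical names, and the acoustic vectors are placeholder numbers. It only illustrates why a strict pairing leaves no room for skipped or inserted words, since every output step carries exactly one text token.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class AlignedStep:
    """One generation step: a text token paired with one acoustic vector."""
    token: str
    acoustic: Tuple[float, ...]


def align_one_to_one(tokens: Sequence[str],
                     acoustics: Sequence[Sequence[float]]) -> List[AlignedStep]:
    """Pair each token with exactly one acoustic vector (hypothetical helper)."""
    if len(tokens) != len(acoustics):
        # A length mismatch would mean a token without audio, or audio
        # without a token, which the 1:1 scheme rules out.
        raise ValueError("1:1 alignment needs one acoustic vector per token")
    return [AlignedStep(tok, tuple(vec)) for tok, vec in zip(tokens, acoustics)]


def spoken_text(steps: Sequence[AlignedStep]) -> str:
    """Recover the text that the aligned steps represent."""
    return " ".join(step.token for step in steps)


steps = align_one_to_one(["hello", "world"], [[0.1, 0.2], [0.3, 0.4]])
print(spoken_text(steps))
```

Because the text is recoverable step by step from the aligned sequence, the spoken content cannot drift away from the input tokens in this toy model.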
On performance, TADA generates audio more than five times faster than comparable LLM-based TTS systems. It is also far more compact in its representation, needing only 2-3 frames per second of audio, whereas conventional solutions typically need between 12.5 and 75 frames. This efficiency lets the model run local inference on low-power hardware such as smartphones and edge devices, without relying on cloud servers.
TADA offers multilingual support, including Chinese; the multilingual version is based on the Llama 3.2 3B model. The release includes a 1B pre-trained model (primarily for English) and a 3B multilingual pre-trained model. With a context window of 2048 tokens, the model can generate approximately 700 seconds of continuous audio in a single pass. This far exceeds traditional solutions, which are typically limited to about 70 seconds under the same token constraint.
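The relationship between frame rate and audio length can be checked with simple arithmetic. The sketch below assumes, for illustration only, that each frame occupies one position in the context window and that nothing else consumes the budget; `max_audio_seconds` is a hypothetical helper, not part of any TADA API.

```python
def max_audio_seconds(context_tokens: int, frames_per_second: float) -> float:
    """Upper bound on audio length if every context slot holds one frame."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return context_tokens / frames_per_second


CONTEXT_TOKENS = 2048  # context window quoted in the article

# Frame rates quoted in the article: 2-3 for TADA, 12.5-75 for other systems.
for label, fps in [
    ("TADA, 2 frames/s", 2.0),
    ("TADA, 3 frames/s", 3.0),
    ("conventional, 12.5 frames/s", 12.5),
    ("conventional, 75 frames/s", 75.0),
]:
    print(f"{label:>28}: {max_audio_seconds(CONTEXT_TOKENS, fps):7.1f} s")
```

Under this assumption, 3 frames per second gives roughly 683 seconds, close to the reported figure of about 700 seconds, while 12.5 to 75 frames per second gives roughly 164 down to 27 seconds, a range that brackets the quoted 70 seconds.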
A key innovation is synchronous transcription. While generating speech, the model simultaneously outputs the corresponding text. This removes the need for a separate automatic speech recognition (ASR) step, so the text arrives with no added latency. The feature is particularly useful for real-time captioning, voice interaction systems, and content creation tools.
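For a caption pipeline, synchronous transcription means text and audio can be consumed from the same stream. The following is a minimal sketch of that consumption pattern, not the TADA interface: `speak_with_captions` is a hypothetical generator, and `fake_synthesize` is a stand-in that returns silent samples instead of calling a real model.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def speak_with_captions(
    tokens: Iterable[str],
    synthesize: Callable[[str], List[float]],
) -> Iterator[Tuple[str, List[float]]]:
    """Yield each token together with its audio chunk, in generation order."""
    for token in tokens:
        # Text and audio leave the generator in the same step, so a caption
        # display never has to wait for a separate ASR pass.
        yield token, synthesize(token)


def fake_synthesize(token: str) -> List[float]:
    """Placeholder synthesizer: one silent sample per character."""
    return [0.0] * len(token)


captions: List[str] = []
audio: List[float] = []
for text, chunk in speak_with_captions("real time captions".split(), fake_synthesize):
    captions.append(text)  # would be sent to the caption display
    audio.extend(chunk)    # would be sent to the audio output

print(" ".join(captions), len(audio))
```

A real integration would replace `fake_synthesize` with the model's own generation step; the point of the sketch is only that one loop serves both outputs.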
In subjective human evaluations, TADA ranked second for both naturalness and voice similarity. It outperformed several systems with larger parameter counts and more extensive training data, indicating highly competitive audio quality.
Link: https://huggingface.co/collections/HumeAI/tada












