Xiaomi's OmniVoice Open-Source TTS Model Enables Zero-Shot Cloning Across 600+ Languages

May 8, 2026

The next-generation Kaldi team (k2-fsa) at Xiaomi has open-sourced OmniVoice, a massively multilingual zero-shot text-to-speech model that supports more than 600 languages. It achieves state-of-the-art results on multiple key benchmarks for Chinese, English, and multilingual synthesis.

Leading Performance: Chinese WER as Low as 0.84%, Outperforming Mainstream Models in Multilingual Tests

On the Seed-TTS Chinese test set, OmniVoice achieves a word error rate (WER) of just 0.84%. In multilingual evaluations, its speaker similarity (SIM-o) and WER scores surpass those of well-known commercial models such as ElevenLabs v2 and MiniMax, indicating strong voice similarity and intelligibility.
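For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate is typically computed: the word-level edit distance between a reference transcript and the recognized transcript, divided by the number of reference words. The function name is illustrative, and the sketch assumes whitespace-separated tokens; it is not the evaluation script behind the figures above, and Chinese text is usually scored over characters rather than whitespace-delimited words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(f"{word_error_rate('a b c d', 'a x c'):.2%}")
```

A WER of 0.84% therefore corresponds to fewer than one error per hundred reference tokens.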


Ultra-Fast Inference: RTF as Low as 0.025, 40x Faster Than Real-Time

OmniVoice reaches a real-time factor (RTF) as low as 0.025. RTF is the ratio of processing time to the duration of the generated audio, so one second of speech takes about 0.025 seconds to synthesize, roughly 40 times faster than real time. This efficiency makes rapid generation of long-form speech practical in real applications.
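The arithmetic behind the "40x" figure can be checked with a short sketch. The function names and the ten-minute example are illustrative assumptions, not measurements from the project.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup_over_real_time(rtf: float) -> float:
    """How many times faster than playback speed the synthesis runs."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


if __name__ == "__main__":
    rtf = 0.025  # best-case figure quoted in the article
    print(f"speedup: {speedup_over_real_time(rtf):.1f}x")
    # Hypothetical example: time to synthesize ten minutes of speech.
    print(f"10 min of audio: {rtf * 600:.1f} s of compute")
```

Actual throughput depends on hardware and batch size, which the article does not specify.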

Core Architectural Innovation: Discrete Non-Autoregressive Design Inspired by Diffusion Models

OmniVoice employs a discrete non-autoregressive architecture inspired by diffusion language models. It generates speech from text in a single stage, without the intermediate semantic tokens that traditional pipelines rely on. This streamlined design simplifies the pipeline while maintaining high output quality. A full-codebook random masking strategy, combined with initialization from a pre-trained LLM, further boosts training efficiency and improves the clarity and intelligibility of the generated speech.
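The article does not describe the masking strategy in detail. As a rough illustration only, the sketch below shows one generic way to apply random masking across every codebook of a multi-codebook token grid, in the style of masked-token training. The mask id, the uniform sampling of the mask ratio, and the grid shape are all assumptions and may differ from what OmniVoice actually does.

```python
import random

MASK_ID = -1  # hypothetical id reserved for masked positions


def mask_all_codebooks(tokens, mask_ratio, rng):
    """Independently mask each position in every codebook.

    tokens: list of codebooks, each a list of integer token ids.
    Returns (masked_tokens, is_masked) with the same shape as tokens.
    """
    masked_tokens = []
    is_masked = []
    for codebook in tokens:
        masked_row = []
        flag_row = []
        for token in codebook:
            hide = rng.random() < mask_ratio
            masked_row.append(MASK_ID if hide else token)
            flag_row.append(hide)
        masked_tokens.append(masked_row)
        is_masked.append(flag_row)
    return masked_tokens, is_masked


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy grid: 4 codebooks x 8 frames, vocabulary size 1024 (all assumed).
    grid = [[rng.randrange(1024) for _ in range(8)] for _ in range(4)]
    ratio = rng.uniform(0.0, 1.0)  # a fresh ratio per training example
    masked, flags = mask_all_codebooks(grid, ratio, rng)
    for row in masked:
        print(row)
```

In a setup like this, a model would be trained to predict the original ids at the masked positions, and at inference time it would fill in a fully masked grid over a small number of parallel refinement passes rather than token by token.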

Flexible Voice Cloning & Customization: Works with Just 3-10 Seconds of Audio

The model supports high-quality zero-shot voice cloning using only 3-10 seconds of reference audio. Users can also customize voice attributes through natural language prompts, specifying gender, age, pitch, accent, dialect, and even special effects like whispering.
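Before passing a clip to any cloning pipeline, it can help to confirm that it falls in the 3-10 second range mentioned above. The sketch below does this for uncompressed WAV files using only Python's standard library. The function names are illustrative and are not part of OmniVoice.

```python
import wave

MIN_SECONDS = 3.0   # lower bound quoted in the article
MAX_SECONDS = 10.0  # upper bound quoted in the article


def wav_duration_seconds(path: str) -> float:
    """Duration of a PCM WAV file: frame count divided by sample rate."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_suitable_reference(path: str) -> bool:
    """True if the clip length lies within the recommended range."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

Other audio formats would need a decoder that the standard library does not provide.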

Handles Non-Linguistic Symbols & Fine-Grained Pronunciation Control

OmniVoice can process non-linguistic symbols, such as [laughter], and supports pronunciation correction via pinyin or phonetic symbols. This makes it particularly well-suited for precise synthesis in Chinese and various dialects.
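To make the idea of inline non-linguistic symbols concrete, here is a small sketch that separates bracketed tags such as [laughter] from the surrounding text. Only the [laughter] tag comes from the article; the general tag pattern, the function name, and the output format are assumptions, and this is not OmniVoice's own text front end.

```python
import re

# Assumed tag syntax: lowercase letters or underscores inside square brackets.
TAG_PATTERN = re.compile(r"\[([a-z_]+)\]")


def split_text_and_tags(text: str):
    """Return a list of ("text", str) and ("tag", str) segments in order."""
    segments = []
    position = 0
    for match in TAG_PATTERN.finditer(text):
        before = text[position:match.start()].strip()
        if before:
            segments.append(("text", before))
        segments.append(("tag", match.group(1)))
        position = match.end()
    tail = text[position:].strip()
    if tail:
        segments.append(("text", tail))
    return segments


if __name__ == "__main__":
    for segment in split_text_and_tags("That was great [laughter] see you soon"):
        print(segment)
```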

Support for 600+ Languages: Aiding Digital Preservation of Minority and Endangered Languages

A key highlight of OmniVoice is its extensive language coverage, which spans both major languages and many low-resource ones. For minority and endangered languages, it can generate high-quality speech from minimal data, offering significant potential for digital language preservation and cultural protection.

OmniVoice's code and pre-trained models are now open-sourced on GitHub and Hugging Face, enabling developers to deploy it locally or integrate it into applications. AIbase will continue to monitor community feedback and real-world use cases. Developers are encouraged to share their experiences.

Project Link: https://github.com/k2-fsa/OmniVoice
