Tongyi Lab Debuts Fun-CosyVoice3.5 and Fun-AudioGen-VD Speech Models
Today, Tongyi Lab officially unveiled two FreeStyle-enabled voice generation models: Fun-CosyVoice3.5 and Fun-AudioGen-VD. The launch marks a shift in speech synthesis from reliance on preset tags to control through natural language instructions, giving users a more interactive way to "generate speech freely with a single sentence."


On the technical side, Fun-CosyVoice3.5 emphasizes multilingual voice cloning and nuanced expression, and adds support for four new languages, including Thai and Indonesian. By integrating DiffRO and GRPO reinforcement learning, the model achieves substantial improvements in prosody and audio quality similarity. Its error rate on rare characters has fallen from 15.2% to 5.3%, and first-packet latency has been reduced by 35%. Fun-AudioGen-VD complements it with a focus on sound design and scenario modeling. It supports precise, instruction-based control over gender, emotion, and spatial acoustics, so it can simulate complex, integrated scenarios ranging from a "crazy villain" voice to a "noisy café" ambiance.
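To put the reported Fun-CosyVoice3.5 figures in perspective, the short Python sketch below works out the absolute and relative drop in the rare-character error rate. The error rates and the 35% latency cut come from the article. The helper names and the 200 ms baseline latency are hypothetical, chosen only for illustration, since the article gives no absolute latency figure.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after` as a fraction."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def latency_after_cut(baseline_ms: float, cut: float = 0.35) -> float:
    """Apply a fractional latency cut (35% by default) to a baseline in ms."""
    return baseline_ms * (1.0 - cut)


# Rare-character error rates reported in the article, in percent.
RARE_CHAR_ERROR_BEFORE = 15.2
RARE_CHAR_ERROR_AFTER = 5.3

absolute_drop = RARE_CHAR_ERROR_BEFORE - RARE_CHAR_ERROR_AFTER
relative_drop = relative_reduction(RARE_CHAR_ERROR_BEFORE, RARE_CHAR_ERROR_AFTER)

print(f"Absolute drop: {absolute_drop:.1f} percentage points")
print(f"Relative drop: {relative_drop:.1%}")

# Hypothetical baseline: 200 ms is an assumed value, not from the article.
print(f"Latency after cut: {latency_after_cut(200.0):.0f} ms")
```

By this calculation, the fall from 15.2% to 5.3% is a drop of 9.9 percentage points, which is roughly a 65% relative reduction in rare-character errors.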
From an industry perspective, Tongyi Lab's release moves speech generation from a simple conversion tool toward a full-fledged creation tool. This descriptive, programmable approach to voice is directly applicable to film, gaming, and AI avatars, reducing content creation costs while significantly expanding the expressive range of human-computer interaction.
API: https://help.aliyun.com/zh/model-studio/text-to-speech?spm=a2c4g.11186623.help-menu-2400256.d_0_3_2_0.d5536a31V2tEJP
Documentation: https://help.aliyun.com/zh/model-studio/cosyvoice-clone-api?spm=a2c4g.11186623.help-menu-search-2400256.d_2