Hy3 Preview: First Open-Source Release After Hunyuan Restructuring, with Enhanced Practicality and Agent Capabilities

June 1, 2026

GregoryAdams

On April 23, Tencent’s Hunyuan team released the Hy3 preview language model as open source. It is a mixture-of-experts (MoE) model that fuses fast and slow thinking, with 295 billion total parameters, 21 billion activated parameters, and a context length of up to 256K. It is the first model trained after Hunyuan’s restructuring and the most intelligent model in Hunyuan’s history, with substantial gains in complex reasoning, instruction following, in-context learning, code generation, agent capabilities, and inference performance.
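To put the parameter counts in perspective, the short sketch below works out what share of the model is active for each token. It uses only the two figures quoted above; the variable names are illustrative, and the result says nothing about real memory use or latency, which depend on deployment details this article does not cover.

```python
# Figures quoted in the article, in billions of parameters.
TOTAL_PARAMS_B = 295
ACTIVATED_PARAMS_B = 21

# Share of parameters that take part in computing each token.
activated_fraction = ACTIVATED_PARAMS_B / TOTAL_PARAMS_B

print(f"Activated share: {activated_fraction:.1%}")
print(f"Total-to-activated ratio: {TOTAL_PARAMS_B / ACTIVATED_PARAMS_B:.1f}x")
```

Roughly 7% of the weights are involved in any single token, which is the basis for the cost-effectiveness claims later in the article.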

In February 2026, Tencent Hunyuan rebuilt its pre-training and reinforcement learning infrastructure and set out three guiding principles for achieving practical utility:

1. Systematic capabilities: Rather than favoring narrow specialization, the team recognizes that even a single application such as a code agent requires reasoning, long-context processing, instruction following, dialogue, coding, and tool use to work closely together.

2. Authentic evaluation: Moving beyond easily gamed public benchmarks, the team assesses and improves the model’s real-world effectiveness using internally developed questions, the latest exam sets, human evaluation, product-specific crowd testing, and other methods.

3. Cost-effectiveness focus: Practicality must align with commercial viability. The deeply co-designed model architecture and inference framework substantially lower task costs, making AI both affordable and effective.

Hy3 preview marks the start of Hunyuan’s accelerated pursuit of practical large language models that solve real-world problems.

Yao Shunyu, Tencent’s Chief AI Scientist, described Hy3 preview as the first step in rebuilding Hunyuan’s models. He said the team hopes the open-source release will draw genuine feedback from the community and users, which will help refine the practicality of the official Hy3 version. Meanwhile, the team continues to scale up pre-training and reinforcement learning to raise the model’s intelligence ceiling. By co-designing closely with numerous Tencent products, it aims to steadily improve the model’s real-world performance and is beginning to explore specialized model capabilities.

Hy3 preview is currently available in Tencent Cloud, Yuanbao, ima, CodeBuddy, WorkBuddy, QQ, QQ Browser, Tencent Docs, and Tencent LeXiang. It is rolling out gradually to other major products, including WeChat Official Accounts, Peace Elite, Tencent News, Tencent Stock Selection, Tencent Customer Service, and WeChat Reading. Hy3 preview also integrates with popular open-source agent frameworks such as OpenClaw, OpenCode, and KiloCode, and is listed on TokenHub, Tencent Cloud’s large model service platform.

Hy3 preview emphasizes all-round practicality, with a substantial boost in agent capabilities

Multiple evaluations show that Hy3 preview has improved across the board.

1. Outstanding in-context learning and instruction-following capabilities

Across real-world production and everyday scenarios, parsing messy, lengthy contexts and following complex, evolving rules remains a key challenge for models. Drawing on Tencent’s business use cases, Hunyuan built two new benchmarks, CL-bench and CL-bench-Life, to assess in-context learning, and significantly strengthened Hy3 preview’s in-context learning and instruction following.

2. Exceptional complex reasoning, with the top score in China on Tsinghua University’s mathematics doctoral qualification exam

Complex reasoning underpins the model’s ability to tackle varied problems. Hy3 preview performed strongly on challenging STEM reasoning benchmarks such as FrontierScience-Olympiad and IMOAnswerBench, and achieved outstanding scores on the latest Tsinghua University Qiuzhen College Mathematics Doctoral Qualification Exam (Spring 2026) and the National High School Biology Competition (CHSBO 2025), showing that its reasoning generalizes well.

3. Major advancements in code and agent capabilities, demonstrating strong cost-effectiveness

Code and agent capabilities are the most notable improvements in Hy3 preview. Thanks to the rebuilt pre-training and reinforcement learning infrastructure and a larger scale of RL tasks, the model quickly reached competitive scores on leading code agent benchmarks such as SWE-Bench Verified and Terminal-Bench 2.0, as well as search agent benchmarks such as BrowseComp and WideSearch.

In the digital domain, code measures the model’s ability to execute tasks in development environments, while search measures its capacity to retrieve, filter, and synthesize information from open sources. Together, they determine whether the model is genuinely useful in complex agent scenarios such as OpenClaw. Hy3 preview achieved strong results on evaluations such as ClawEval and WildClawBench, showing that its agent capabilities are steadily becoming more complete and practical.

Beyond public benchmarks, Tencent Hunyuan built several internal evaluation suites to gauge the model’s performance in real development settings. Hy3 preview was highly competitive across the backend engineering task set Hy-Backend, the developer-centric Hy-Vibe Bench, and the challenging software engineering set Hy-SWE Max.

Compared with other open-source models on model size and overall agent performance, Hy3 preview stands out for its cost-effectiveness.

Fully integrated into Tencent’s core businesses, with clear gains across multiple key AI products

Before its official launch, Hy3 preview was tested across Tencent’s major AI products and delivered measurable gains.

In Yuanbao, the Hunyuan and Yuanbao teams co-designed the model and product closely. The model improved on key metrics such as intent understanding accuracy, text generation quality, and deep search, and was also tuned for writing style, expression, emotional intelligence, content structure, and professionalism. This close model-product collaboration gives users a more intelligent and human-like interaction experience.

In ima’s knowledge base QA and general QA scenarios, tests showed that Hy3 preview excels at long-text processing, particularly in retrieval tasks, where its responses achieve high accuracy, coverage, and comprehensiveness.

In CodeBuddy and WorkBuddy, Hy3 preview’s first-token latency dropped by 54%, end-to-end duration decreased by 47%, and the success rate rose above 99.99%. In real user environments, it reliably drives complex agent workflows of up to 495 steps, spanning diverse office tasks such as document processing, data analysis, knowledge retrieval, and MCP toolchain orchestration.

In dedicated evaluations for the WeChat Official Accounts AI avatar and AI customer service, Hy3 preview delivered a more comprehensive upgrade over Hy2. It demonstrated greater maturity in user intent understanding, complex context continuation, and knowledge organization. When handling ambiguous queries, short sentences, and multi-turn dialogues, it better grasped user needs and produced clearer, more stable responses. By integrating knowledge bases, user memory, and contextual generation, its outputs aligned more closely with the AI avatar or customer service role, significantly reducing over-imagination, subjective assumptions, and emotional tone, and bringing the overall interaction closer to a 'trustworthy, natural, and efficient' experience.

In Peace Elite’s AI NPC scenario, the team quickly integrated and evaluated Hy3 preview after its release, with impressive overall results. For out-of-game character role-playing, Hy3 preview accurately grasped character settings and gave highly relevant, value-added answers to open-ended questions, creating a more realistic, natural, and immersive conversation. In complex in-game battle scenarios, the model’s response timing felt close to that of real players, showing excellent stability and human-like role-playing skills.

In Tencent Docs’ AI PPT scenario, Hy3 preview showed significant gains over the previous Hy2 version: generation success rate rose by 20%, evaluation scores improved by 10%, and generation time dropped by 20%. Overall, the new model performed well in template selection, color matching, outline generation, and content supplementation, with output that is free from hallucinations, thematically aligned, and visually appealing.

For QQ’s AI assistant Xiao Q, Hy3 preview brought major improvements over the prior version in long-text first-byte latency, overall response speed, and streaming efficiency. Core capabilities such as mathematical reasoning improved significantly, while multi-scenario instruction following and generalization were further enhanced. In tool-calling reasoning and multi-turn reference resolution, it performed more stably and efficiently. On OpenClaw’s official PinchBench test for the QQ intelligent agent scenario, it achieved outstanding results, and the overall user experience improved markedly.

Inference efficiency improved by 40%, delivering the best intelligence density at the same cost

Thanks to close co-design of the model and inference framework, along with comprehensive optimizations to the inference framework, operator performance, quantization algorithms, and more, overall inference efficiency improved by 40%, and Hy3 preview’s cost dropped significantly relative to the previous generation.

On TokenHub, Tencent Cloud’s large model service platform, Hy3 preview is priced as low as 1.2 yuan per million input tokens, 0.4 yuan per million cached input tokens, and 4 yuan per million output tokens. Tencent Cloud and Hunyuan have also jointly introduced a customized Hy3 preview Token Plan package, with the personal edition starting at 28 yuan per month, a cost-effective option for agent development and building 'Lobster' applications.
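As a rough illustration of what these rates mean for a single agent run, the sketch below estimates cost from token counts. The three prices are the "as low as" figures quoted above; the function name, the token counts, and the assumption that cached input tokens are billed only at the cache rate are hypothetical, so actual TokenHub billing may differ.

```python
# "As low as" prices quoted in the article, in yuan per million tokens.
PRICE_INPUT = 1.2
PRICE_CACHED_INPUT = 0.4
PRICE_OUTPUT = 4.0

TOKENS_PER_MILLION = 1_000_000


def estimate_cost_yuan(input_tokens, cached_input_tokens, output_tokens):
    """Estimate the cost of a workload in yuan.

    Assumption: cached input tokens are billed only at the cache rate
    and are not also counted as regular input tokens.
    """
    total = (
        input_tokens * PRICE_INPUT
        + cached_input_tokens * PRICE_CACHED_INPUT
        + output_tokens * PRICE_OUTPUT
    )
    return total / TOKENS_PER_MILLION


# Hypothetical agent run: 200K fresh input, 800K cached input, 50K output tokens.
cost = estimate_cost_yuan(200_000, 800_000, 50_000)
print(f"Estimated cost: {cost:.2f} yuan")
```

In this made-up workload most of the input is served from cache, which is typical of long multi-step agent sessions that resend the same context, and is why the cached rate matters as much as the headline input price.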
