ERNIE Blog — news
Breaking News · Mar 8, 2026 · 4 min read


Baidu Unveils ERNIE 5.0, 2.4T-Parameter Unified Multimodal Model at Baidu World 2025

BEIJING — Baidu on Thursday officially launched ERNIE 5.0, a 2.4 trillion-parameter unified multimodal foundation model trained from scratch, along with a suite of new AI applications and tools as the Chinese technology giant accelerates its push into global multimodal AI.

The company introduced ERNIE 5.0 as a “unified native multimodal model” at its Baidu World 2025 conference, according to founder, chairman and CEO Robin Li. The model integrates text, image, video and audio within a single autoregressive framework, aiming to overcome limitations of traditional late-fusion architectures. Baidu also unveiled upgrades to its digital human technology, the no-code application builder Miaoda, the general AI agent GenFlow, the self-evolving agent Famou, and the one-stop AI workspace Oreate.

The announcement comes as Baidu seeks to remain competitive against global leaders in the rapidly evolving multimodal AI race.

Technical Details and Model Family

According to Baidu’s ERNIE Blog, ERNIE 5.0 has 2.4 trillion parameters and supports native understanding and generation across multiple modalities in one unified system. The company claims the model achieves frontier performance across multimodal domains.

Baidu separately introduced the open-source ERNIE 4.5 model family, which includes 10 distinct variants. The family includes Mixture-of-Experts (MoE) models with 47 billion and 3 billion active parameters, meaning the parameters actually used to process each token; the largest model in the 4.5 series contains 424 billion total parameters. The open-source release is intended to give developers direct access to high-performance multimodal models.
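The gap between active and total parameters comes from how MoE layers route each token to only a few experts. The following is a minimal, generic sketch of top-k routing in plain Python, not Baidu's implementation: the function names, the eight toy experts, and the router scores are all hypothetical. The final line assumes that the 47 billion active figure belongs to the 424 billion-parameter model, which the article does not state explicitly.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    # Renormalize so the selected experts' weights sum to 1.
    return [(i, probs[i] / norm) for i in chosen]

def moe_forward(x, experts, router_scores, k):
    """Run only the k selected experts and mix their outputs by router weight."""
    routes = top_k_route(router_scores, k)
    output = sum(w * experts[i](x) for i, w in routes)
    return output, [i for i, _ in routes]

# Hypothetical toy setup: 8 "experts" that each scale the input differently.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

output, used = moe_forward(1.0, experts, router_scores, k=2)
print(used)  # indices of the two experts that actually ran

# Assumption: 47B active parameters out of 424B total for the largest model.
active_fraction = 47 / 424
print(f"{active_fraction:.1%} of parameters active per token")
```

In this toy run only two of the eight experts execute for the input, which is the same reason a large MoE model can use a small share of its weights on any single token.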

The company also announced new AI chips alongside the software advancements. Specific performance benchmarks and detailed technical specifications for ERNIE 5.0 were not fully disclosed in the initial announcements.

Broader Product Ecosystem

Beyond the flagship model, Baidu demonstrated practical applications of its AI technology. Upgrades to its digital human technology aim to create more realistic virtual presenters and assistants. The enhanced Miaoda no-code builder allows users to develop AI applications without programming expertise, while GenFlow serves as a general AI agent framework.

The new self-evolving agent Famou is designed to improve autonomously over time, and the Oreate workspace provides an integrated environment for AI development and deployment. Select offerings from this suite are being positioned for international markets as part of Baidu’s global expansion strategy.

These tools reflect Baidu’s strategy of combining foundational model advances with immediately usable enterprise and consumer applications, similar to approaches taken by competitors like OpenAI, Google and Anthropic.

Industry Impact

For developers and enterprises, the release of both the proprietary ERNIE 5.0 and the open-source ERNIE 4.5 family offers new options in the multimodal AI space. The open-source models, in particular, could accelerate adoption by lowering barriers to experimentation with large-scale MoE architectures.

The emphasis on unified multimodal processing, rather than separate models stitched together, addresses a key technical challenge in the industry. Late-fusion approaches encode each modality separately and combine the results only at the end, so they often struggle with deep cross-modal reasoning. Native unified models like ERNIE 5.0 could enable more coherent understanding across text, vision and audio.
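The difference between the two approaches can be illustrated at the level of input handling. The sketch below is a simplified illustration under stated assumptions, not a description of ERNIE 5.0's internals, which Baidu has not fully disclosed: the `Token` type, the function names, and the sample segments are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    modality: str
    value: str

def late_fusion_inputs(segments):
    """Group content by modality; each group would go to its own encoder,
    and the encoders' outputs would be merged only at the end."""
    groups = {}
    for modality, values in segments:
        groups.setdefault(modality, []).extend(values)
    return groups

def unified_sequence(segments):
    """Flatten all segments, in their original order, into one token stream."""
    return [Token(m, v) for m, values in segments for v in values]

def next_token_pairs(sequence):
    """Autoregressive training pairs: predict token t from all earlier tokens."""
    return [(sequence[:t], sequence[t]) for t in range(1, len(sequence))]

# Hypothetical input: text, then image patches, then more text.
segments = [
    ("text", ["a", "dog"]),
    ("image", ["patch0", "patch1"]),
    ("text", ["barks"]),
]

groups = late_fusion_inputs(segments)
seq = unified_sequence(segments)
pairs = next_token_pairs(seq)

context, target = pairs[-1]
print([t.modality for t in context], "->", target.value)
```

In the unified stream the final text token is predicted from a context that includes the image patches in place, while the late-fusion grouping discards where the image sat relative to the text.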

Baidu’s continued heavy investment in AI infrastructure, including new chips, signals its commitment to reducing reliance on foreign technology amid ongoing U.S.-China tensions in the semiconductor sector.

What’s Next

Baidu has not yet released a specific timeline for public API access to ERNIE 5.0 or for a full technical paper. The company indicated that select applications and tools will be made available globally in the coming months.

Industry observers will be watching for independent benchmark results comparing ERNIE 5.0 against leading models from OpenAI, Google DeepMind and Anthropic. The open-source ERNIE 4.5 models are available on platforms such as Hugging Face, which lets the research community test Baidu’s performance claims for that family directly.

As multimodal AI becomes central to next-generation applications in robotics, autonomous driving (where Baidu operates its Apollo Go platform) and content creation, Baidu’s latest releases position it as a significant player in the global competition, beyond its home market in China.

Original Source

yiyan.baidu.com
