Building specialized AI without sacrificing intelligence: Nova Forge data mixing in action — news

AWS Reveals Nova Forge Data Mixing Technique to Build Specialized AI Without Catastrophic Forgetting

SEATTLE — Amazon Web Services has demonstrated a data mixing approach for its Nova Forge platform that significantly reduces catastrophic forgetting when customizing frontier AI models for specialized enterprise tasks, according to a new evaluation by the AWS China Applied Science team.

The technique allows organizations to create optimized variants of Amazon’s Nova models, called “Novellas,” by blending their proprietary data with Nova’s curated datasets during training. This addresses a core challenge in model customization: maintaining general intelligence and safety capabilities while adapting models for specific business workflows.
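To make the idea of blending concrete, here is a minimal sketch of data mixing in general terms. It is not the Nova Forge API: the function `mix_datasets`, its `general_fraction` parameter, and the sampling strategy are all assumptions made for illustration, and AWS has not published the mixing ratios or procedure it uses.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def mix_datasets(
    proprietary: Sequence[T],
    general: Sequence[T],
    general_fraction: float,
    seed: int = 0,
) -> list[T]:
    """Blend every proprietary example with a random sample of general examples.

    general_fraction is the share of the final mix drawn from the general pool.
    Hypothetical helper for illustration only; not part of any AWS SDK.
    """
    if not 0.0 <= general_fraction < 1.0:
        raise ValueError("general_fraction must be in [0, 1)")

    rng = random.Random(seed)

    # Solve n_general / (n_proprietary + n_general) = general_fraction.
    n_general = round(len(proprietary) * general_fraction / (1.0 - general_fraction))
    # Never ask for more general examples than the pool contains.
    n_general = min(n_general, len(general))

    sampled = rng.sample(list(general), n_general)
    mixed = list(proprietary) + sampled
    # Shuffle so training batches interleave both sources.
    rng.shuffle(mixed)
    return mixed
```

With `general_fraction=0.0` the function returns only the proprietary examples, which corresponds to the "raw data alone" baseline the evaluation compares against.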

Data Mixing Preserves Core Model Abilities

In a comprehensive evaluation using a challenging Voice of Customer (VOC) classification task, the AWS China team benchmarked Nova Forge against open-source models. The results show that the data mixing approach substantially outperforms training with raw proprietary data alone.

According to the AWS blog post, the method “significantly reduces catastrophic forgetting compared to training with raw data alone,” helping preserve foundational skills including core intelligence, general instruction following capabilities, and safety guardrails.

The evaluation focused on production scenarios, and AWS recommends applying Nova data mixing “when models are expected to support multiple general-purpose workflows in production, to reduce the risk of catastrophic forgetting.”

Nova Forge Enables “Open Training” of Frontier Models

Nova Forge, introduced earlier this year, empowers organizations to build their own frontier-level models by providing exclusive access to pre-trained, mid-trained, and post-trained Nova model checkpoints. This “open training” approach lets customers mix proprietary data with Amazon-curated datasets at every stage of the training process.

Customers can perform this training in their existing reinforcement learning “gyms,” where models interact with synthetic data and simulated scenarios that mirror real-world applications. This capability positions Nova Forge as a step toward democratizing frontier model development, which has traditionally been limited to organizations with massive compute resources and proprietary training infrastructure.

The approach represents a competitive move by AWS in the rapidly evolving frontier AI landscape, where companies are racing to offer more customizable and enterprise-ready large language models while addressing the persistent challenge of balancing specialization with general capability.

Implications for Enterprise AI Development

For developers and organizations, Nova Forge’s data mixing technique offers a practical answer to a longstanding tension in AI deployment. Companies often need models finely tuned to their specific domain or customer data, but traditional fine-tuning frequently degrades a model’s general abilities, a phenomenon known as catastrophic forgetting.
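One common way to quantify catastrophic forgetting is to score the same general-purpose benchmarks before and after fine-tuning and compare the two. The sketch below assumes that setup; the function name `capability_retention` and the dictionary-of-scores format are hypothetical, and this is not the metric AWS reported.

```python
def capability_retention(
    base_scores: dict[str, float],
    tuned_scores: dict[str, float],
) -> dict[str, float]:
    """Return the tuned/base score ratio for each benchmark.

    A ratio below 1.0 means the fine-tuned model lost ground on that
    benchmark relative to the base model. Illustrative helper only.
    """
    retention: dict[str, float] = {}
    for name, base in base_scores.items():
        if name not in tuned_scores:
            raise KeyError(f"missing fine-tuned score for benchmark {name!r}")
        if base <= 0:
            raise ValueError(f"base score for {name!r} must be positive")
        retention[name] = tuned_scores[name] / base
    return retention
```

Running this once for a model trained on raw proprietary data and once for a model trained on a mixed dataset would give a per-benchmark view of how much general capability each approach preserves.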

By providing a validated methodology to mitigate this risk, AWS is enabling enterprises to deploy more specialized AI systems without sacrificing the broad capabilities that make frontier models valuable across multiple use cases.

The findings are particularly relevant for industries with complex, multi-faceted AI requirements such as customer service, content moderation, and specialized knowledge work, where models must maintain both deep domain expertise and robust general reasoning.

What’s Next for Nova Forge

AWS has not yet disclosed specific model sizes, benchmark scores, or detailed pricing for Nova Forge-powered Novella models. The company continues to position the platform as a way for organizations to create bespoke frontier models tailored to their unique needs while leveraging Amazon’s investment in foundational model development.

The AWS China Applied Science team’s evaluation provides early validation of the data mixing approach, though broader independent verification across different tasks and model scales will likely emerge as more customers adopt the platform.

As enterprise adoption of customized frontier models accelerates, techniques like Nova Forge’s data mixing could become standard practice for responsible model customization. AWS has indicated that continued research and best practices around data mixing will be shared as organizations gain more experience with the service.

The full technical evaluation is available in the AWS Machine Learning Blog post titled “Building specialized AI without sacrificing intelligence: Nova Forge data mixing in action.”
