AWS adds incremental and distributed training to Clean Rooms for scalable ML collaboration

Thursday, July 3, 2025, 01:59 PM, from InfoWorld
AWS has rolled out new capabilities for its Clean Rooms service, designed to accelerate machine learning model development for enterprises while addressing data privacy concerns.

The updates, including incremental and distributed training, are designed to help enterprises, particularly in regulated industries, analyze shared datasets securely without copying or exposing sensitive information.

Analysts say the enhancements come amid rising demand for predictive analytics and personalized recommendations.

“The need for secure data collaboration is more critical than ever, with the need to protect sensitive information, yet share data with partners and other collaborators to improve machine learning models with collective data,” said Kathy Lange, research director at IDC.

“Often, enterprises cannot collect enough of their own data to cover a broad spectrum of outcomes, particularly in healthcare applications, disease outbreaks, or even in financial applications, like fraud or cybersecurity,” Lange added.

Incremental training to help with agility

The incremental training ability added to Clean Rooms will help enterprises build upon existing model artifacts, AWS said.

Model artifacts are the key outputs of the training process, such as files and data, that are required to deploy and operationalize a machine learning model in real-world applications.

The addition of incremental training “is a big deal,” according to Keith Townsend, founder of The Advisor Bench — a firm that provides consulting services for CIOs and CTOs.

“Incremental training allows models to be updated as new data becomes available, for example as research partners contribute fresh datasets, without retraining from scratch,” Townsend said.

Seconding Townsend, Everest Group analyst Ishi Thakur said that the ability to update models with incremental data will bring agility to model development.

“Teams on AWS Clean Rooms will now be able to build on existing models, making it easier to adapt to shifting customer signals or operational patterns. This is particularly valuable in sectors like retail and finance where data flows continuously and speed matters,” Thakur said.
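
As a rough illustration of the idea Townsend and Thakur describe, the following Python sketch warm-starts a toy linear model from a saved artifact instead of retraining it from scratch. It is not AWS Clean Rooms code and calls no AWS API: the function names (`sgd_fit`, `mse`, `make_batch`), the JSON artifact format, and the synthetic data are all assumptions made for illustration.

```python
import json


def sgd_fit(data, weights=(0.0, 0.0), epochs=500, lr=0.1):
    """Fit y ~ w*x + b with per-sample SGD, starting from `weights`."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b


def mse(data, weights):
    """Mean squared error of the linear model on `data`."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


def make_batch(xs):
    """Synthetic, noise-free samples of y = 2x + 1 (illustrative data only)."""
    return [(x, 2.0 * x + 1.0) for x in xs]


initial_data = make_batch([0.0, 0.25, 0.5, 0.75, 1.0])
new_data = make_batch([1.25, 1.5, 1.75, 2.0])

# First training run: the learned weights are stored as a hypothetical artifact.
artifact = json.dumps(dict(zip(("w", "b"), sgd_fit(initial_data))))

# Later, a partner contributes new data. Load the artifact and continue
# training on the new batch only, with a small epoch budget.
saved = json.loads(artifact)
incremental = sgd_fit(new_data, weights=(saved["w"], saved["b"]), epochs=5)

# Baseline: the same small budget, but starting from zero weights.
scratch = sgd_fit(new_data, epochs=5)

print("incremental loss:", mse(new_data, incremental))
print("from-scratch loss:", mse(new_data, scratch))
```

With the same five-epoch budget, the warm-started run ends with lower error on the new batch in this toy setup, because it begins from weights that already fit the earlier data.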

In the machine learning context, enterprises typically use AWS Clean Rooms for fraud detection, advertising, and marketing, said Bradley Shimmin, lead of the data and analytics practice at The Futurum Group.

“The service is focused on building lookalike models, which is a predictive ML model of the training data that can be used to find similar users in other datasets. So, something specific to advertising use cases or fraud detection,” Shimmin added.

Distributed training to help scale model development

The distributed training ability added to Clean Rooms will help enterprises scale model development, analysts said.

“This capability helps scale model development by breaking up complex training tasks across multiple compute nodes, which is a crucial advantage for enterprises grappling with high data volumes and compute-heavy use cases,” Thakur said.

Explaining how distributed training works, IDC’s Lange pointed out that AWS Clean Rooms ML — a feature inside the Clean Rooms service — uses Docker images that are SageMaker-compatible and stored in Amazon Elastic Container Registry (ECR). 

“This allows users to leverage SageMaker’s distributed training capabilities, such as data parallelism and model parallelism, across multiple compute instances, enabling scalable, efficient training of custom ML models within AWS Clean Rooms,” Lange said, adding that other AWS components, such as AWS Glue, a serverless data integration service, are also involved.
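
To make the data parallelism Lange mentions concrete, here is a minimal Python sketch that simulates it on a single machine: the dataset is split into equal shards, each simulated worker computes a gradient on its own shard, and the gradients are averaged before one shared update. This is an assumption-laden toy, not SageMaker or Clean Rooms code; the helper names (`gradient`, `split`, `all_reduce_mean`, `data_parallel_step`) and the data are hypothetical, and real distributed training runs the workers on separate compute instances.

```python
def gradient(shard, w, b):
    """Gradient of mean squared error for y ~ w*x + b over one shard."""
    n = len(shard)
    gw = sum(2 * ((w * x + b) - y) * x for x, y in shard) / n
    gb = sum(2 * ((w * x + b) - y) for x, y in shard) / n
    return gw, gb


def split(data, num_workers):
    """Deal samples round-robin into one shard per simulated worker."""
    return [data[i::num_workers] for i in range(num_workers)]


def all_reduce_mean(grads):
    """Average per-worker gradients (stand-in for a real all-reduce)."""
    k = len(grads)
    return sum(g[0] for g in grads) / k, sum(g[1] for g in grads) / k


def data_parallel_step(data, w, b, num_workers, lr):
    """One training step: local gradients, averaged, then a shared update."""
    shards = split(data, num_workers)
    grads = [gradient(shard, w, b) for shard in shards]  # sequential here
    gw, gb = all_reduce_mean(grads)
    return w - lr * gw, b - lr * gb


# Eight synthetic samples of y = 2x + 1, so four workers get two samples each.
data = [(i * 0.25, 2.0 * (i * 0.25) + 1.0) for i in range(8)]

parallel = data_parallel_step(data, 0.0, 0.0, num_workers=4, lr=0.1)
single = data_parallel_step(data, 0.0, 0.0, num_workers=1, lr=0.1)

print("4 workers:", parallel)
print("1 worker: ", single)
```

Because the shards are equal in size, the averaged gradient matches the full-batch gradient up to floating-point rounding, so the four-worker step and the single-worker step produce the same update. Model parallelism, which splits the model itself across devices, is not shown.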

Further, The Advisor Bench’s Townsend pointed out that AWS Clean Rooms’ distributed training feature will particularly help in two-party collaborations where one partner lacks deep expertise in distributed machine learning infrastructure.

Vendors such as Microsoft, Google, Snowflake, Databricks, and Salesforce also offer data clean rooms.

While Microsoft offers Azure Confidential Clean Room as a service designed to facilitate secure, multi-party data collaboration, Google offers BigQuery Clean Room, a tool built on BigQuery’s Analytics Hub and focused on multi-party data analysis, where data from a variety of contributors can be combined, with privacy protections in place, without the need to move or expose raw data.

Salesforce’s clean rooms feature is expected to be added to its Data Cloud by the end of the year, Shimmin said.

Demand for clean rooms expected to grow

The demand for clean rooms is expected to grow in the coming months.

“I expect we’ll see increased interest in and adoption of Clean Rooms as a service in the next 12-18 months,” said Shelly Kramer, founder and principal analyst at Kramer & Company, pointing out the deprecation of third-party cookies and increasingly challenging privacy regulations. “In data-driven industries, solutions for first-party data collaboration that can be done securely are in demand. That’s why we are seeing Clean Rooms as a service quickly becoming a standard. While today the early adopters are in some key sectors, the reality is that all enterprises today are, or should be, data-driven.”

Separately, IDC’s Lange pointed out that demand for clean rooms is growing specifically because of the rise in the volume and variety of data being stored and analyzed for patterns.

However, Kramer pointed out that enterprises may have challenges around the adoption of clean rooms.

“Integrating with existing workflows is a key challenge, as clean rooms don’t naturally fit within standard campaign planning and measurement processes. Therefore, embedding and operationalizing them effectively can require some effort,” Kramer said.
https://www.infoworld.com/article/4016728/aws-adds-incremental-and-distributed-training-to-clean-roo...
