Communications of the ACM

ACM TechNews

Artificial Data Give the Same Results as Real Data--Without Compromising Privacy


A representation of artificial data.

Massachusetts Institute of Technology researchers have developed a machine-learning system that automatically creates synthetic data.

Credit: MIT News

Researchers at the Massachusetts Institute of Technology (MIT) have developed the Synthetic Data Vault (SDV), a machine-learning system that automatically creates synthetic data. Such artificial data can be used in data science efforts that otherwise would be thwarted due to limited access to authentic data.

Using authentic data raises significant privacy concerns, whereas synthetic data can still be used to develop and test data science algorithms and models without exposing real records.

The SDV algorithm, known as recursive conditional parameter aggregation, exploits the hierarchical organization of data common to all databases.
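
As a rough illustration of what aggregating parameters up a table hierarchy could look like, the sketch below summarizes the child rows belonging to each parent row and attaches those summaries to the parent as extra columns. It is not SDV's code or API: the function name `aggregate_child_parameters`, the `customers` and `orders` tables, and the choice of count, mean, and standard deviation as the "parameters" are all assumptions made for this example.

```python
import statistics
from collections import defaultdict


def aggregate_child_parameters(parents, children, key, column):
    """Return copies of the parent rows, each extended with summary
    parameters (count, mean, population std) of its child rows.

    Hypothetical helper for illustration only; not part of SDV.
    """
    # Group the child values by the parent key they reference.
    grouped = defaultdict(list)
    for child in children:
        grouped[child[key]].append(child[column])

    extended = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        row = dict(parent)
        row[f"{column}_count"] = len(values)
        row[f"{column}_mean"] = statistics.mean(values) if values else 0.0
        # pstdev is defined for a single value (it returns 0.0).
        row[f"{column}_std"] = statistics.pstdev(values) if values else 0.0
        extended.append(row)
    return extended


# Hypothetical two-level hierarchy: customers (parent) and orders (child).
customers = [{"customer_id": 1, "age": 34}, {"customer_id": 2, "age": 51}]
orders = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 2, "amount": 5.0},
]

extended = aggregate_child_parameters(customers, orders, "customer_id", "amount")
```

In a deeper hierarchy the same step would presumably be applied from the leaf tables upward, so that each parent row ends up carrying a compact description of everything beneath it.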

The researchers found the synthetic data can successfully replace real data in software writing and testing. They also note the SDV can be scaled to create very small or very large synthetic datasets, facilitating rapid development cycles or stress tests for big data systems.
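
The scaling point can be pictured with a toy generator: once a model has been fitted, the number of rows drawn from it is just a parameter. The sketch below assumes independent Gaussian columns, which is a simplification chosen for brevity and not SDV's actual model; `fit_column`, `sample_rows`, and the sample values are hypothetical.

```python
import random
import statistics


def fit_column(values):
    """Fit a Gaussian to one column; returns (mean, population std)."""
    return statistics.mean(values), statistics.pstdev(values)


def sample_rows(params, n, seed=0):
    """Draw n synthetic rows, sampling each column independently."""
    rng = random.Random(seed)
    return [
        {name: rng.gauss(mu, sigma) for name, (mu, sigma) in params.items()}
        for _ in range(n)
    ]


# Hypothetical "real" data, used only to fit the toy model.
real = {"age": [34, 51, 29, 42], "amount": [10.0, 30.0, 5.0, 22.5]}
params = {name: fit_column(values) for name, values in real.items()}

small = sample_rows(params, 10)        # quick development runs
large = sample_rows(params, 100_000)   # stress-test volume
```

The same fitted parameters serve both calls; only the requested row count changes.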

From MIT News

Abstracts Copyright © 2017 Information Inc., Bethesda, Maryland, USA


 
