Communications of the ACM

ACM TechNews

Making Big Data a Little Smaller

Pre-processing large data into lower dimensions is key to speedy algorithmic processing.

Credit: Harvard University

Harvard University professor Jelani Nelson and Kasper Green Larsen of Aarhus University in Denmark have proven that the Johnson-Lindenstrauss lemma (JL lemma) is optimal for reducing data dimensionality.

"We have proven that there are 'hard' datasets for which dimensionality reduction beyond what's provided by the JL lemma is impossible," Nelson says.

The JL lemma states that for any finite collection of points in a high-dimensional space, there is a collection of points in a lower-dimensional space that approximately preserves all distances between the points. Researchers have found that the lemma can serve as a preprocessing step, reducing the dimensionality of data before algorithms are run on it.

The lemma works by mapping the points into the lower dimension in a way that approximately retains the geometry of the data, including the distances and angles between data points.
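
As an illustration of the idea, the following minimal sketch projects points through a random Gaussian matrix, one common way to realize a JL-style map, and measures how much the pairwise distances change. It is not taken from Nelson and Larsen's work; the function names `jl_project` and `max_distortion`, the dimensions, and the seeds are all assumptions chosen for the example.

```python
import math
import random


def jl_project(points, target_dim, seed=0):
    """Map each point to target_dim coordinates using a random Gaussian matrix.

    Hypothetical helper for illustration; entries have variance 1/target_dim
    so that squared lengths are preserved in expectation.
    """
    rng = random.Random(seed)
    source_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [
        [rng.gauss(0.0, scale) for _ in range(source_dim)]
        for _ in range(target_dim)
    ]
    # Each output coordinate is the dot product of one matrix row with the point.
    return [[sum(r * x for r, x in zip(row, p)) for row in matrix] for p in points]


def max_distortion(original, projected):
    """Largest relative change in distance over all pairs of points."""
    worst = 0.0
    for i in range(len(original)):
        for j in range(i + 1, len(original)):
            before = math.dist(original[i], original[j])
            after = math.dist(projected[i], projected[j])
            worst = max(worst, abs(after / before - 1.0))
    return worst


# Assumed toy data: 20 random points in 500 dimensions, reduced to 200.
data_rng = random.Random(1)
data = [[data_rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(20)]
reduced = jl_project(data, target_dim=200)
distortion = max_distortion(data, reduced)
print(len(reduced[0]), round(distortion, 3))
```

Raising `target_dim` should generally shrink the reported distortion, while lowering it lets the distortion grow, which is the trade-off the lemma quantifies.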

Noga Alon, a professor at Tel Aviv University in Israel, says Nelson and Larsen's work addresses "a logarithmic gap...between the upper and lower bounds for the minimum possible dimension required as a function of the number of points and the distortion allowed."
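
For readers who want the bounds in symbols, the block below is a hedged summary based on the standard statement of the lemma and the published bounds, not on the article itself. The symbols $n$ (number of points), $\varepsilon$ (allowed distortion), $d$ (original dimension), and $m$ (reduced dimension) are notation introduced here, and the new lower bound is understood to hold only when $\varepsilon$ is not too small relative to $n$ and $d$.

```latex
% JL lemma (upper bound): for any x_1, ..., x_n in R^d and 0 < eps < 1,
% there is a map f : R^d -> R^m with
\[
  m = O\!\left(\varepsilon^{-2} \log n\right)
\]
% such that, for all pairs i, j,
\[
  (1-\varepsilon)\,\lVert x_i - x_j \rVert
  \;\le\; \lVert f(x_i) - f(x_j) \rVert
  \;\le\; (1+\varepsilon)\,\lVert x_i - x_j \rVert .
\]
% Earlier lower bound, leaving a logarithmic gap:
\[
  m = \Omega\!\left(\frac{\varepsilon^{-2} \log n}{\log(1/\varepsilon)}\right)
\]
% Lower bound matching the lemma, so the gap closes:
\[
  m = \Omega\!\left(\varepsilon^{-2} \log n\right)
\]
```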

From Harvard University


Abstracts Copyright © 2017 Information Inc., Bethesda, Maryland, USA

