
Communications of the ACM

ACM TechNews

Supercomputing Speeds Up Deep Learning Training


Two-dimensional embedding of images from the ImageNet database, extracted by a convolutional neural network using Caffe.


Credit: Andrej Karpathy

A team led by researchers at the Texas Advanced Computing Center (TACC) recently published the results of an effort to use supercomputers to train a deep neural network for rapid image recognition.

They used the Stampede2 supercomputer to complete a 100-epoch ImageNet training with AlexNet in 11 minutes, marking the fastest time recorded to date.

The team also completed a 90-epoch ImageNet training with ResNet-50 in 32 minutes.

"These results show the potential of using advanced computing resources...along with large mini-batch enabling algorithms, to train deep neural networks interactively and in a distributed way," says TACC's Zhao Zhang.

The research involved developing a Layer-Wise Adaptive Rate Scaling (LARS) algorithm that distributes data efficiently across many processors, which then compute simultaneously on batches of up to 32,000 items.
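The article does not spell out the update rule, so the following is only a minimal sketch of the idea behind Layer-Wise Adaptive Rate Scaling as it is commonly described: each layer gets its own learning rate, scaled by the ratio of its weight norm to its gradient norm, which helps keep training stable at very large batch sizes. The function names, default hyperparameters, and the omission of momentum are assumptions made for illustration, not the team's implementation.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lars_step(weights, grads, global_lr=0.1, trust_coef=0.001,
              weight_decay=0.0005, eps=1e-9):
    """Apply one LARS-style update to a single layer (no momentum).

    All parameter names and defaults here are illustrative assumptions.
    """
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)

    if w_norm > 0 and g_norm > 0:
        # Layer-wise "trust ratio": a large gradient relative to the
        # weights yields a proportionally smaller step for this layer.
        local_lr = trust_coef * w_norm / (g_norm + weight_decay * w_norm + eps)
    else:
        # Fall back to the plain global rate for degenerate layers.
        local_lr = 1.0

    step = global_lr * local_lr
    return [w - step * (g + weight_decay * w) for w, g in zip(weights, grads)]


if __name__ == "__main__":
    layer = [3.0, 4.0]
    small = lars_step(layer, [0.6, 0.8], global_lr=1.0,
                      trust_coef=0.1, weight_decay=0.0)
    large = lars_step(layer, [600.0, 800.0], global_lr=1.0,
                      trust_coef=0.1, weight_decay=0.0)
    print(small, large)
```

With weight decay set to zero, the two calls in the usage example move the weights by nearly the same amount even though the second gradient is 1,000 times larger. That insensitivity to gradient scale is the property that makes the layer-wise scaling attractive for large mini-batches.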

"By not having to migrate large datasets between specialized hardware systems, the time to data-driven discovery is reduced and overall efficiency can be significantly increased," says TACC's Niall Gaffney.

From Texas Advanced Computing Center

 

Abstracts Copyright © 2017 Information Inc., Bethesda, Maryland, USA


 
