
Communications of the ACM

ACM TechNews

Researchers Train Neural Network to Recognize Chemical Formulas From Research Papers

drawing of a molecule

Researchers created a comprehensive data generator that stochastically simulates various drawing styles to train their recognition model.

Scientists at Russia's Syntelly automation startup, Lomonosov Moscow State University, and the Sirius University of Science and Technology have trained neural networks to automatically identify chemical formulas in research papers by converting images of organic structures into molecular structures.

The researchers used Google's Transformer machine translation neural network to convert images of molecules or molecular templates into textual representations called Functional-Group-SMILES. The network learned to recognize whatever it was shown, as long as the relevant depiction style was represented in the training data.

The researchers also designed a data generator to produce examples of molecular templates by blending randomly chosen molecule fragments and depiction styles. The work is published in the journal Chemistry-Methods.
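As a rough illustration of that idea, the sketch below pairs a randomly chosen fragment combination with a randomly chosen depiction style. It is a toy under stated assumptions, not the researchers' generator: the fragment pools, the style options, and the `generate_example` function are all hypothetical, and the target strings are not checked for chemical validity.

```python
import random

# Hypothetical fragment pools written as SMILES-like strings.
# These are illustrative only and are not taken from the paper.
CORES = ["c1ccccc1", "C1CCCCC1", "c1ccncc1"]
SUBSTITUENTS = ["C", "O", "N", "Cl", "C(=O)O"]

# Hypothetical depiction-style options; the real generator's
# parameters are not described in this summary.
STYLE_OPTIONS = {
    "bond_width": [1, 2, 3],
    "font": ["serif", "sans"],
    "rotation_deg": [0, 90, 180, 270],
}


def generate_example(rng):
    """Return one (target string, style) training pair chosen at random."""
    # Blend randomly chosen fragments into a toy target string.
    target = rng.choice(SUBSTITUENTS) + rng.choice(CORES)
    # Independently sample one value for each style option.
    style = {name: rng.choice(values) for name, values in STYLE_OPTIONS.items()}
    return {"target": target, "style": style}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(3):
        print(generate_example(rng))
```

In a full pipeline, each sampled pair would presumably be rendered to an image in the chosen style, with the image as the model input and the string as the expected output.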

"Our study is a good demonstration of the ongoing paradigm shift in the optical recognition of chemical structures," says Sergey Sosnin, CEO of Syntelly, which was founded at Skoltech.

From Skolkovo Institute of Science and Technology


Abstracts Copyright © 2022 SmithBucklin, Washington, DC, USA


