AI for Drug Design

Machine learning can speed traditional drug development.
Some recent breakthroughs in drug discovery have come about thanks to the use of artificial intelligence.

Traditional drug development is slow and expensive. It often takes more than 10 years for a new medicine to come to market, and it can cost up to $2.6 billion. In the past few years, however, there has been growing interest in using machine learning to help with the process.

"The idea is that you can screen billions of molecules on a computer and identify some which look promising, and then you just manufacture and test the small subset," says Regina Barzilay, Delta Electronics Professor of the Massachusetts Institute of Technology (MIT) Department of Electrical Engineering and Computer Science, and a member of the university's Computer Science and Artificial Intelligence Lab.

The vast size of chemical space is one of the main challenges when it comes to finding new drugs. Medicinal chemists look for new small molecules, and there could be up to 1 novemdecillion (1 followed by 60 zeros) of them, according to the American Chemical Society, more than some estimates of the number of stars in the universe. Although researchers have zeroed in on millions of these compounds through traditional methods, the number that have been synthesized and tested as drugs is thought to represent less than 0.1% of the potential drugs that exist. "The machine learning community identified it as an important area where we can contribute," says Barzilay.

There have been some recent breakthroughs in drug discovery, thanks to artificial intelligence (AI). In recent work, Barzilay and her colleagues used a deep learning system to discover a new antibiotic, a first for the field. The newly discovered medicine proved effective in tests on mice against a wide range of bacteria, including the bacterium that causes tuberculosis and bacterial strains that have demonstrated resistance to current antibiotics.

Barzilay and her team decided to focus on antibiotics since a lack of new antibiotics is creating a growing health crisis. Existing antibiotics are no longer effective against many infections, as bacteria have grown resistant. Just eight new antibiotics with limited effectiveness have been approved since July 2017, according to a recent report by the World Health Organization.

To tackle the problem, the researchers developed a deep learning convolutional neural network (CNN) that can predict the antibiotic properties of new compounds. It was first trained to recognize molecules that inhibit the growth of E. coli bacteria by feeding it a collection of about 2,500 molecules whose antibacterial capabilities were known. Then, the system was presented with a library, called the Drug Repurposing Hub, containing over 6,000 molecules identified as potentially useful against various human diseases. It was asked to predict which molecules were both active against E. coli and structurally different from existing antibiotics.
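The screening step described above can be sketched roughly as follows. This is a minimal illustration, not the MIT team's model: the fingerprints are toy sets of integers, `toy_model` stands in for a trained network, and the two cutoffs are arbitrary assumptions. The sketch keeps only molecules that are predicted to be active and whose Tanimoto similarity to every known antibiotic is low.

```python
from typing import Callable, Dict, FrozenSet, List

# Toy stand-in for a molecular fingerprint (a set of feature IDs).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: shared features / all features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def screen(
    library: Dict[str, Fingerprint],
    known_antibiotics: List[Fingerprint],
    predicted_activity: Callable[[Fingerprint], float],
    min_activity: float = 0.5,    # assumed cutoff, not from the article
    max_similarity: float = 0.4,  # assumed cutoff, not from the article
) -> List[str]:
    """Return names of predicted-active, structurally novel molecules."""
    hits = []
    for name, fp in library.items():
        score = predicted_activity(fp)
        if score < min_activity:
            continue  # predicted inactive
        # Similarity to the closest known antibiotic.
        nearest = max((tanimoto(fp, k) for k in known_antibiotics), default=0.0)
        if nearest <= max_similarity:
            hits.append((score, name))
    # Highest predicted activity first.
    return [name for _, name in sorted(hits, reverse=True)]


def toy_model(fp: Fingerprint) -> float:
    """Hypothetical predictor: fraction of three 'active' features present."""
    return len(fp & {1, 2, 3}) / 3


library = {
    "mol_a": frozenset({1, 2, 3, 10, 11}),   # active and novel
    "mol_b": frozenset({1, 2, 20, 21, 22}),  # active but close to a known drug
    "mol_c": frozenset({30, 31, 32}),        # predicted inactive
}
known = [frozenset({1, 2, 20, 21, 23})]

shortlist = screen(library, known, toy_model)
print(shortlist)
```

In a real pipeline the fingerprints would come from a cheminformatics toolkit and the predictor from a trained model; only the filter-then-rank structure is meant to carry over.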

One result was the new antibiotic halicin (named for the intelligent computer HAL in the movie 2001: A Space Odyssey). The compound had previously been investigated as a potential treatment for diabetes.

When the deep learning model was applied to another chemical library with over 107 million molecules, it came up with a further eight candidates. "We identified quite a few very strong molecules that show promising activity," says Barzilay.

However, Barzilay stresses that the key to their success was working with a team of biology experts who could verify how halicin worked. Since AI doesn't have any knowledge of biology, the compound had to be tested to see, for example, whether it would keep working if bacteria evolved resistance to it. "There was a lot of testing on the biology side to demonstrate that it's doing non-trivial things," says Barzilay.

Another challenge in developing a drug is optimizing its properties. Even though a compound may be effective against a disease, it must also meet other criteria, such as having low toxicity and being water-soluble. There are usually between 10 and 20 such objectives that need to be met for a compound to get the green light, says Adam Skiredj, chemistry and business development manager at Iktos, a company in Paris, France, that is developing AI technology for drug development. Skiredj says the objectives are often interrelated, so that improving one of them comes at the expense of another. "It's like a very complex Rubik's cube to solve," he says.

Skiredj says Iktos has developed deep learning algorithms that can be combined to identify compounds fulfilling multiple criteria. The company's AI system first learns what makes a compound chemically viable by extracting chemical rules from large amounts of public data: for instance, carbon and nitrogen can each form only a limited number of bonds with other atoms. Then, using project-specific data on potentially active molecules, models are built to predict how a compound will perform on each individual objective. The system then uses a fitness function developed from the models to generate compounds that best meet all the criteria. "This means that you are generating a compound which fits in terms of chemical structures and also (the biological and chemical) objectives of the project," says Skiredj.
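One common way to fold several predicted objectives into a single fitness score is to map each prediction onto a 0-to-1 desirability and take the geometric mean, so that a compound failing any one objective scores poorly overall. The sketch below assumes that approach; the article does not describe Iktos's actual fitness function, and the objective names, ranges, and predicted values here are hypothetical.

```python
import math
from typing import Dict, Tuple

# Hypothetical objectives: (worst value, best-case bound, higher_is_better).
OBJECTIVES: Dict[str, Tuple[float, float, bool]] = {
    "potency": (5.0, 9.0, True),
    "solubility": (-6.0, -2.0, True),
    "toxicity": (0.0, 1.0, False),  # lower predicted toxicity is better
}


def desirability(value: float, low: float, high: float,
                 higher_is_better: bool) -> float:
    """Linearly rescale a prediction to [0, 1], clamping out-of-range values."""
    scaled = (value - low) / (high - low)
    scaled = min(1.0, max(0.0, scaled))
    return scaled if higher_is_better else 1.0 - scaled


def fitness(predictions: Dict[str, float]) -> float:
    """Geometric mean of per-objective desirabilities."""
    scores = [
        desirability(predictions[name], *spec)
        for name, spec in OBJECTIVES.items()
    ]
    if min(scores) == 0.0:
        return 0.0  # failing any single objective zeroes the fitness
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


# Two hypothetical candidates with per-objective model predictions.
balanced = {"potency": 7.0, "solubility": -4.0, "toxicity": 0.5}
lopsided = {"potency": 9.0, "solubility": -2.0, "toxicity": 1.0}

print(fitness(balanced))
print(fitness(lopsided))
```

The geometric mean is chosen here because it penalizes trade-offs of the kind Skiredj describes: the lopsided candidate is ideal on two objectives but fails the third, so it ranks below a candidate that is merely acceptable on all three.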

Iktos recently applied its system to a drug discovery project with 11 objectives, and was able to generate a list of promising compounds in just three weeks. In the project's dataset of 880 molecules whose pharmacological activity had been tested, the best molecule identified by medicinal chemists scored well on nine of the 11 objectives. By contrast, the deep learning algorithm generated a final shortlist of 11 promising compounds that were then tested; one was found to meet all the objectives, while two others scored highly on 10 criteria. "That's one of our success stories," says Skiredj.

The company has also just developed an AI-driven system that can help medicinal chemists synthesize new molecules. Discovering a molecule is only part of the solution, since it then has to be made. The new system "uses AI and data to come up with the recipes of these molecules, just like cooking," says Skiredj. "It's a pain point for chemists and something that our clients were asking for."

The role of AI in drug discovery has been controversial, since that role has not always been clear. In one case, a 'new' molecule found with deep learning turned out to be similar in structure to an existing drug. Also, training data for the AI is chosen by medicinal chemists, so "unless we know what exactly the machine learning algorithm did and what a human expert has done, we really cannot appreciate the contribution," says Barzilay.

However, deep learning is likely to find its place in the field and to be used increasingly. Barzilay thinks the use of deep learning as a drug discovery tool will become commonplace in the long run, and that we will soon see a commercially available drug that AI helped to discover.

Ultimately, the way that drug discovery is done, and the steps currently used, may be completely transformed. "I think the next frontier is really to change the whole process with the capacities of machine learning," says Barzilay. "I'm a firm believer that we're going to see that soon as well."

Sandrine Ceurstemont is a freelance science writer based in London, U.K.

Communications of the ACM (CACM) is now a fully Open Access publication.