ACM Careers
# RIT Researchers Create Math-Aware Search Interface

Researchers at Rochester Institute of Technology have developed MathDeck, an online search interface that allows anyone to easily create, edit, and look up sophisticated math formulas from a computer.

Created by an interdisciplinary team of more than a dozen faculty and students, MathDeck aims to make math notation interactive and easily shareable, rather than an obstacle to mathematical study and exploration. The math-aware search interface is free to the public and available to use at mathdeck.cs.rit.edu.

Researchers said the project stems from a growing public interest in being able to do web searches with math keywords and formulas. However, for many people, it can be difficult to accurately express sophisticated math without an understanding of the scientific markup language LaTeX.
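
To illustrate the barrier, here is roughly what a user would have to type to express the familiar quadratic formula in LaTeX. The choice of formula is only an example and does not come from the MathDeck team.

```latex
% Linear LaTeX source for a formula that is normally read in two dimensions:
% a fraction with a square root inside its numerator.
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
```

A single misplaced brace changes or breaks the expression, which is part of why formula entry is hard without LaTeX experience.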

With MathDeck, users can now enter and edit formulas in multiple ways, including handwriting, uploading a typeset formula image, and typing LaTeX. Using image processing and machine learning techniques, the interface is able to recognize formula images and hand-drawn symbols.

"With such a tool in hand, it will be much easier for experts and non-experts to enter complicated formulas and symbols accurately and have the search engines find mathematically relevant answers quickly and effectively," says Anurag Agarwal, associate professor in RIT's School of Mathematical Sciences. "It can also help people from different disciplines to collaborate, share their findings, and perform searches more productively."

MathDeck is one piece of a larger project called MathSeer, which is supported by nearly $1 million in funding from the U.S. National Science Foundation and the Alfred P. Sloan Foundation. MathSeer is led by Richard Zanibbi, professor of computer science at RIT, Agarwal, Penn State University Professor C. Lee Giles, and University of Maryland, College Park Professor Douglas W. Oard.

"The goal of MathSeer is to produce new technologies to provide 'math search for the masses,'" says Zanibbi, who is also director of RIT's Document and Pattern Recognition Lab in the Golisano College of Computing and Information Sciences. "This involves creating new search interfaces, AI algorithms for handwritten and image input, and search engine technologies that better support formulas in queries."

To create a useful interface for MathDeck, the team had to better understand users' search behavior, including how users express their queries and what types of documents they are looking for. They also noted that in mathematics, expressions and symbols often have multiple meanings and contexts.

"To tackle these complexities, we used our knowledge and expertise in math to make the system 'aware' of the mathematical nuances, so that it can interpret and represent the mathematical connection between the various objects in formulas with high accuracy, thereby resulting in effective search," Agarwal says.

The interface will also help users save time, because they can save their sessions and favorite formulas, then manipulate and reuse them without having to re-enter each formula.

"Entering math formulas is a big challenge from the user's perspective, as math is typically expressed in a two-dimensional space, while typing only produces a sequence of characters," says Gavin Nishizawa, a computer science master's student who was lead developer on the project.

MathDeck includes an auto-complete function for formulas and keywords. If users are searching for a popular symbol or formula, they'll likely find an entity card. The card shows the formula, the name of its associated concept, and a brief description.

"In formula search, there are math-specific challenges, including 'equivalent' formulas with different variable names or terms in another order," says Nishizawa, who also completed a software engineering degree at RIT in 2018. "For formula autocomplete, MathDeck searches entity cards by recognizing a formula's structure, passing its structure representation into a neural network, and then producing an embedding vector that is compared against formulas in the entity cards."

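The general idea Nishizawa describes, matching formulas by structure rather than by exact text, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not MathDeck's implementation: formulas are written by hand as nested tuples, a simple bag of operator paths stands in for the neural network embedding, and the names `normalize`, `features`, `embed`, and `cosine` are hypothetical.

```python
import math
from collections import Counter

# Assumption: only these operators are treated as order-independent.
COMMUTATIVE = {"+", "*"}


def normalize(node):
    """Replace variable names with a placeholder and sort commutative operands."""
    if isinstance(node, str):
        # Assumption: alphabetic leaves are variables, everything else is a constant.
        return "VAR" if node.isalpha() else node
    op, *args = node
    args = [normalize(a) for a in args]
    if op in COMMUTATIVE:
        args.sort(key=repr)
    return (op, *args)


def features(node, path=()):
    """Yield one root-to-leaf path of operators per leaf in the formula tree."""
    if isinstance(node, str):
        yield path + (node,)
        return
    op, *args = node
    for a in args:
        yield from features(a, path + (op,))


def embed(formula):
    """Stand-in for a learned embedding: a sparse count vector of tree paths."""
    return Counter(features(normalize(formula)))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


# x + 2*y and b*2 + a differ only in variable names and term order.
f1 = ("+", "x", ("*", "2", "y"))
f2 = ("+", ("*", "b", "2"), "a")
# x^2 has a different structure.
f3 = ("^", "x", "2")

same = cosine(embed(f1), embed(f2))
different = cosine(embed(f1), embed(f3))
```

Here `f1` and `f2` normalize to the same tree, so their similarity is at the maximum, while `f3` shares no paths with `f1` and scores zero. A production system would replace the path counts with a trained model and rank many candidate entity cards by similarity.
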
When it comes time to submit a query, users can select from 11 search engines, including standard search engines like Google, and more math-focused systems, including Wolfram Alpha and Math Stack Exchange.

Zanibbi says the team plans to extend MathDeck in the future. They are creating techniques to make formulas searchable in large PDF collections and working to improve formula and text search, as well as formula recognition in handwriting and images.

Zanibbi, Agarwal, Oard, and RIT computing and information sciences Ph.D. student Behrooz Mansouri are also running ARQMath, an international task to benchmark and improve math-aware search technologies.

"There is a lot of complexity around math, so making the use of math more intuitive can help address many problems in math and science," Nishizawa says. "Research in this area can have a significant positive impact on things like math literacy, understanding mathematical ideas, and improving people's quality of life."
