
Technical Perspective: An Elegant Model for Deriving Equations

We have encountered units and elementary dimensional analysis in our high school science classes. For instance, the mass of an object is expressed in kilograms (kg). Likewise, length is expressed in meters (m) and time in seconds (s). Other physical quantities have derived dimensions: acceleration has dimensions m s^-2 (from its definition), whereas force has dimensions kg m s^-2. The latter follows from Newton's second law, which states that force (F) is equal to mass (m) times acceleration (a). Thus, the dimensions of quantities reflect important relationships between them.

Suppose we are onboard an aircraft with an array of sensors that are independently measuring, among other things, the values of force, mass and acceleration. We could use the equation F = ma to check, for instance, that a single sensor has not failed. However, in many cases, deriving such laws from "first principles" may be quite cumbersome, if not outright impossible.
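As a minimal sketch of such a check, the snippet below compares a measured force against the product of measured mass and acceleration. The function name `newton_consistent` and the 5% relative tolerance are illustrative assumptions, not taken from the paper.

```python
def newton_consistent(force_n, mass_kg, accel_ms2, rel_tol=0.05):
    """Return True if the three readings agree with F = m * a.

    rel_tol is an assumed relative tolerance; a real system would
    derive it from the noise characteristics of the sensors.
    """
    predicted = mass_kg * accel_ms2
    scale = max(abs(force_n), abs(predicted))
    if scale == 0.0:
        # All readings are zero, which trivially satisfies F = m * a.
        return True
    return abs(force_n - predicted) / scale <= rel_tol


ok = newton_consistent(10.0, 2.0, 5.0)       # readings agree
suspect = newton_consistent(10.0, 2.0, 8.0)  # readings disagree
```

A failed check only indicates that at least one of the three readings is inconsistent with the others; identifying which sensor is at fault would need additional redundancy.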

Imagine a system running by a patient's bedside in the intensive care unit of a hospital with a continuous stream of data that includes the patient's blood pressure BP (kg m^-1 s^-2), lung volume V (m^3), pulse P (s^-1), and body weight W (kg). In this situation, it is unclear whether there are "precise" equations derivable from first principles, or even "approximate" empirical equations that may hold under some situations. Be they exact or approximate, these relationships are useful in numerous applications such as the run-time monitoring of safety critical systems.

Discovering possible relationships between various quantities given observational data suffers from the classic "needle in the haystack" problem. The number of possible hypotheses is astronomically large whereas, in practice, very few of these hypotheses will survive empirical tests. The following paper addresses the key problem of discovering relationships that hold between physical quantities from data using dimensional analysis to drastically narrow down the space of hypotheses.

Machine learning provides many powerful approaches for regression using neural network models to detect relationships between quantities. However, many of the existing approaches do not consider the dimensions of the quantities being modeled. The authors propose a simple, yet elegant approach based on the idea of dimensional analysis in physics: a powerful approach that can postulate possible physical relationships by examining the dimensions of the quantities being related. The "Buckingham π" theorem, which formalizes earlier methods going back to the 19th century, provides an elegant recipe for generating such relationships by finding dimensionless products of the measured quantities. Using this, given the dimensions of the quantities measured, we may set up a system of linear equations whose solutions are the exponents of such products. For instance, F × m^-1 × a^-1 is seen to be dimensionless using this approach, from the dimensions of F, m, and a. Similarly, for the ICU bedside monitor described earlier, the quantity BP × V^(1/3) × P^-2 × W^-1 is dimensionless. However, unlike Newton's second law, the relationship between blood pressure and pulse is much more complex and variable. Thus, suitable statistical tests on the data are used to further classify the relationships obtained from the inference approach presented in the paper.
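The linear system mentioned above can be sketched as a null-space computation: each quantity contributes a column of exponents over the base dimensions (kg, m, s), and every exponent vector in the null space yields a dimensionless product. The code below is an illustrative sketch using exact rational arithmetic, not the authors' implementation; the function name `dimensionless_exponents` is hypothetical.

```python
from fractions import Fraction


def dimensionless_exponents(dim_matrix):
    """Basis of exponent vectors x satisfying dim_matrix @ x = 0.

    dim_matrix has one row per base dimension (kg, m, s) and one
    column per measured quantity. Each basis vector is scaled so
    that its first nonzero exponent is 1.
    """
    rows = [[Fraction(x) for x in row] for row in dim_matrix]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot here, so this column is free
        rows[r], rows[p] = rows[p], rows[r]
        piv = rows[r][c]
        rows[r] = [x / piv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -rows[row_idx][free]
        lead = next(x for x in v if x != 0)
        basis.append([x / lead for x in v])
    return basis


# Columns: F, m, a. Rows: exponents of kg, m, s.
newton_basis = dimensionless_exponents([[1, 1, 0],
                                        [1, 0, 1],
                                        [-2, 0, -2]])
# Exponents (1, -1, -1), i.e. F × m^-1 × a^-1

# Columns: BP, V, P, W. Rows: exponents of kg, m, s.
icu_basis = dimensionless_exponents([[1, 0, 0, 1],
                                     [-1, 3, 0, 0],
                                     [-2, 0, -1, 0]])
# Exponents (1, 1/3, -2, -1), i.e. BP × V^(1/3) × P^-2 × W^-1
```

Rational arithmetic keeps fractional exponents such as 1/3 exact. A floating-point null-space routine would also work, but would need a tolerance to decide which entries count as zero.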

The authors demonstrate that their approach effectively derives physical relationships from observational data for systems such as an unpowered glider and a pendulum. It empirically discovers Newton's equations, which are then used to accurately predict the altitude of the glider, as well as the familiar relationship between the length of the pendulum and its time of oscillation. A more sophisticated and general approach uses the derived dimensionless parameters as input features to train machine learning models on the observed data. This approach compares quite favorably to other off-the-shelf approaches.
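To illustrate the pendulum case, here is a small sketch under stated assumptions. For period T, length L, and gravitational acceleration g, the product T × (g/L)^(1/2) is dimensionless, so dimensional analysis suggests it should be a constant that can be estimated from data. The data below are synthetic, generated from the small-angle formula T = 2π(L/g)^(1/2). The function names and the value of g are assumptions, and this is not the authors' pipeline.

```python
import math

G = 9.81  # assumed gravitational acceleration, m s^-2


def fit_pendulum_constant(lengths_m, periods_s, g=G):
    """Estimate the constant C in T * sqrt(g / L) = C as a sample mean."""
    groups = [t * math.sqrt(g / l) for l, t in zip(lengths_m, periods_s)]
    return sum(groups) / len(groups)


def predict_period(length_m, c, g=G):
    """Predict the period for a new length from the fitted constant."""
    return c * math.sqrt(length_m / g)


# Synthetic, noise-free observations from the small-angle formula.
lengths = [0.25, 0.5, 1.0, 2.0]
periods = [2 * math.pi * math.sqrt(l / G) for l in lengths]

c_hat = fit_pendulum_constant(lengths, periods)
t_pred = predict_period(1.5, c_hat)
```

With real measurements the dimensionless group would scatter around its mean, and a statistical test of the kind mentioned earlier would decide whether treating it as constant is justified.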

Thus, the authors present an elegant approach to inferring models from data that incorporate some of the known relationships between the quantities being modeled using dimensional analysis. Elsewhere, dimensional analysis has been shown to be quite effective in detecting defects in robotic software using dimensions as type annotations that can be derived using program analysis techniques [2]. Furthermore, dimensions provide a type system for physical quantities. Such type systems are quite useful in machine learning models wherein we often seek to avoid overfitting by imposing constraints such as monotonicity on the models [3]. I see the proposed dimensional consistency approach as a precursor to strongly typed machine learning models that can leverage the power of dependent type systems to specify more sophisticated properties including monotonicity [1].
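The idea of dimensions acting as types can be illustrated with a small sketch. The hypothetical `Quantity` class below carries exponents over (kg, m, s) and rejects the addition of mismatched dimensions. It checks at run time, which is only a rough stand-in for the static analyses and dependent type systems referred to above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple  # exponents of (kg, m, s)

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        return Quantity(self.value * other.value,
                        tuple(a + b for a, b in zip(self.dims, other.dims)))

    def __add__(self, other):
        # Addition is only meaningful for identical dimensions.
        if self.dims != other.dims:
            raise TypeError(
                f"dimension mismatch: {self.dims} vs {other.dims}")
        return Quantity(self.value + other.value, self.dims)


mass = Quantity(2.0, (1, 0, 0))    # kg
accel = Quantity(5.0, (0, 1, -2))  # m s^-2
force = mass * accel               # dims become (1, 1, -2), i.e. kg m s^-2
```

Attempting `force + mass` raises a `TypeError`, which is the kind of defect a dimension-aware checker is meant to catch before the value is ever used.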

    1. Clancy, K. and Miller, H. Monotonicity types for distributed dataflow. In Proceedings of the Programming Models and Languages for Distributed Computing. ACM, 2017.

    2. Ore, J-P.W. Dimensional Analysis of Robot Software without Developer Annotations. Ph.D. thesis, Univ. of Nebraska, Lincoln, 2019.

    3. Sill, J. Monotonic networks. Advances in Neural Information Processing Systems 10. M. Jordan, M. Kearns, and S. Solla, Eds. MIT Press, Cambridge, MA, 1998, 661–667.
