Artificial Intelligence and Machine Learning East Asia and Oceania Region Special Section: Hot Topics

Principles and Practices of Real-Time Feature Computing Platforms for ML

By Hao Zhang, Jun Yang, Cheng Chen, Siqi Wang, Jiashu Li, and Mian Lu

Posted Jul 1 2023


Real-time feature computation, which calculates features from raw data on demand, is a crucial component in the machine learning (ML) application process. These real-time features are vital for various real-world ML applications, such as anti-fraud management, risk control, and personalized recommendations. In these cases, low latency (milliseconds) in computing fresh data features is crucial for accurate and high-quality online inference.

As illustrated in the accompanying figure, a data scientist typically begins an ML application by developing feature computation scripts (for example, using Python or SparkSQL) for offline training. However, these scripts cannot meet the demands of online serving, including low latency, high throughput, and high availability. Hence, they must be transformed into performance-optimized code (for example, in C++) by an engineering team with system and production knowledge. This transformation is time-consuming and requires significant effort in development, deployment, and the maintenance of two separate systems.

Figure. Consistency verification between offline and online feature computing.

Moreover, the wide gap between data scientists and engineers in software stacks, domain knowledge, and performance concerns can cause feature inconsistency; for example, different interpretations of window boundaries (that is, inclusive or exclusive), variations in handling empty values, or differing operator definitions. As an example, consider fraud detection where the features include “account balance” and its standard deviation (std). During offline training, the model may use yesterday’s account balance, while in online serving the model is provided with the current account balance.¹ The std logic may also differ, as shown in the figure. These variations in data definition and computation can result in significant differences in prediction outcomes (that is, training-serving skew). Consequently, data scientists and engineers must invest significant effort in iterative development and cooperative consistency verification to align the results. To resolve these challenges, new design methodologies and systems are necessary to eliminate the added cost of consistency verification and ensure low latency in on-demand feature computing.
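To make the skew concrete, the following minimal Python sketch computes the "same" std feature twice. The event data, the function names, and the specific mismatch (inclusive window with sample std offline, exclusive window with population std online) are assumptions chosen for illustration; the article does not specify which definitions the two sides actually used.

```python
import statistics

# Hypothetical (timestamp, account balance) events; timestamps in seconds.
events = [(0, 100.0), (10, 120.0), (20, 90.0), (30, 150.0)]


def window_values(events, now, width, include_lower):
    """Return balances whose timestamp falls in the window ending at `now`.

    include_lower=True  -> [now - width, now]
    include_lower=False -> (now - width, now]
    """
    lo = now - width
    return [
        v for ts, v in events
        if (ts >= lo if include_lower else ts > lo) and ts <= now
    ]


def offline_std(events, now, width):
    # Assumed offline script: inclusive lower bound, sample std (divides by n - 1).
    return statistics.stdev(window_values(events, now, width, True))


def online_std(events, now, width):
    # Assumed online code: exclusive lower bound, population std (divides by n).
    return statistics.pstdev(window_values(events, now, width, False))


# Same raw data, same nominal feature, two different values.
skew = abs(offline_std(events, 30, 30) - online_std(events, 30, 30))
```

Neither definition is wrong on its own; the model simply sees a different feature at serving time than it saw during training, which is the mismatch that consistency verification has to detect.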

In this article, we discuss the design principles and practices for tackling the challenges of real-time feature computing. We propose that such a feature computing platform should consist of three major components: a batch engine for offline training, a real-time engine for online serving, and a unified execution plan generator that bridges the two engines and thereby inherently guarantees online-offline consistency. With such an architecture, a feature script developed by a data scientist can be deployed online without extra effort.
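The three-component design can be sketched in a few lines of Python. This is a toy model under stated assumptions, not OpenMLDB's implementation: `FeatureSpec`, `Plan`, `generate_plan`, `batch_engine`, `realtime_engine`, and the row schema are all hypothetical names. The point it illustrates is that window and aggregation semantics are fixed once, in the plan generator, so both engines inherit the same definition.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Row = Tuple[str, int, float]  # (account, timestamp, amount): hypothetical schema


@dataclass(frozen=True)
class FeatureSpec:
    """What the data scientist writes: a windowed aggregate per account."""
    width: int
    agg: str


@dataclass(frozen=True)
class Plan:
    """Shared execution plan consumed by both engines."""
    width: int
    agg: Callable[[List[float]], float]


def generate_plan(spec: FeatureSpec) -> Plan:
    # Hypothetical plan generator: the only place where semantics are decided.
    aggs = {"sum": sum, "max": max}
    return Plan(spec.width, aggs[spec.agg])


def evaluate(plan: Plan, rows: List[Row], key: str, ts: int) -> float:
    # Window is [ts - width, ts], inclusive on both ends, for one account.
    vals = [a for k, t, a in rows if k == key and ts - plan.width <= t <= ts]
    return plan.agg(vals)


def batch_engine(plan: Plan, table: List[Row]) -> List[float]:
    """Offline: one feature value per historical row."""
    return [evaluate(plan, table, k, t) for k, t, _ in table]


def realtime_engine(plan: Plan, store: List[Row], request: Row) -> float:
    """Online: the request row is combined with already stored history."""
    return evaluate(plan, store + [request], request[0], request[1])


def replay_online(plan: Plan, table: List[Row]) -> List[float]:
    """Replay a time-ordered table as a stream of online requests."""
    return [realtime_engine(plan, table[:i], row) for i, row in enumerate(table)]
```

Replaying a time-ordered table (with unique timestamps per account) through `replay_online` yields the same values as `batch_engine`, because neither engine carries its own copy of the window or aggregation logic.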

OpenMLDB^a is an open source feature computing platform that puts these design principles into practice; it was initiated and is led by Fourth Paradigm Southeast Asia. In particular, its unified SQL APIs and shared execution plan generator for both the offline and online engines eliminate the need for code transformation and consistency verification.

With optimization techniques such as in-memory time-travel structures, LLVM codegen, and pre-aggregation, the real-time engine of OpenMLDB offers low latency and can meet the demands of most online decision-making systems with an average latency of under 20ms. This is not achievable by other feature stores (for example, Feast,^b Hopsworks,^c Feathr^d) or traditional databases (for example, MySQL), which either provide only a storage service for fast retrieval or cannot satisfy the latency requirement of on-demand feature computing. By taking advantage of the Persistent Memory Module (PMEM), OpenMLDB can further shorten the tail latency by up to 19.7%, reduce the recovery time by up to 99.7%, and save up to 58.4% of the total cost compared to the DRAM+SSD version.² Moreover, OpenMLDB is considered for state-of-the-art federated learning systems.³ It has been integrated into Fourth Paradigm’s commercial products used by its customers (for example, Epitex), and has also been used independently by an increasing number of community users (for example, Akulaku) since it became open source.
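As a rough illustration of why pre-aggregation lowers latency for long windows, the sketch below keeps per-bucket partial sums so that a window query reads whole buckets from the partial sums and touches raw rows only at the two edges. The class `PreAggSum`, its bucket scheme, and the assumptions of integer timestamps inserted in non-decreasing order are illustrative; this is not a description of how OpenMLDB implements the technique.

```python
import bisect


class PreAggSum:
    """Windowed sum using per-bucket partial sums plus raw rows at the edges.

    Assumes integer timestamps inserted in non-decreasing order.
    Bucket b covers timestamps [b * bucket, (b + 1) * bucket - 1].
    """

    def __init__(self, bucket):
        self.bucket = bucket
        self.ts = []        # raw timestamps, sorted by insertion order
        self.vals = []      # raw values, aligned with self.ts
        self.partial = {}   # bucket index -> pre-aggregated sum

    def insert(self, ts, value):
        self.ts.append(ts)
        self.vals.append(value)
        b = ts // self.bucket
        self.partial[b] = self.partial.get(b, 0.0) + value

    def _raw_sum(self, lo, hi):
        # Sum raw values with lo <= ts <= hi via binary search on sorted ts.
        i = bisect.bisect_left(self.ts, lo)
        j = bisect.bisect_right(self.ts, hi)
        return sum(self.vals[i:j])

    def window_sum(self, start, end):
        """Sum of values with start <= ts <= end."""
        first_full = -(-start // self.bucket)        # ceil(start / bucket)
        last_full = (end + 1) // self.bucket - 1     # last bucket fully inside
        if first_full > last_full:
            # Window is too short to contain a whole bucket: scan raw rows.
            return self._raw_sum(start, end)
        total = sum(self.partial.get(b, 0.0)
                    for b in range(first_full, last_full + 1))
        lo = first_full * self.bucket        # first timestamp covered by buckets
        hi = (last_full + 1) * self.bucket   # first timestamp after the buckets
        return total + self._raw_sum(start, lo - 1) + self._raw_sum(hi, end)
```

For a window spanning many buckets, the work is proportional to the number of buckets plus the rows in at most two partial buckets, rather than to every raw row in the window.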

Here, we examine a real-world use case from Akulaku (a fintech unicorn from Indonesia) to demonstrate how OpenMLDB helps the company’s products. The biggest challenge Akulaku faced in its ML deployments was consistency verification before new features went online. Debugging the cause of a training-serving skew was a time-consuming and labor-intensive process, taking up to a month of manpower and 50% of their total workload. Moreover, verification failures could lead to ineffective models or even serious production accidents. OpenMLDB relieves the headache of verification by offering a unified execution engine and the same SQL APIs for both offline training and online serving. It also helps simplify their architecture by eliminating the various external tools in their original software stack (for example, Spark and Greenplum for the offline features, and TiDB, PolarDB, and Flink for the online features). According to Akulaku, OpenMLDB helps them save more than US$500,000 per year in server and personnel costs.

In addition to users in Southeast Asia, OpenMLDB has been adopted by users from other regions, such as Huawei, 37GAMES, and UBiX. As feature engineering becomes a crucial component of ML applications, we believe that OpenMLDB will play a significant role in accelerating ML deployment and productization, not only in Southeast Asia but worldwide.

Copyright held by authors/owners. Publication rights licensed to ACM.
Request permission to publish from permissions@acm.org

DOI

10.1145/3589224

July 2023 Issue

Published: July 1, 2023

Vol. 66 No. 7