The Evolution of APL, the HOPL I paper by Falkoff and Iverson on APL, recounted the fundamental design principles which shaped the implementation of the APL language in 1966, and the early uses and other influences which shaped its first decade of enhancements.
In the 40 years that have elapsed since HOPL I, several dozen APL implementations have come and gone. In the first decade or two, interpreters were typically born and buried along with the hardware or operating system that they were created for. More recently, the use of C as an implementation language provided APL interpreters with greater longevity and portability.
APL started its life on IBM mainframes which were time-shared by multiple users. As the demand for computing resources grew and costs dropped, APL first moved in-house to mainframes, then to mini- and micro-computers. Today, APL runs on PCs and tablets, Apples and Raspberry Pis, smartphones and watches.
The operating systems, and the software application platforms that APL runs on, have evolved beyond recognition. Tools like database systems have taken over many of the tasks that were initially implemented in APL or provided by the APL system, and new capabilities like parallel hardware have also changed the focus of design and implementation efforts through the years.
The first wave of significant language enhancements occurred shortly after HOPL I, resulting in so-called second-generation APL systems. The most important feature of the second generation is the addition of general arrays—in which any item of an array can be another array—and a number of new functions and operators aligned with, if not always motivated by, the new data structures.
The majority of implementations followed IBM’s path with APL2 “floating” arrays; others aligned themselves with SHARP APL and “grounded” arrays. While the APL2 style of APL interpreters came to dominate the mainstream of the APL community, two new cousins of APL descended from the SHARP APL family tree: J (created by Iverson and Hui) and k (created by Arthur Whitney).
We attempt to follow a reasonable number of threads through the last 40 years, to identify the most important factors that have shaped the evolution of APL. We will discuss the details of what we believe are the most significant language features that made it through the occasionally unnatural selection imposed by the loss of habitats that disappeared with hardware, software platforms, and business models.
The history of APL now spans six decades. It is still the case, as Falkoff and Iverson remarked at the end of the HOPL I paper, that:
Although this is not the place to discuss the future, it should be remarked that the evolution of APL is far from finished.2020-06-12
Corporations today collect data at an unprecedented and accelerating scale, making the need to run queries on large datasets increasingly important. Technologies such as columnar block-based data organization and compression have become standard practice in most commercial database systems. However, the problem of best assigning records to data blocks on storage is still open. For example, today's systems usually partition data by arrival time into row groups, or range/hash partition the data based on selected fields. For a given workload, however, such techniques are unable to optimize for the important metric of the number of blocks accessed by a query. This metric directly relates to the I/O cost, and therefore performance, of most analytical queries. Further, they are unable to exploit additional available storage to drive this metric down further. In this paper, we propose a new framework called a query-data routing tree, or qd-tree, to address this problem, and propose two algorithms for their construction based on greedy and deep reinforcement learning techniques. Experiments over benchmark and real workloads show that a qd-tree can provide physical speedups of more than an order of magnitude compared to current blocking schemes, and can reach within 2X of the lower bound for data skipping based on selectivity, while providing complete semantic descriptions of created blocks.2020-06-11
The DBSCAN method for spatial clustering has received significant attention due to its applicability in a variety of data analysis tasks. There are fast sequential algorithms for DBSCAN in Euclidean space that take O(nłog n) work for two dimensions, sub-quadratic work for three or more dimensions, and can be computed approximately in linear work for any constant number of dimensions. However, existing parallel DBSCAN algorithms require quadratic work in the worst case. This paper bridges the gap between theory and practice of parallel DBSCAN by presenting new parallel algorithms for Euclidean exact DBSCAN and approximate DBSCAN that match the work bounds of their sequential counterparts, and are highly parallel (polylogarithmic depth). We present implementations of our algorithms along with optimizations that improve their practical performance. We perform a comprehensive experimental evaluation of our algorithms on a variety of datasets and parameter settings. Our experiments on a 36-core machine with two-way hyper-threading show that our implementations outperform existing parallel implementations by up to several orders of magnitude, and achieve speedups of up to 33x over the best sequential algorithms.2020-06-11
Locality-Sensitive Hashing (LSH) is one of the most popular methods for c-Approximate Nearest Neighbor Search (c-ANNS) in high-dimensional spaces. In this paper, we propose a novel LSH scheme based on the Longest Circular Co-Substring (LCCS) search framework (LCCS-LSH) with a theoretical guarantee. We introduce a novel concept of LCCS and a new data structure named Circular Shift Array (CSA) for k-LCCS search. The insight of LCCS search framework is that close data objects will have a longer LCCS than the far-apart ones with high probability. LCCS-LSH is LSH-family-independent, and it supports c-ANNS with different kinds of distance metrics. We also introduce a multi-probe version of LCCS-LSH and conduct extensive experiments over five real-life datasets. The experimental results demonstrate that LCCS-LSH outperforms state-of-the-art LSH schemes.2020-06-11
Posit is a recently proposed alternative to the floating point representation (FP). It provides tapered accuracy. Given a fixed number of bits, the posit representation can provide better precision for some numbers compared to FP, which has generated significant interest in numerous domains. Being a representation with tapered accuracy, it can introduce high rounding errors for numbers outside the above golden zone. Programmers currently lack tools to detect and debug errors while programming with posits.
This paper presents PositDebug, a compile-time instrumentation that performs shadow execution with high precision values to detect various errors in computation using posits. To assist the programmer in debugging the reported error, PositDebug also provides directed acyclic graphs of instructions, which are likely responsible for the error. A contribution of this paper is the design of the metadata per memory location for shadow execution that enables productive debugging of errors with long-running programs. We have used PositDebug to detect and debug errors in various numerical applications written using posits. To demonstrate that these ideas are applicable even for FP programs, we have built a shadow execution framework for FP programs that is an order of magnitude faster than Herbgrind.2020-06-11
In spatial query processing, the popular index R-tree may incur large storage consumption and high IO cost. Inspired by the recent learned index  that replaces B-tree with machine learning models, we study an analogy problem for spatial data. We propose a novel Learned Index structure for Spatial dAta (LISA for short). Its core idea is to use machine learning models, through several steps, to generate searchable data layout in disk pages for an arbitrary spatial dataset. In particular, LISA consists of a mapping function that maps spatial keys (points) into 1-dimensional mapped values, a learned shard prediction function that partitions the mapped space into shards, and a series of local models that organize shards into pages. Based on LISA, a range query algorithm is designed, followed by a lattice regression model that enables us to convert a KNN query to range queries. Algorithms are also designed for LISA to handle data updates. Extensive experiments demonstrate that LISA clearly outperforms R-tree and other alternatives in terms of storage consumption and IO cost for queries. Moreover, LISA can handle data insertions and deletions efficiently.2020-06-11
Similarity search is an important and challenging problem that is typically modeled as nearest neighbor search in high dimensional space, where objects are represented as high dimensional vectors and their (dis)similarity is evaluated using a distance measure such as the Euclidean distance.2020-06-11
Given an integer array A[1..n], the Range Minimum Query problem (RMQ) asks to preprocess A into a data structure, supporting RMQ queries: given a,b∈ [1,n], return the index i∈[a,b] that minimizes A[i], i.e., argmini∈[a,b] A[i]. This problem has a classic solution using O(n) space and O(1) query time by Gabow, Bentley, Tarjan (STOC, 1984) and Harel, Tarjan (SICOMP, 1984). The best known data structure by Fischer, Heun (SICOMP, 2011) and Navarro, Sadakane (TALG, 2014) uses 2n+n/(logn/t)t+Õ(n3/4) bits and answers queries in O(t) time, assuming the word-size is w=Θ(logn). In particular, it uses 2n+n/polylogn bits of space as long as the query time is a constant.
In this paper, we prove the first lower bound for this problem, showing that 2n+n/polylogn space is necessary for constant query time. In general, we show that if the data structure has query time O(t), then it must use at least 2n+n/(logn)Õ(t2) space, in the cell-probe model with word-size w=Θ(logn).2020-06-08
We address the problem of performing efficient spatial and topological queries on large tetrahedral meshes with arbitrary topology and complex boundaries. Such meshes arise in several application domains, such as 3D Geographic Information Systems (GISs), scientific visualization, and finite element analysis. To this aim, we propose Tetrahedral trees, a family of spatial indexes based on a nested space subdivision (an octree or a kD-tree) and defined by several different subdivision criteria. We provide efficient algorithms for spatial and topological queries on Tetrahedral trees and compare to state-of-the-art approaches. Our results indicate that Tetrahedral trees are an improvement over R*-trees for querying tetrahedral meshes; they are more compact, faster in many queries, and stable at variations of construction thresholds. They also support spatial queries on more general domains than topological data structures, which explicitly encode adjacency information for efficient navigation but have difficulties with domains with a non-trivial geometric or topological shape.2020-06-03