Arthur Gill

Research and Advances, May 1, 1980

In hierarchical search, the data structure holding the file keys is partitioned into substructures of the same type; these are searched consecutively until the queried key is found or the substructures are exhausted. The interest here is in the conditions under which a hierarchical organization of static files outperforms the nonhierarchical organization, and in the construction of the hierarchy when these conditions are met. The performance criterion is the average number of comparisons in a successful search, where averaging extends over all keys and over all permutations of the keys' access probabilities. General properties of hierarchical search are first derived, and attention is then focused on the hierarchical binary organization, the special case where each of the data substructures is a sorted array (or a balanced binary tree) and where the keys are accessed by binary search. It is shown that an advantageous two-stage hierarchy is always implementable when the keys' access density function φ(i) is “steeper” than Zipf's density function ζ(i); the steeper it is, the greater the advantage. A simple method for constructing the two-stage hierarchy is formulated, based on finding the intersection of φ(i) and ζ(i). For the r-stage hierarchical organization, partitioning procedures are proposed which are based on the iterative application of the two-stage techniques.
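The comparison between a flat and a two-stage binary organization can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's formulation: the names (`probes`, `avg_hit`, `avg_miss`, `two_stage_cost`, `best_split`, `power_law`) are hypothetical, the cost of a search is taken to be the number of three-way probes in a textbook binary search, a miss in the first stage is charged the average unsuccessful-search cost, and the split point is found by brute force rather than by the intersection method the abstract mentions. Averaging over all permutations of the access probabilities is modeled by weighting every array position equally within a substructure.

```python
from functools import lru_cache


def probes(m, target):
    """Count probes made by binary search over the sorted keys 0..m-1.

    A non-integer target (e.g. 2.5) never matches and so models a miss.
    """
    lo, hi, count = 0, m - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1
        if mid == target:
            return count
        if mid < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return count


@lru_cache(maxsize=None)
def avg_hit(m):
    """Mean probes for a successful search, every position equally likely."""
    return sum(probes(m, t) for t in range(m)) / m


@lru_cache(maxsize=None)
def avg_miss(m):
    """Mean probes for an unsuccessful search, every gap equally likely."""
    return sum(probes(m, g - 0.5) for g in range(m + 1)) / (m + 1)


def power_law(n, s):
    """Access density proportional to 1 / i**s for ranks i = 1..n (s = 1 is Zipf)."""
    weights = [1 / i ** s for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def two_stage_cost(density, k):
    """Expected probes when the k most probable keys form the first sorted array.

    Assumed model: a key in the first stage costs avg_hit(k); any other key
    costs a first-stage miss plus a successful search of the remaining n - k keys.
    """
    n = len(density)
    p_first = sum(density[:k])
    return p_first * avg_hit(k) + (1 - p_first) * (avg_miss(k) + avg_hit(n - k))


def best_split(density):
    """Brute-force the first-stage size that minimizes the assumed cost."""
    return min(range(1, len(density)), key=lambda k: two_stage_cost(density, k))


if __name__ == "__main__":
    n = 255
    steep = power_law(n, 2)  # steeper than Zipf's 1/i density
    k = best_split(steep)
    print("flat cost:", avg_hit(n))
    print("best first-stage size:", k)
    print("two-stage cost:", two_stage_cost(steep, k))
```

Under these assumptions, a density that falls off as 1/i² yields a two-stage cost well below the flat binary search cost, which is the qualitative effect the abstract describes. The sketch does not attempt to reproduce the paper's exact threshold at Zipf's density, since that depends on the paper's own cost model.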
