In hierarchical search the data structure holding the file keys is partitioned into substructures of the same type; these are searched consecutively until the queried key is found or the substructures are exhausted. The interest here is in the conditions under which the performance of a hierarchical organization of static files is superior to that of the nonhierarchical organization and in the construction of the hierarchy when these conditions are met. The performance criterion is the average number of comparisons in a successful search, where averaging extends over all keys and over all permutations of the keys' access probabilities. General properties of hierarchical search are first derived, and attention is then focused on the hierarchical binary organization: the special case where each of the data substructures is a sorted array (or a balanced binary tree) and where the keys are accessed by binary search. It is shown that an advantageous two-stage hierarchy is always implementable when the keys' access density function φ(i) is “steeper” than Zipf's density function ζ(i); the steeper it is, the greater the advantage. A simple method for constructing the two-stage hierarchy is formulated, based on finding the intersection of φ(i) and ζ(i). For the r-stage hierarchical organization, partitioning procedures are proposed which are based on the iterative application of the two-stage techniques.
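To make the performance criterion concrete, the following is a minimal Python sketch of a two-stage hierarchical binary organization under a simplified cost model. It is not the paper's analysis. The assumptions are: each probe counts as one three-way comparison; a key's position inside a sorted array is independent of its access probability, so a successful search in an array of m keys costs the position-averaged binary search depth; and a failed search of the first stage is charged its worst-case depth. The function names (`avg_hit_cost`, `miss_cost`, `power_law`, `two_stage_cost`, `best_split`) and the power-law family with exponent `s` are hypothetical choices for illustration. The split point is found by brute force, not by the intersection method mentioned above, and the exact break-even point relative to Zipf's density may differ in this sketch from the paper's result.

```python
def avg_hit_cost(m):
    """Mean comparisons for a successful binary search over m sorted keys,
    averaged uniformly over positions (assumes one comparison per probe)."""
    total, remaining, depth = 0, m, 1
    while remaining > 0:
        # Level `depth` of the implicit search tree holds up to 2**(depth-1) keys.
        at_level = min(1 << (depth - 1), remaining)
        total += depth * at_level
        remaining -= at_level
        depth += 1
    return total / m


def miss_cost(m):
    """Worst-case comparisons for an unsuccessful binary search over m >= 1 keys
    (a pessimistic assumption: ceil(log2(m + 1)))."""
    return m.bit_length()


def power_law(n, s):
    """Access probabilities proportional to 1 / i**s, i = 1..n, in descending order.
    s = 1 gives Zipf's density; s > 1 is one example of a 'steeper' density."""
    weights = [1.0 / (i ** s) for i in range(1, n + 1)]
    z = sum(weights)
    return [w / z for w in weights]


def two_stage_cost(probs, k):
    """Expected comparisons when the k most probable keys form stage 1
    and the remaining keys form stage 2 (probs must be sorted descending)."""
    n = len(probs)
    p_first = sum(probs[:k])
    hit_stage1 = avg_hit_cost(k)
    hit_stage2 = miss_cost(k) + avg_hit_cost(n - k)  # fail stage 1, then search stage 2
    return p_first * hit_stage1 + (1.0 - p_first) * hit_stage2


def best_split(probs):
    """Brute-force search for the stage-1 size with the lowest expected cost.
    Returns (k, two_stage_cost, flat_cost)."""
    n = len(probs)
    flat = avg_hit_cost(n)
    k = min(range(1, n), key=lambda j: two_stage_cost(probs, j))
    return k, two_stage_cost(probs, k), flat


if __name__ == "__main__":
    for s in (1.0, 1.5, 2.0):
        k, cost, flat = best_split(power_law(1023, s))
        print(f"s={s}: best k={k}, two-stage={cost:.3f}, flat={flat:.3f}")
```

Running the script prints, for each exponent, the best stage-1 size and the expected comparison counts for the two-stage and flat organizations, which gives a rough feel for how the advantage grows as the density steepens under these assumptions.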