Research and Advances

A cost oriented algorithm for data set allocation in storage hierarchies

Data set allocation in today's multilevel storage systems is usually based on qualitative, ad hoc decisions. While it would be desirable to obtain an optimal solution to this allocation problem, it is clear that the number of parameters involved makes it intractable to straight-forward solution. In such a situation, we must find a set of assumptions which simplify the problem greatly, but which still provide a basis for considering all significant cost elements. This paper presents such a first, quantitative allocation step. It considers many of the significant detailed costs of system utilization, data storage, data staging, and data migration. Although many avenues of further improvement are available, the present algorithm seems to be usefully accurate. As such, it can aid in quantifying the problems of data set allocation, storage system configuration, and new device designs.

Advertisement

Author Archives

Research and Advances

General performance analysis of key-to-address transformation methods using an abstract file concept

This paper presents a new approach to the analysis of performance of the various key-to-address transformation methods. In this approach the keys in a file are assumed to have been selected from the key space according to a certain probabilistic selection algorithm. All files with the same number of keys selected from this key space will be suitably weighted in accordance with the algorithm, and the average performance of the transformation methods on these files will be used as the potential of these methods. Using this analysis, methods with the same overall performance can be classified and key distributions partial to certain transformations can be identified. All this can be done analytically. The approach is applied to a group of transformation methods using files whose keys are selected randomly.
Research and Advances

Additional results on key-to-address transform techniques: a fundamental performance study on large existing formatted files

In an earlier paper by Lum, Yuen, and Dodd [1] experimental results comparing six commonly used key-to-address transformation techniques were presented. One transformation in that study referred to as “Lin's method” is an elaborate technique based on radix transformation. Andrew Lin has since pointed out to the authors that his method of transformation [2] consists of not just a radix transformation algorithm but also the specific ways the values of p and q are chosen as well as hardware implementation to carry out the steps of this transformation in an efficient manner. Since our study was intended for general radix transformations rather than Lin's specific implementation, we think it is more appropriate to change the label of that transformation in [1] from “Lin's method” to “generalized radix transformation method” and we use this term here.
Research and Advances

Multi-attribute retrieval with combined indexes

In this paper a file organization scheme designed to replace the use of the popular secondary index filing scheme (or inverted files on secondary key fields) is described. Through the use of redundancy and storing keys (or access numbers of the records) that satisfy different combinations of secondary index values in “buckets,” it is possible to retrieve all keys satisfying any input query derived from a subset of fields by a single access to an index file, although each bucket may be used for many combinations of values and a combination of buckets may be required for a given query. The method which, in its degenerate case, becomes the conventional secondary index filing scheme works similarly but has the following advantages: (1) the elimination of multiple accesses in many cases; (2) the elimination of false drops; (3) the elimination of computer time to perform intersection of key sets each qualified for one secondary index field only; and (4) the avoidance of long strings of keys when an index field appearing in a query has very few possible values. Redundancy, in some cases, is the same as the secondary indexing method. In the general case, trade-off between the number of accesses for query and redundancy exists.

Shape the Future of Computing

ACM encourages its members to take a direct hand in shaping the future of the association. There are more ways than ever to get involved.

Get Involved