March 1977 - Vol. 20 No. 3
Features
Improving the access time for random access files
Clustering in the key set is decreased by smoothing the key-to-address transformation and by adding shadow buckets to an open chaining file. The keys are pre-hashed before the address division to remove the effect of sequential properties in the key set. Shadow buckets in the key search sequence reduce the effect of nonuniformity in file loading and decrease the maximum number of probes needed to locate a record. The combined effect of these techniques is improved file performance on secondary storage devices, as shown by empirical studies.
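A minimal sketch of the two ideas, assuming a division-method hash file; the table size, mixing constant, and probe policy are illustrative assumptions, not the authors' implementation:

```python
# Pre-hashing before address division, plus a shadow bucket at each
# probe step. All parameters here are assumptions for illustration.

NUM_BUCKETS = 101        # prime table size for the division method

def prehash(key: int) -> int:
    """Scramble the key so sequential key sets do not cluster."""
    key = ((key >> 16) ^ key) * 0x45d9f3b & 0xFFFFFFFF
    return (key >> 16) ^ key

def home_address(key: int) -> int:
    """Division method applied to the pre-hashed key."""
    return prehash(key) % NUM_BUCKETS

def probe_sequence(key: int):
    """Yield (kind, bucket) pairs: each primary bucket in the open
    chain is shadowed by an overflow bucket in the search sequence."""
    home = home_address(key)
    for step in range(NUM_BUCKETS):
        bucket = (home + step) % NUM_BUCKETS
        yield ("primary", bucket)
        yield ("shadow", bucket)
```

A search walks probe_sequence until the key is found or an empty bucket ends the chain; the shadow buckets absorb overflow from heavily loaded primary buckets, which is what caps the maximum probe count.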
Effective information retrieval using term accuracy
The performance of information retrieval systems can be evaluated in a number of different ways. Much of the published evaluation work is based on measuring the retrieval performance of an average user query. Unfortunately, formal proofs are difficult to construct for the average case. In the present study, retrieval evaluation is instead based on optimizing the performance of a specific user query. The concept of query term accuracy is introduced as the probability that a query term occurs in the documents relevant to that query. By relating term accuracy to the frequency of occurrence of the term in the documents of a collection, it is possible to give formal proofs of the effectiveness, with respect to a given user query, of a number of automatic indexing systems that have been used successfully in experimental situations. Among these are inverse document frequency weighting, thesaurus construction, and phrase generation.
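Of the indexing schemes named, inverse document frequency weighting is the easiest to illustrate. A minimal sketch with a toy corpus (the tokenization and log base are assumptions):

```python
import math
from collections import Counter

documents = [
    "information retrieval evaluation methods",
    "retrieval of documents matching query terms",
    "term weighting improves retrieval performance",
]

# Document frequency: in how many documents each term occurs.
df = Counter()
for doc in documents:
    df.update(set(doc.split()))

n = len(documents)

def idf(term: str) -> float:
    """Terms occurring in few documents get high weights."""
    return math.log(n / df[term]) if term in df else 0.0

def score(query: str, doc: str) -> float:
    """Rank a document by the summed IDF of query terms it contains."""
    words = set(doc.split())
    return sum(idf(t) for t in query.split() if t in words)

# "retrieval" occurs in every document (IDF 0); the rare term
# "weighting" dominates the ranking.
print(sorted(documents, key=lambda d: score("retrieval weighting", d),
             reverse=True))
```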
Empirical evaluation of some features of instruction set processor architectures
This paper presents methods for the empirical evaluation of features of Instruction Set Processors (ISPs). ISP features are evaluated in terms of the time used or saved by having or not having a feature. The methods are based on analyzing traces of program executions. The concept of a register life is introduced and used to answer questions such as: How many registers are used simultaneously? How many would be sufficient all of the time? Most of the time? What would the overhead be if the number of registers were reduced? What are registers used for during their lives? The paper also discusses the problem of detecting desirable but nonexistent instructions, and other problems are touched on briefly. Experimental results are presented, obtained by analyzing 41 programs running on the DECsystem10 ISP.
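A hedged sketch of how register lives might be extracted from a trace; the event format below is an assumption for illustration, not the DECsystem10 tooling:

```python
# A trace event: (instruction_index, register, "read" or "write").
trace = [
    (0, "r1", "write"), (1, "r2", "write"), (2, "r1", "read"),
    (3, "r1", "write"), (4, "r2", "read"), (5, "r1", "read"),
]

# A register life begins with a write and ends at the last use
# before the next write to the same register.
lives = []       # completed lives: (register, start, end)
open_life = {}   # register -> [start, last_use]
for idx, reg, kind in trace:
    if kind == "write":
        if reg in open_life:
            lives.append((reg, *open_life[reg]))  # previous life ends
        open_life[reg] = [idx, idx]
    elif reg in open_life:                        # ignore untraced initial values
        open_life[reg][1] = idx                   # extend the current life
lives.extend((reg, *span) for reg, span in open_life.items())

# Peak number of simultaneously live registers over the trace.
peak = max(sum(1 for _, s, e in lives if s <= i <= e)
           for i in range(len(trace)))
print(lives, "peak:", peak)
```

Counting lives this way answers the simultaneous-usage question directly; the remaining questions follow from classifying what each life is used for.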
Memory management and response time
This paper presents a computationally tractable methodology for accurately including the effects of finite memory size and workload memory requirements in queueing network models of computer systems. Empirical analyses and analytic studies based on applying this methodology to an actual multiaccess interactive system are reported. Relations between workload variables, such as the memory requirement distribution and job swap time, and performance measures, such as response time and memory utilization, are displayed graphically. A multiphase, analytically soluble model is proposed as broadly applicable to the analysis of interactive computer systems that use nonpaged memories.
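The flavor of the relations such models produce can be conveyed by the standard response time law for a closed interactive system, shown here as a generic queueing identity rather than the paper's multiphase model:

```python
# Response time law for a closed interactive system:
#   R = N / X - Z
# with N users, throughput X, and mean think time Z. A generic
# identity, not the paper's model.

def response_time(n_users: float, throughput: float, think_time: float) -> float:
    return n_users / throughput - think_time

# e.g. 30 users, 2 jobs completed per second, 10 s mean think time:
print(response_time(30, 2.0, 10.0))   # -> 5.0 seconds
```

In a memory-constrained system, swap time and per-job memory requirements limit how many jobs can be resident at once, which in turn bounds the achievable throughput X and hence the response time.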
Representation of many-sided polygons and polygonal lines for rapid processing
A representation for polygons and polygonal lines is described which allows sets of consecutive sides to be examined collectively. These sets of sides are arranged in a binary tree hierarchy by inclusion. A fast algorithm for testing the inclusion of a point in a many-sided polygon is given, and its speed is discussed for both ideal and practical examples. It is shown that the points of intersection of two polygonal lines can be located by what is essentially a binary tree search; the algorithm and a practical example are discussed. The representation overcomes many of the disadvantages associated with the various fixed-grid methods for representing curves and regions.
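A sketch of the hierarchical idea: consecutive sides are grouped into a binary tree of bounding boxes, so a point-inclusion test can discard half the remaining sides at each level. Names and details below are illustrative, and degenerate cases (the ray passing exactly through a vertex) are ignored:

```python
class Node:
    """Binary tree over a run of consecutive polygon sides."""
    def __init__(self, lo, hi, pts):
        self.lo, self.hi = lo, hi   # covers sides pts[i]->pts[i+1], lo <= i < hi
        xs = [pts[i][0] for i in range(lo, hi + 1)]
        ys = [pts[i][1] for i in range(lo, hi + 1)]
        self.box = (min(xs), min(ys), max(xs), max(ys))
        if hi - lo > 1:             # split the run of sides in half
            mid = (lo + hi) // 2
            self.children = (Node(lo, mid, pts), Node(mid, hi, pts))
        else:
            self.children = None

def crossings(node, pts, px, py):
    """Count crossings of the rightward ray from (px, py), pruning
    subtrees whose bounding box cannot intersect the ray."""
    xmin, ymin, xmax, ymax = node.box
    if ymax <= py or ymin > py or xmax <= px:
        return 0                    # a whole group of sides is skipped at once
    if node.children:
        return sum(crossings(c, pts, px, py) for c in node.children)
    (x1, y1), (x2, y2) = pts[node.lo], pts[node.lo + 1]
    if (y1 <= py) == (y2 <= py):
        return 0                    # side does not straddle the ray's height
    xc = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
    return 1 if xc > px else 0

poly = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]   # closed square
root = Node(0, len(poly) - 1, poly)
print(crossings(root, poly, 2, 1) % 2 == 1)        # True: point is inside
```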
Operations on sparse relations
Various computations on relations, Boolean matrices, or directed graphs, such as the computation of precedence relations for a context-free grammar, can be done by a practical algorithm that is asymptotically faster than those in common use. For example, it is shown how to compute operator precedence or Wirth-Weber precedence relations in O(n²) steps and linear precedence functions in O(n) steps, where n is the size of the grammar. The heart of the algorithms is a general theorem giving sufficient conditions under which an expression, whose operands are sparse relations and whose operators are composition, transitive closure, union, and inverse, can be computed efficiently.
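The four operators are easy to state on a sparse adjacency-set representation. The sketch below is a straightforward fixpoint version for illustration only; it is not the paper's asymptotically faster algorithm:

```python
from collections import defaultdict

def compose(r, s):
    """(a, c) in compose(r, s)  iff  (a, b) in r and (b, c) in s."""
    out = defaultdict(set)
    for a, bs in r.items():
        for b in bs:
            out[a] |= s.get(b, set())
    return dict(out)

def union(r, s):
    out = defaultdict(set)
    for rel in (r, s):
        for a, bs in rel.items():
            out[a] |= bs
    return dict(out)

def inverse(r):
    out = defaultdict(set)
    for a, bs in r.items():
        for b in bs:
            out[b].add(a)
    return dict(out)

def closure(r):
    """Transitive closure by iterating to a fixpoint; each pass only
    follows pairs that already exist, so sparsity is preserved."""
    out = {a: set(bs) for a, bs in r.items()}
    changed = True
    while changed:
        changed = False
        for a in out:
            extra = set()
            for b in out[a]:
                extra |= out.get(b, set())
            if not extra <= out[a]:
                out[a] |= extra
                changed = True
    return out

r = {1: {2}, 2: {3}}
print(closure(r))              # {1: {2, 3}, 2: {3}}
print(compose(r, inverse(r)))  # {1: {1}, 2: {2}}
```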
Effects of chargeout on user/manager attitudes
The relationship between internal pricing systems for computer services (chargeout systems) and user management attitudes toward their computer-based information systems is investigated. Evidence is provided that the relationship conforms to the general pattern that would be expected from the hypothesis of the four stages of EDP growth [15]. The results also indicate that the chargeout systems characteristic of advanced EDP stage environments are associated with relatively high levels of positive user attitudes and marked increases in EDP training for users. Both factors are important to the user/manager involvement necessary for effective control of computer-based systems. Development and maintenance of computer-based systems is asserted to be a category of organizational change, and a “felt need” for the change on the part of the user/manager is a prerequisite to any change taking place. The research methods of behavioral science are applied to investigate the user/manager environment and the effects of chargeout.
Cost/utilization: a measure of system performance
A method is presented for evaluating computer system performance in terms of a cost/utilization factor and a measure of imbalance. These coefficients indicate the extent to which the total system cost is effectively utilized. The method includes a technique for the visual representation of system performance.
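One plausible form of such a coefficient, written here as an assumption rather than the paper's exact definition, weights each component's utilization by its share of total system cost:

```python
# Cost-weighted utilization: what fraction of total system cost is
# effectively utilized. A plausible reading of the abstract, not the
# paper's exact formula.

def cost_utilization(components):
    """components: list of (cost, utilization in [0, 1]) pairs."""
    total_cost = sum(c for c, _ in components)
    return sum(c * u for c, u in components) / total_cost

# e.g. CPU, memory, and channel with different costs and utilizations:
system = [(100_000, 0.9), (60_000, 0.5), (40_000, 0.2)]
print(cost_utilization(system))   # -> 0.64
```

A measure of imbalance could then capture how widely the individual utilizations spread around this cost-weighted mean.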
A comparison of next-fit, first-fit, and best-fit
“Next-fit” allocation differs from first-fit in that a first-fit allocator commences its search for free space at a fixed end of memory, whereas a next-fit allocator commences its search wherever it previously stopped searching. This strategy, called “modified first-fit” by Shore [2], is significantly faster than the first-fit allocator. To evaluate the relative efficiency of next-fit (as well as to confirm Shore's results), a simulation was written in Basic Plus on the PDP-11, using doubly linked lists to emulate the memory structure of the simulated computer. The simulation was designed to perform essentially in the manner described in [2]. The results of simulating the three methods show that the efficiency of next-fit is decidedly inferior to that of first-fit and best-fit when the mean size of the requested block is less than about 1/16 of the total memory available; beyond this point, all three allocation schemes have similar efficiencies.
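A minimal free-list sketch contrasting the three policies; the flat list below stands in for the doubly linked structure used in the simulation:

```python
class Allocator:
    def __init__(self, free_sizes):
        self.free = list(free_sizes)   # free block sizes, in address order
        self.rover = 0                 # where next-fit resumes its search

    def first_fit(self, request):
        """Always search from the fixed (low-address) end."""
        for i, size in enumerate(self.free):
            if size >= request:
                self.free[i] -= request
                return i
        return None

    def next_fit(self, request):
        """Resume the search wherever the previous one stopped."""
        n = len(self.free)
        for step in range(n):
            i = (self.rover + step) % n
            if self.free[i] >= request:
                self.free[i] -= request
                self.rover = i
                return i
        return None

    def best_fit(self, request):
        """Pick the smallest block that is still large enough."""
        fits = [(size, i) for i, size in enumerate(self.free) if size >= request]
        if not fits:
            return None
        _, i = min(fits)
        self.free[i] -= request
        return i

a = Allocator([8, 3, 10, 6])
print(a.next_fit(5), a.next_fit(5), a.next_fit(5))   # 0 2 2
```

The rover is what makes next-fit fast per search; one common explanation for the efficiency loss reported above is that it also spreads small residual fragments across all of memory instead of concentrating them at one end.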