September 1968 - Vol. 11 No. 9

September 1968 issue cover image

Features

Research and Advances

Information Retrieval: PEEKABIT, computer offspring of punched card PEEKABOO, for natural language searching

The “peekaboo” idea from punched card information retrieval methods has been mated with the idea of superimposed punching to produce a programming technique which cuts computer run time in half on a test search of 33,000 subject index entries. A search program using the device has been operational since late 1963. As an item is entered in the store, an 18-byte mask is created from the item's meaningful words using the inclusive OR operation. If, at search time, the logical product (using the AND operation) of this mask and a similarly constructed question mask is not equal to the question mask, then one or more question words are not present in the store item. An equality is inconclusive; the words of the store item must be unpacked and compared with question words. The present store is made up of over 600,000 subject index entries estimated to average 60 characters each. Longer texts, such as abstracts, could be handled by multiple masks.
Research and Advances

Experimental evaluation of information retrieval through a teletypewriter

Experiments designed to evaluate the capabilities of mechanized information retrieval systems, with emphasis on interactive (man-machine) language and on some of the mechanical and psychological limitations in their design, were conducted at the Moore School Information Systems Laboratory. The basic assumption of the research is that an information retrieval system that provides for man-machine dialogue at a remote inquiry terminal should provide a searcher with many of the tools which would be available to him were he actually performing his search at a library or repository of documents. Factors involved in evaluation of such a system include ease of use, learning time, and effectiveness of actual retrieval. Three experiments and the conclusions resulting from them are detailed.
Research and Advances

Operating Systems: A statistical model for console behavior in multiuser computers

The ability of a computer system to communicate with the outside world efficiently is as important as its ability to perform computations efficiently. It is quite difficult to characterize a particular user, but rather easy to characterize the entire user community. Based on the properties of this community we have postulated a hypothetical “virtual console.” No claim is made that a virtual console behaves like any actual console, but the entire collection of virtual consoles models the collection of actual consoles. Using the model we answer questions like: How many processes are suspended waiting for console input? What is the maximum rate at which a process can execute? What bounds can be set on overall buffer requirements? Answers to these and similar questions are needed in certain aspects of operating system design.
Research and Advances

Computational Linguistics: Graphical input/output of nonstandard characters

A system developed at Harvard for graphically inputting and outputting nonstandard characters on a computer is printed. In principle, the system can deal with any orthography, although at present it is limited to 4000 Chinese characters and some mathematical symbols. New characters can be added to the repertoire of the system by graphical input on a display scope. Text inputting is accomplished via a display scope or a Rand Tablet. The organization and operation of the current system are described, and a discussion of the relative merits of such a system is given. Illustrations of the computer input and output of Chinese characters are included.
Research and Advances

Scientific Applications: An algorithm for identifying the ergodic subchains and transient states of a stochastic matrix

An algorithm for identifying the ergodic subchains and transient states of a stochastic matrix is presented. Applications in Markov renewal programming and in the construction of variable length codes are reviewed, and an updating procedure for dealing with certain sequences of stochastic matrices is discussed. Computation times are investigated experimentally and compared with those of another recently proposed method.
Research and Advances

PLEXUS—an on-line system for modeling neural networks

A description is presented of PLEXUS, a system which enables a user to construct and specify a neural network, to analyze the output data produced by the network, and to store and retrieve networks and data from a library. The system, operated entirely from a digital display unit, interacts directly with the user and permits easy and rapid transitions between the various phases of the modeling process. PLEXUS is designed to complement neurophysiological research so that the systematic development of neural models can be coordinated with experimental work. PLEXUS networks are built up from components representing individual neurons, external stimuli, and interconnecting fibers, each component being of a relatively detailed nature. Provision is also made for the use of experimental data as input to a network. Convenient means for specification and modification of a network and extensive error-checking capabilities are provided. Data resulting from the simulation of a network may be analyzed by a variety of techniques ranging from examinations of the gross characteristics of the data to the determination of detailed statistical properties.
Research and Advances

Programming Languages: GPL, a truly general purpose language

A truly general purpose programming language, GPL, is described which contains facilities for constructing (within the language) new data types as well as facilities for operations performed upon them. The basic language is minimal in the sense that no basic element can be derived from the others with high efficiency in the object programs. Constructs like the ALGOL 60 for-statements, and if-statements are not basic; they are special types of procedures. New “symbols” (underlined words in ALGOL 60) are implicitly defined by usage in other declarations. As part words are definable, packed words are handled as easily as full words. “Address” variables (pointers) are included in full generality.
Research and Advances

A comparison of the correlational behavior of random number generators for the IBM 360

Hutchinson states that the “new” (prime modulo) multiplicative congruential pseudorandom generator, attributed to D.H. Lehmer, has passed the usual statistical tests for random number generators. It is here empirically shown that generators of this type can produce sequences whose autocorrelation functions up to lag 50 exhibit evidence of nonrandomness for many multiplicative constants. An alternative generator proposed by Tausworthe, which uses irreducible polynomials over the field of characteristic two, is shown to be free from this defect. The applicability of these two generators to the IBM 360 is then discussed. Since computer word size can affect a generator's statistical behavior, the older mixed and simple congruential generators, although extensively tested on computers having 36 or more bits per word, may not be optimum generators for the IBM 360.

Recent Issues

  1. December 2024 CACM cover
    December 2024 Vol. 67 No. 12
  2. November 2024 CACM cover
    November 2024 Vol. 67 No. 11
  3. October 2024 CACM cover
    October 2024 Vol. 67 No. 10
  4. September 2024 CACM cover
    September 2024 Vol. 67 No. 9