The choice of reference points in best-match file searching
Improvements to the exhaustive search method of best-match file searching have previously been achieved by doing a preprocessing step involving the calculation of distances from a reference point. This paper discusses the proper choice of reference points and extends the previous algorithm to use more than one reference point. It is shown that reference points should be located outside of data clusters. The results of computer simulations are presented which show that large improvements can be achieved by the proper choice and location of multiple reference points.