In a document retrieval, or other pattern matching environment where stored entities (documents) are compared with each other or with incoming patterns (search requests), it appears that the best indexing (property) space is one where each entity lies as far away from the others as possible; in these circumstances the value of an indexing system may be expressible as a function of the density of the object space; in particular, retrieval performance may correlate inversely with space density. An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents. Typical evaluation results are shown, demonstating the usefulness of the model.
The Latest from CACM
Shape the Future of Computing
ACM encourages its members to take a direct hand in shaping the future of the association. There are more ways than ever to get involved.
Get InvolvedCommunications of the ACM (CACM) is now a fully Open Access publication.
By opening CACM to the world, we hope to increase engagement among the broader computer science community and encourage non-members to discover the rich resources ACM has to offer.
Learn More
Join the Discussion (0)
Become a Member or Sign In to Post a Comment