A team led by ACM A.M. Turing Award recipient Geoffrey Hinton has developed an idea for a vision system that combines the strengths of five advances in neural networks — Transformers, Neural Fields, Contrastive Representation Learning, Distillation, and Capsules — to enable neural networks with fixed architectures to parse an image into a part-whole hierarchy with different structures for each image.
The system, dubbed GLOM, is described in "How to Represent Part-Whole Hierarchies in a Neural Network."
The paper does not describe a working system. "Instead, it presents a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM," the paper states. "If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language."