This paper addresses saliency map estimation in computational models of visual attention. In particular, we propose a wavelet-based approach for efficient computation of the topographic feature maps. Because wavelets and multiresolution theory are naturally connected, wavelet decomposition is an obvious choice for mimicking the center-surround process in human vision. However, our proposal goes further: we also use the wavelet decomposition for inline computation of the features (such as orientation) from which the topographic feature maps are built. The topographic feature maps are then combined through a sigmoid function to produce the final saliency map. The computational model is based on the Feature Integration Theory of Treisman et al. and follows the computational philosophy of this theory proposed by Itti et al. A series of experiments, conducted in a video encoding setup, shows that the proposed method compares well against other implementations found in the literature, both in terms of visual trials and computational complexity.
International Conference on Artificial Neural Networks, Athens, Greece, September 2006.
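
The pipeline summarized in the abstract (wavelet decomposition, center-surround differences across scales, orientation features taken from the detail bands, and a sigmoid combination) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the averaging Haar-style filter, the number of levels, the min-max normalization, the equal weighting of the two feature maps, and the `gain` parameter of the sigmoid. The function names (`haar_step`, `upsample`, `normalize`, `saliency_sketch`) are hypothetical.

```python
import numpy as np

def haar_step(x):
    """One level of an averaging Haar-style 2-D decomposition.
    Assumes both dimensions of x are even."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    approx = (a + b + c + d) / 4.0          # coarser approximation
    details = (
        (a + b - c - d) / 4.0,              # responds to horizontal edges
        (a - b + c - d) / 4.0,              # responds to vertical edges
        (a - b - c + d) / 4.0,              # responds to diagonal structure
    )
    return approx, details

def upsample(x, shape):
    """Block-replicate x up to the given shape (integer factors assumed)."""
    fy, fx = shape[0] // x.shape[0], shape[1] // x.shape[1]
    return np.kron(x, np.ones((fy, fx)))

def normalize(m):
    """Min-max normalize to [0, 1]; a flat map becomes all zeros."""
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def saliency_sketch(image, levels=3, gain=8.0):
    """Illustrative saliency map at half the input resolution.
    Assumes image dimensions are divisible by 2**levels."""
    approx = np.asarray(image, dtype=float)
    approxs, bands = [approx], []
    for _ in range(levels):
        approx, det = haar_step(approx)
        approxs.append(approx)
        bands.append(det)

    shape = approxs[1].shape
    # Intensity feature: center-surround as |fine scale - coarser scale|.
    intensity = np.zeros(shape)
    for coarse in approxs[2:]:
        intensity += np.abs(approxs[1] - upsample(coarse, shape))

    # Orientation feature: magnitudes of the detail bands at every level,
    # obtained as a by-product of the same decomposition.
    orientation = np.zeros(shape)
    for det in bands:
        for band in det:
            orientation += upsample(np.abs(band), shape)

    # Combine the feature maps through a sigmoid (logistic) function.
    combined = 0.5 * (normalize(intensity) + normalize(orientation))
    return 1.0 / (1.0 + np.exp(-gain * (combined - 0.5)))

if __name__ == "__main__":
    img = np.zeros((64, 64))
    img[28:36, 28:36] = 1.0                 # a single bright patch
    sal = saliency_sketch(img)
    print(sal.shape, sal[16, 16] > sal[0, 0])
```

The point of the sketch is that a single decomposition pass supplies both the multiscale approximations used for center-surround differences and the oriented detail bands, so no separate filter bank is needed for the orientation feature. A video encoding setup would apply something like this per frame, but that integration is outside the scope of this illustration.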