Bottom-up approaches to Visual Attention (VA) have been applied successfully in a variety of applications where no domain information exists, e.g. general-purpose image and video segmentation. In face detection, however, humans perform a conscious search, so bottom-up approaches are less efficient. In this paper we add two channels to the VA architecture proposed by Itti et al. [8] to account for motion and conscious search in a scene. Increasing the number of channels in the architecture requires an efficient way of combining the various maps that are produced. We solve this problem with an innovative committee machine scheme that allows the committee members (maps) to change dynamically and weights each map according to the confidence level of its estimation. The experimental results show that the overall VA architecture performs significantly better than simple skin-based face detection.
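The map-combination idea can be illustrated with a minimal sketch. It is not the committee machine from the paper, whose details are not given in this abstract; it assumes that each map comes with a scalar confidence score, that maps at or below a threshold are dropped from the committee, and that the remaining maps are averaged with weights proportional to their confidence. The names `combine_maps`, `confidences` and `min_conf` are hypothetical.

```python
import numpy as np


def combine_maps(maps, confidences, min_conf=0.0):
    """Confidence-weighted average of same-shaped 2-D maps.

    Assumptions (not taken from the paper): one non-negative scalar
    confidence per map, and a hard threshold for committee membership.
    """
    if len(maps) != len(confidences):
        raise ValueError("need exactly one confidence per map")

    # Dynamic membership: keep only maps whose confidence exceeds the threshold.
    members = [
        (np.asarray(m, dtype=float), float(c))
        for m, c in zip(maps, confidences)
        if c > min_conf
    ]
    if not members:
        raise ValueError("no map passed the confidence threshold")

    # Weight each remaining map by its confidence, then normalise.
    total = sum(c for _, c in members)
    return sum(c * m for m, c in members) / total


# Toy example: a "skin" map and a "motion" map with different confidences.
skin = np.ones((2, 2))
motion = np.zeros((2, 2))
combined = combine_maps([skin, motion], [3.0, 1.0])
```

Raising `min_conf` shrinks the committee: with `min_conf=1.0` in the toy example only the first map remains, so the output equals that map.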
International Conference on Image Processing, Volume 2, pp. 1298-1301, September 2005.