Fusing MPEG-7 visual descriptors for image classification

E. Spyrou, H. Le Borgne, T. Mailis, E. Cooke, Y. Avrithis, N. O'Connor

International Conference on Artificial Neural Networks, Warsaw, Poland, September 2005.

This paper proposes a number of content-based image classification techniques based on fusing various low-level MPEG-7 visual descriptors. The goal is to fuse several descriptors in order to improve the performance of several machine-learning classifiers. Fusion is necessary because the descriptors would otherwise be incompatible and unsuitable for direct inclusion in, e.g., a Euclidean distance. Three approaches are described: a merging fusion combined with an SVM classifier, a back-propagation fusion combined with a K-Nearest Neighbor classifier, and a Fuzzy-ART neurofuzzy network. In the latter case, fuzzy rules can be extracted in an effort to bridge the semantic gap between the low-level descriptors and the high-level semantics of an image. All networks were evaluated using content from the aceMedia Repository, and more specifically on a beach/urban scene classification problem.
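As a rough illustration of why descriptors cannot simply be placed together in one Euclidean distance, the sketch below shows one possible form of merging fusion: each descriptor block is normalized separately and the blocks are then concatenated into a single feature vector that a classifier such as an SVM could consume. The function names, the z-score normalization, and the toy "color" and "texture" values are assumptions made for this example; the abstract does not specify how the paper normalizes or merges descriptors.

```python
import numpy as np

def normalize(block):
    """Scale one descriptor block (rows = images) to zero mean, unit variance per column."""
    block = np.asarray(block, dtype=float)
    std = block.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (block - block.mean(axis=0)) / std

def merge_descriptors(blocks):
    """Assumed merging fusion: normalize each block, then concatenate column-wise."""
    return np.hstack([normalize(b) for b in blocks])

# Hypothetical toy data: 4 images, two descriptors on very different scales.
color = np.array([[200.0, 10.0], [210.0, 12.0], [40.0, 90.0], [35.0, 95.0]])
texture = np.array([[0.10], [0.12], [0.80], [0.85]])

fused = merge_descriptors([color, texture])
print(fused.shape)  # (4, 3)
```

Without the per-block normalization, the large-valued color columns would dominate any Euclidean distance computed on the concatenated vector, which is the kind of incompatibility the abstract refers to.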
