This work presents an approach to high-level semantic feature detection in video sequences. Keyframes are selected to represent the visual content of the shots. Low-level features are then extracted from the keyframes, and a feature vector comprising color and texture features is formed. A region thesaurus that contains all the high-level features is constructed using a subtractive clustering method, where each feature is the centroid of a cluster. Next, a model vector containing the distances from each region type is formed, and an SVM detector is trained for each semantic concept. The approach is further extended with Latent Semantic Analysis to exploit co-occurrences of the region types. The high-level concepts detected are desert, vegetation, mountain, road, sky and snow within TV news bulletins. Experiments were performed on TRECVID 2005 development data.
Emerging Artificial Intelligence Applications in Computer Engineering, IOS Press, Amsterdam, Netherlands, 2007.
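
The model-vector step described in the abstract can be illustrated with a minimal sketch. It assumes that a keyframe is already segmented into regions, that each region and each region type is a plain numeric feature vector, and that the entry for a region type is the smallest Euclidean distance from that type to any region of the keyframe. The abstract does not specify the distance or the aggregation, so both are assumptions, and the function name `model_vector` and the toy data are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def model_vector(regions: Sequence[Vector], thesaurus: Sequence[Vector]) -> List[float]:
    """Return one distance per region type in the thesaurus.

    Assumption: each entry is the minimum Euclidean distance between the
    region type (a cluster centroid) and any region of the keyframe.
    """
    if not regions:
        raise ValueError("a keyframe needs at least one region")
    return [min(math.dist(region, region_type) for region in regions)
            for region_type in thesaurus]


# Hypothetical toy data: three region types and a keyframe with two regions,
# each described by a 2-dimensional feature vector.
thesaurus = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
keyframe_regions = [(0.0, 0.0), (6.0, 0.0)]

mv = model_vector(keyframe_regions, thesaurus)
print(mv)  # [0.0, 5.0, 8.0]
```

Under these assumptions, every keyframe yields a vector of fixed length (one entry per region type), which is the kind of input a per-concept SVM detector could be trained on.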