This paper presents a framework for the detection of semantic features in video sequences. Low-level feature extraction is performed on the keyframes of the shots and a "feature vector" including color and texture features is formed. A region "thesaurus" that contains all the high-level features is constructed using a subtractive clustering method. Then, a "model vector" that contains the distances from each region type is formed and an SVM detector is trained for each semantic concept. Experiments were performed using TRECVID 2005 development data.
1st International Conference on Semantics And digital Media Technology, Athens, Greece, December 2006.
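
The two middle steps of the abstract (building the region thesaurus and forming the model vector) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes toy 2-D tuples in place of the real color and texture feature vectors, a simplified form of subtractive clustering with a single stopping threshold, and a model vector that keeps, for each region type, the smallest Euclidean distance to any region of the keyframe. The function names and the parameters `ra` and `stop_ratio` are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def subtractive_clustering(points, ra=1.0, stop_ratio=0.15):
    """Simplified subtractive clustering (assumed variant, single threshold).

    Every point is a candidate center. Its potential grows with the number
    of close neighbours; after a center is picked, potential is subtracted
    around it so the next center comes from a different dense area.
    """
    if not points:
        return []
    alpha = 4.0 / ra ** 2            # neighbourhood used for potentials
    beta = 4.0 / (1.5 * ra) ** 2     # wider neighbourhood used for subtraction
    potential = [
        sum(math.exp(-alpha * sq_dist(p, q)) for q in points) for p in points
    ]
    first_peak = max(potential)
    centers = []
    while True:
        best = max(range(len(points)), key=potential.__getitem__)
        peak = potential[best]
        if peak < stop_ratio * first_peak:
            break                    # remaining points are too weak to be centers
        centers.append(points[best])
        potential = [
            pot - peak * math.exp(-beta * sq_dist(p, points[best]))
            for p, pot in zip(points, potential)
        ]
    return centers


def model_vector(region_features, thesaurus):
    """One entry per region type: distance to the closest keyframe region.

    Taking the minimum over regions is an assumption of this sketch.
    """
    return [
        min(math.sqrt(sq_dist(r, t)) for r in region_features)
        for t in thesaurus
    ]


# Toy data: region features pooled from several keyframes (hypothetical values).
all_regions = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
]
thesaurus = sorted(subtractive_clustering(all_regions))

# Regions of one keyframe, described against the thesaurus.
keyframe_regions = [(0.0, 0.0), (3.0, 4.0)]
mv = model_vector(keyframe_regions, thesaurus)
```

In this sketch the model vector has a fixed length equal to the number of region types, regardless of how many regions a keyframe contains, which is what would let a per-concept classifier such as an SVM be trained on it.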