This paper introduces a system for content-based image retrieval from video databases that uses B-splines for affine-invariant object representation. A small number of key-frames, which provide sufficient information about the video content, are extracted from each video sequence. Color and motion segmentation and tracking are then employed to extract video objects automatically. A B-spline representation of the object contours is then obtained; it possesses important properties such as smoothness, continuity, and invariance under affine transformation. A neural network approach is used for supervised classification of video objects into prototype object classes. Finally, higher-level classes can be constructed by combining primary classes, which makes a high level of abstraction possible in the representation of each video sequence.
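The affine property mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a closed uniform cubic B-spline, and the names `closed_cubic_bspline`, `affine`, and `max_deviation`, along with the sample control points and transform, are hypothetical and chosen only for illustration. Because the B-spline basis functions sum to one, applying an affine map to the control points should give the same curve as applying the map to the sampled curve itself.

```python
import math


def closed_cubic_bspline(control, samples_per_segment=8):
    """Sample a closed uniform cubic B-spline defined by 2D control points."""
    n = len(control)
    curve = []
    for i in range(n):
        # Each segment is influenced by four consecutive control points (wrapping around).
        pts = [control[(i + k) % n] for k in range(4)]
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            # Uniform cubic B-spline basis functions; they sum to 1 for every t.
            basis = (
                (1 - t) ** 3 / 6,
                (3 * t ** 3 - 6 * t ** 2 + 4) / 6,
                (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6,
                t ** 3 / 6,
            )
            x = sum(w * p[0] for w, p in zip(basis, pts))
            y = sum(w * p[1] for w, p in zip(basis, pts))
            curve.append((x, y))
    return curve


def affine(points, a, b):
    """Apply the affine map p -> a @ p + b to a list of 2D points."""
    return [
        (a[0][0] * x + a[0][1] * y + b[0], a[1][0] * x + a[1][1] * y + b[1])
        for x, y in points
    ]


def max_deviation(curve_a, curve_b):
    """Largest pointwise Euclidean distance between two equally sampled curves."""
    return max(
        math.hypot(xa - xb, ya - yb)
        for (xa, ya), (xb, yb) in zip(curve_a, curve_b)
    )


# Hypothetical contour control points and affine transform (illustrative values only).
control = [(0.0, 0.0), (2.0, 0.0), (3.0, 2.0), (1.0, 3.0), (-1.0, 1.0)]
a = [[1.5, 0.4], [-0.3, 0.8]]
b = (2.0, -1.0)

transform_then_fit = closed_cubic_bspline(affine(control, a, b))
fit_then_transform = affine(closed_cubic_bspline(control), a, b)
deviation = max_deviation(transform_then_fit, fit_then_transform)
```

Under these assumptions, `deviation` should be zero up to floating-point rounding, which is what makes a control-point representation convenient when the same object appears under different affine views.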
IEE Colloquium on Neural Nets and Multimedia, London, UK, 1998.