The continuously growing volume of multimedia content has directed many research efforts toward high-level concept detection, since the semantics a document contains provide an effective and desirable annotation of its content. However, detecting the actual semantics within image and video documents remains a challenging, unsolved problem. Its two main and most interesting aspects are the selection of the low-level features to be extracted and the method used to map low-level descriptions to high-level concepts. Finding an automatic transition from low-level features to semantic entities or, equivalently, automatically extracting high-level characteristics is an extremely hard task, a problem commonly referred to as the “Semantic Gap”. Many descriptors have been proposed that capture the low-level features of audiovisual documents, namely their audio, color, texture, shape and motion characteristics. For the mapping step, techniques such as neural networks, fuzzy systems, and support vector machines have been successfully applied.
Encyclopedia of Multimedia, Springer, pp.151-155, 2008.