Multimedia document categorization is a well-known problem in information retrieval: the task is to assign a multimedia document to one or more categories based on its contents. Effective management and thematic categorization of such documents require extracting their underlying semantics. The proposed approach takes as input the textual annotation that accompanies a multimedia document and analyzes it in order to extract the document's underlying semantics, construct a semantic index, and finally classify the document into thematic categories. This process is based on a unified knowledge and semantics representation model that we introduce, as well as on basic principles of fuzzy relational algebra. In addition, f-SHIN, the fuzzy extension of the expressive description logic language SHIN, and its reasoning services are used to further refine and optimize the initial categorization results. The proposed approach was tested on a set of real-life multimedia documents drawn from the Internet and from personal databases, and it shows promising results.
Knowledge Acquisition from Multimedia Content Workshop, in conjunction with SAMT, Genova, Italy, December 2007.