Image and Video Analysis


DRVQ is a C++ library implementation of dimensionality-recursive vector quantization, a fast vector quantization method in high-dimensional Euclidean spaces under arbitrary data distributions. It is an approximation of k-means that is practically constant in data size and applies to arbitrarily high dimensions, but it scales only to a few thousand centroids. As a by-product of training, a tree structure performs either exact or approximate quantization on trained centroids; the approximate variant is not very precise but extremely fast.
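To make "exact quantization on trained centroids" concrete, the following is a minimal sketch of the exhaustive baseline: each vector is assigned to its nearest centroid under squared Euclidean distance. It does not use DRVQ's API or its tree structure; the names `squared_distance` and `quantize` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not part of DRVQ): squared Euclidean distance
// between two vectors of equal dimension.
double squared_distance(const std::vector<double>& a,
                        const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Exact quantization by exhaustive search (hypothetical name): returns the
// index of the centroid nearest to x. Ties go to the lowest index.
std::size_t quantize(const std::vector<double>& x,
                     const std::vector<std::vector<double>>& centroids) {
    assert(!centroids.empty());
    std::size_t best = 0;
    double best_dist = std::numeric_limits<double>::max();
    for (std::size_t k = 0; k < centroids.size(); ++k) {
        const double d = squared_distance(x, centroids[k]);
        if (d < best_dist) {
            best_dist = d;
            best = k;
        }
    }
    return best;
}
```

Exhaustive search costs one distance computation per centroid for every query vector, which is the cost that any faster exact or approximate scheme has to be measured against.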

ivl is a general purpose, full-header template C++98 math library with convenient and powerful syntax. It extends standard C++ syntax towards mathematical notation, while making use of language features like classes, functions, operators and templates. Often resembling a new language, it targets concise, readable, yet efficient code. Visit ivl directly at

ViRaL is a content-based image search engine. It not only retrieves visually similar images, but also identifies where a photo was taken, suggests tags and recognizes landmarks and points of interest. Its dataset consists of more than 2 million Flickr images from 40 cities around the world. The query may be uploaded, fetched from a given URL, or chosen from the dataset. Try it online directly:

Hough Pyramid Matching (HPM) is a flexible spatial matching model which allows non-rigid motion and multiple matching surfaces or objects. It is fast enough to be used for geometric re-ranking in large scale image retrieval. Binary code for experimental comparison with the proposed approach is provided for Linux, along with documentation.

The medial feature detector (MFD) is a generic detector of regions of arbitrary scale and shape in still images. The strongest regions are mostly blob-like and well enclosed by boundaries. It has been tested successfully in image matching and retrieval applications, with state-of-the-art performance and savings in computational and space requirements. Binary code is provided for Linux and Windows, along with documentation and examples.

The goal of this tool is to demonstrate the integration of several low- to high-level analysis algorithms toward semantic indexing of images. Different modules created by several research groups have been included and results are presented graphically in a unified way.

Annotator is an image annotation tool that supports semi-automatic annotation, based on the Viola and Jones object detection algorithm implemented in OpenCV. The user can create, edit and delete annotations in any image with minimal effort. Annotations are stored in text files in OpenCV format.
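As an illustration of how such text files might be consumed, here is a minimal sketch of a line parser. It assumes that "OpenCV format" means the annotation layout used by OpenCV's cascade training tools, where each line holds an image path, an object count, and then `x y width height` for each object; this is an assumption about Annotator's output, not something confirmed here. The types `Box` and `ImageAnnotation` and the function `parse_annotation_line` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration; not part of Annotator or OpenCV.
struct Box {
    int x, y, width, height;
};

struct ImageAnnotation {
    std::string image_path;
    std::vector<Box> boxes;
};

// Parses one line of the assumed layout:
//   <image path> <count> <x> <y> <width> <height> ...
// Returns false if the count is missing or negative, or if fewer than
// `count` complete boxes follow. Paths containing spaces are not handled.
bool parse_annotation_line(const std::string& line, ImageAnnotation& out) {
    std::istringstream in(line);
    int count = 0;
    if (!(in >> out.image_path >> count) || count < 0) {
        return false;
    }
    out.boxes.clear();
    for (int i = 0; i < count; ++i) {
        Box b;
        if (!(in >> b.x >> b.y >> b.width >> b.height)) {
            return false;
        }
        out.boxes.push_back(b);
    }
    assert(out.boxes.size() == static_cast<std::size_t>(count));
    return true;
}
```

For example, under the assumed layout the line `img/img1.jpg 2 100 200 50 50 30 25 40 45` would yield two boxes for `img/img1.jpg`, the second at position (30, 25) with size 40 by 45.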

Visual Descriptor Applications facilitate the automated extraction (VDE) and matching (VDM) of MPEG-7 Visual Descriptors from images. All 8 descriptors supported by VDE can be extracted from whole images or from parts of them: depending on whether a binary mask file, a segmentation mask or a set of bounding box coordinates is supplied, the extraction mechanism calculates the descriptors either for specific image regions or for the entire image. The output can be produced either in XML format or as plain text. VDM supports matching of the same 8 descriptors, taking as input the XML files generated by VDE.