One of the most successful techniques in information retrieval is the query expansion (QE) principle, which is a kind of automatic relevance feedback. The general idea is to exploit the reliable results returned by an initial query to produce an enriched representation, which is re-submitted in turn to the search engine. Query expansion has been introduced to the visual domain by Chum et al. and several extensions have been proposed to improve this initial QE scheme. Although these variants have improved the accuracy, they suffer from two inherent drawbacks which severely affect the overall complexity and quality of the search. First, they all require a costly geometrical verification step. Second, the augmented query representation contains significantly more non-zero components than the original one, which severely slows down the search.
Our baseline system consists of the Hamming Embedding technique with burstiness normalization and multiple assignement of visual words to the query features. A simple rule based on the number of correspondences formed with a strict threshold on the hamming distance is used to detect reliable images. These are exploited to form an enhanced query representation. As shown in the example below, this simple rule is sufficient to correctly detect true relevant images as reliable images (green border).
The most frequent visual words which appear among reliable images are identified as the most useful ones for the expanded query. Features mapped to those visual words are called reliable features. Their binary signatures are aggregated per visual word in order to form a more compact representation which is finally proven to further be more robust.
Our method is shown to be effective even without the use of geometry verification in contrast to previous query expansion methods. However, combined with spatial matching performance is further increased as shown in our experiments.
We evaluate the proposed method on two publicly available datasets of this kind, namely Oxford5k Buildings and Paris, but also on a dataset where queries have only few corresponding images, that is UKB. Our method Hamming Query Expansion (HQE) is shown to always improve the performance of the baseline retrieval system based on Hamming Embedding (HE). When combined with geometry (HQE-SP) the performance is further improved.
We compare the QE proposed method with previously published results on the same datasets. For a fair comparison, we have included the scores of other QE methods that have used the same local feature detector as input and learned the vocabulary on an independent dataset. We also include the scores for our method when using 128-bit signatures for HE, which are better at the cost of higher memory usage and a slightly larger complexity. Interestingly, even without spatial matching, our method outperforms all methods in Oxford105k and Paris6k dataset. HQE-SP outperforms them in all three datasets. All of the compared methods rely on spatial matching to verify similar images and expand the initial query.