- C. Scheel, N. Neubauer, A. Lommatzsch, K. Obermayer, and S. Albayrak.
Efficient Query Delegation by Detecting Redundant Retrieval Strategies.
In SIGIR Workshop on Learning to Rank for Information Retrieval, 2007.
The task of combining the output of several retrieval strategies
into a single relevance prediction per document is known as data fusion. The
LETOR dataset provides three corpora with predictions of 25 or 44 strategies
(depending on the corpus) per document/query pair. Given such a large number
of basic strategies, we consider the sparseness of the combination to be as
crucial as its optimality: Which strategies should be used
in a real application when each strategy consumes resources? We hence focus
on the question of "query delegation", a special case of weighting
strategies: Which strategies should receive a weight greater than zero, i.e.,
be asked in the first place? We propose several similarity measures between
strategies, such as various correlation measures or precision@n. Assuming that
similar strategies may not contribute much to each other's results, we
perform a clustering based on these similarities and only consider the best
representative of each cluster. We show that this fusion strategy performs
comparably to other fusion approaches such as RankSVM or RankBoost, but
needs to consult only a fraction of the available retrieval
strategies.
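
The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, so the greedy threshold clustering below is an assumption, as are the use of Pearson correlation as the similarity measure, precision@n as the criterion for the "best" representative, the threshold of 0.8, and the strategy names and toy scores (all hypothetical).

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def precision_at_n(scores, relevant, n):
    """Fraction of the n highest-scored documents that are relevant."""
    top = sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)[:n]
    return sum(1 for d in top if d in relevant) / n

def cluster_strategies(scores, threshold):
    """Greedy clustering (an assumption, not the paper's method):
    a strategy joins the first cluster whose members all correlate
    with it at or above the threshold, else it starts a new cluster."""
    clusters = []
    for name in scores:
        for cluster in clusters:
            if all(pearson(scores[name], scores[m]) >= threshold for m in cluster):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters

def select_representatives(scores, relevant, threshold=0.8, n=3):
    """Keep only the strategy with the highest precision@n from each cluster."""
    clusters = cluster_strategies(scores, threshold)
    return [max(c, key=lambda s: precision_at_n(scores[s], relevant, n))
            for c in clusters]

# Hypothetical toy data: scores of three strategies for six documents.
scores = {
    "bm25":     [0.9, 0.8, 0.1, 0.3, 0.2, 0.7],
    "tfidf":    [0.8, 0.9, 0.2, 0.4, 0.1, 0.3],
    "pagerank": [0.1, 0.9, 0.8, 0.2, 0.7, 0.3],
}
relevant = {0, 1, 5}  # indices of the documents judged relevant

clusters = cluster_strategies(scores, threshold=0.8)
selected = select_representatives(scores, relevant)
print(clusters)
print(selected)
```

On this toy data the two highly correlated strategies fall into one cluster, so only one of them has to be queried. A fusion step would then combine the outputs of the selected representatives only.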