IRIT-Renault thesis: papers studied
2 documents
  • Information Retrieval as Statistical Translation (Adam Berger, John Lafferty, 1999)
    > "**Turn the search problem around to predict the input**"
    > We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is **a statistical model of how a user might distill or "translate" a given document into a query**. To assess the relevance of a document to a user's query, **we estimate the probability that the query would have been generated as a translation of the document**, and factor in the user's general preferences in the form of a prior distribution over documents. We propose a simple, well motivated model of the document-to-query translation process, and describe an algorithm for learning the parameters of this model in an unsupervised manner from a collection of documents.
  • Ranking Measures and Loss Functions in Learning to Rank (2009)
    > While most learning-to-rank methods learn the ranking function by minimizing the loss functions, it is the ranking measures (such as NDCG and MAP) that are used to evaluate the performance of the learned ranking function. In this work, we reveal the relationship between ranking measures and loss functions in learning-to-rank methods, such as Ranking SVM, RankBoost, RankNet, and ListMLE.
    > We have proved that many pairwise/listwise losses in learning to rank are actually upper bounds of measure-based ranking errors. As a result, the minimization of these loss functions will lead to the maximization of the ranking measures. The key to obtaining this result is to model ranking as a sequence of classification tasks, and define a so-called essential loss as the weighted sum of the classification errors of individual tasks in the sequence.
    > We have also shown a way to improve existing methods by introducing appropriate weights to their loss functions.
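
Note on "Information Retrieval as Statistical Translation": the idea of scoring a document by the probability that the query is a "translation" of it can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. It assumes a word-to-word translation table `translation[w][q]` standing for t(q | w), a maximum-likelihood unigram model of the document, and a mixing weight `alpha` with a `background` distribution for smoothing; all of these names, the toy data, and the smoothing scheme are assumptions made for the example.

```python
import math
from collections import Counter


def doc_language_model(doc_tokens):
    """Maximum-likelihood unigram model l(w | d) of a document."""
    if not doc_tokens:
        return {}
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    return {w: c / total for w, c in counts.items()}


def query_log_likelihood(query_tokens, doc_tokens, translation, background, alpha=0.1):
    """log p(q | d) = sum_j log( sum_w t(q_j | w) * l(w | d) ), with smoothing.

    The background mixture (weight alpha) is an assumption of this sketch;
    it keeps the probability of every query word strictly positive.
    """
    doc_lm = doc_language_model(doc_tokens)
    log_p = 0.0
    for q in query_tokens:
        # Probability that some document word "translates" into query word q.
        translated = sum(
            translation.get(w, {}).get(q, 0.0) * p_w for w, p_w in doc_lm.items()
        )
        p_q = (1.0 - alpha) * translated + alpha * background.get(q, 1e-9)
        log_p += math.log(p_q)
    return log_p


def rank(query_tokens, docs, translation, background, log_prior=None):
    """Order document ids by log p(q | d) + log p(d), best first."""
    scores = {}
    for doc_id, tokens in docs.items():
        prior = log_prior[doc_id] if log_prior else 0.0  # uniform prior by default
        scores[doc_id] = (
            query_log_likelihood(query_tokens, tokens, translation, background) + prior
        )
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, invented for illustration only.
translation = {
    "car": {"car": 0.7, "automobile": 0.3},
    "engine": {"engine": 0.8, "motor": 0.2},
    "recipe": {"recipe": 1.0},
    "pasta": {"pasta": 1.0},
}
background = {"automobile": 0.01, "motor": 0.01}
docs = {"d1": ["car", "engine"], "d2": ["recipe", "pasta"]}
query = ["automobile", "motor"]

ranking = rank(query, docs, translation, background)
print(ranking)
```

In this toy setup the query shares no word with either document, yet `d1` ranks first because "car" and "engine" can translate into "automobile" and "motor". That is the kind of vocabulary mismatch a translation model is meant to bridge; in the paper the translation probabilities are learned from the collection rather than written by hand.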
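
Note on "Ranking Measures and Loss Functions in Learning to Rank": the "sequence of classification tasks" view can be made concrete with a small Python sketch. It follows one reading of the abstract and is not the paper's code. It assumes the ground truth is a single total order (best item first): at step s the task is to pick the true s-th item out of the items not yet placed, and the step counts as an error unless that item strictly outscores all the others still remaining. The function name `essential_loss` and the example weights are hypothetical choices for illustration.

```python
def essential_loss(scores, true_order, weights=None):
    """Weighted sum of per-step classification errors.

    scores     -- dict mapping item -> model score (higher means ranked earlier)
    true_order -- list of items, best first (assumed to be a single total order)
    weights    -- one weight per step; uniform weights if omitted (an assumption)
    """
    n = len(true_order)
    if weights is None:
        weights = [1.0] * (n - 1)
    loss = 0.0
    for s in range(n - 1):
        target = true_order[s]
        rest = true_order[s + 1:]
        # Step s is classified correctly only if the target beats every remaining item.
        correct = all(scores[target] > scores[other] for other in rest)
        if not correct:
            loss += weights[s]
    return loss


true_order = ["a", "b", "c"]

perfect = essential_loss({"a": 3.0, "b": 2.0, "c": 1.0}, true_order)
swapped_tail = essential_loss({"a": 3.0, "b": 1.0, "c": 2.0}, true_order)
reversed_all = essential_loss({"a": 1.0, "b": 2.0, "c": 3.0}, true_order)

# Hypothetical position-sensitive weights: errors near the top cost more.
top_heavy = essential_loss({"a": 1.0, "b": 3.0, "c": 2.0}, true_order, weights=[2.0, 1.0])

print(perfect, swapped_tail, reversed_all, top_heavy)
```

With weights that decrease along the list, a mistake at the top costs more than one at the bottom, which mirrors how measures such as NDCG emphasize the first positions. Choosing the weights to match a given measure is the part the paper works out; the values used here are placeholders.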