> "**Turn the search problem around to predict the input**"
> We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is **a statistical model of how a user might distill or "translate" a given document into a query**. To assess the relevance of a document to a user's query, **we estimate the probability that the query would have been generated as a translation of the document**, and factor in the user's general preferences in the form of a prior distribution over documents. We propose a simple, well motivated model of the document-to-query translation process, and describe an algorithm for learning the parameters of this model in an unsupervised manner from a collection of documents