Ranking (information retrieval)
2 Documents
  • [1602.01137] A Dual Embedding Space Model for Document Ranking (2016)
    Investigates neural word embeddings as a source of evidence in document ranking. Presented in [this Stanford course on IR](/doc/? by Chris Manning (starting at slide 44). The authors train a word2vec model but retain both the input and the output projections.
    > During ranking we map the query words into the input space and the document words into the output space, and compute a query-document relevance score by aggregating the cosine similarities across all the query-document word pairs.
    > However, when ranking a larger set of candidate documents, we find the embeddings-based approach is prone to false positives
  • Information Retrieval as Statistical Translation (Adam Berger, John Lafferty, 1999)
    > "**Turn the search problem around to predict the input**"
    > We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is **a statistical model of how a user might distill or "translate" a given document into a query**. To assess the relevance of a document to a user's query, **we estimate the probability that the query would have been generated as a translation of the document**, and factor in the user's general preferences in the form of a prior distribution over documents. We propose a simple, well motivated model of the document-to-query translation process, and describe an algorithm for learning the parameters of this model in an unsupervised manner from a collection of documents.
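For the Dual Embedding Space Model entry, the quoted scoring step (query words in the input space, document words in the output space, cosine similarities aggregated over all word pairs) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 2-d vectors, the `IN_EMB` / `OUT_EMB` tables, the function names, and the choice of a plain mean as the aggregation are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def dual_space_score(query_terms, doc_terms, in_emb, out_emb):
    """Mean cosine similarity over all (query word, document word) pairs.

    Query words are looked up in the input-projection table, document words
    in the output-projection table. Out-of-vocabulary words are skipped.
    Using the mean as the aggregation is an assumption of this sketch.
    """
    sims = [
        cosine(in_emb[q], out_emb[d])
        for q in query_terms if q in in_emb
        for d in doc_terms if d in out_emb
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Hypothetical toy embeddings; a real model would have hundreds of dimensions.
IN_EMB = {"cambridge": [1.0, 0.0], "university": [0.8, 0.6]}
OUT_EMB = {
    "college": [0.9, 0.1],
    "students": [0.7, 0.7],
    "giraffe": [-1.0, 0.2],
    "zoo": [-0.6, 0.8],
}

query = ["cambridge", "university"]
score_a = dual_space_score(query, ["college", "students"], IN_EMB, OUT_EMB)
score_b = dual_space_score(query, ["giraffe", "zoo"], IN_EMB, OUT_EMB)
print(score_a > score_b)
```

Because every document word contributes to the score, a document that merely shares vocabulary neighbourhoods with the query can score well, which is consistent with the false-positive behaviour the authors report on larger candidate sets.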
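For the Berger and Lafferty entry, the idea of scoring a document by the probability that the query was "translated" from it can be sketched as a mixture: each query word is generated by first picking a document word according to its relative frequency, then translating it. The sketch below assumes a hand-written translation table (the paper learns these parameters from data), a small probability floor in place of proper smoothing, and hypothetical names such as `translation`, `log_query_likelihood`, and `rank`.

```python
import math
from collections import Counter


def log_query_likelihood(query_terms, doc_terms, translation, floor=1e-9):
    """log p(query | doc) under a simple word-to-word translation mixture.

    p(q | doc) = sum over document words w of t(q | w) * freq(w, doc) / len(doc)
    `translation[w][q]` holds t(q | w). The `floor` value is an assumption of
    this sketch; it only keeps the logarithm finite for unmatched query words.
    """
    counts = Counter(doc_terms)
    total = sum(counts.values())
    if total == 0:
        return float("-inf")
    log_p = 0.0
    for q in query_terms:
        p_q = sum(
            translation.get(w, {}).get(q, 0.0) * c / total
            for w, c in counts.items()
        )
        log_p += math.log(max(p_q, floor))
    return log_p


def rank(query_terms, docs, translation, prior=None):
    """Order document ids by log prior + log p(query | doc), best first."""
    scored = []
    for doc_id, terms in docs.items():
        log_prior = math.log(prior[doc_id]) if prior else 0.0
        score = log_prior + log_query_likelihood(query_terms, terms, translation)
        scored.append((score, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Hypothetical translation probabilities t(query word | document word).
translation = {
    "automobile": {"car": 0.6, "automobile": 0.4},
    "engine": {"engine": 0.7, "car": 0.3},
    "banana": {"banana": 0.9, "fruit": 0.1},
    "fruit": {"fruit": 1.0},
}
docs = {
    "d_cars": ["automobile", "engine", "engine"],
    "d_fruit": ["banana", "fruit"],
}

ranking = rank(["car"], docs, translation)
print(ranking)
```

Note that `d_cars` is retrieved for the query "car" even though the word never appears in it, which is the practical appeal of the translation view: synonymy is handled by the translation table rather than by exact term matching. The optional `prior` argument stands in for the prior distribution over documents mentioned in the abstract.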