[1602.01137] A Dual Embedding Space Model for Document Ranking (2016)
Investigates neural word embeddings as a source of evidence in document ranking. Presented in [this Stanford course on IR](/doc/?uri=https%3A%2F%2Fweb.stanford.edu%2Fclass%2Fcs276%2Fhandouts%2Flecture20-distributed-representations.pdf) by Chris Manning (starting at slide 44).

They train a word2vec model but retain both the input and the output projections.

> During ranking we map the query words into the input space and the document words into the output space, and compute a query-document relevance score by aggregating the cosine similarities across all the query-document word pairs.

> However, when ranking a larger set of candidate documents, we find the embeddings-based approach is prone to false positives
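The dual-space scoring described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the vocabulary and the random `W_in`/`W_out` matrices stand in for the trained word2vec input and output projections, and it uses the common DESM formulation of comparing each query word (IN space) against the normalized centroid of the document's OUT vectors.

```python
import numpy as np

# Hypothetical IN/OUT embedding matrices standing in for a trained word2vec
# model (assumption): one row per vocabulary word.
rng = np.random.default_rng(0)
vocab = {"neural": 0, "ranking": 1, "document": 2, "embedding": 3}
dim = 8
W_in = rng.standard_normal((len(vocab), dim))
W_out = rng.standard_normal((len(vocab), dim))

def normalize(v):
    return v / np.linalg.norm(v)

def desm_score(query_words, doc_words):
    # Map query words into the IN space and document words into the OUT space,
    # then average each query word's cosine similarity against the
    # normalized centroid of the document's OUT vectors.
    doc_centroid = normalize(
        np.mean([normalize(W_out[vocab[w]]) for w in doc_words], axis=0)
    )
    sims = [normalize(W_in[vocab[w]]) @ doc_centroid for w in query_words]
    return float(np.mean(sims))

score = desm_score(["neural", "ranking"], ["document", "embedding", "ranking"])
```

Because both sides are unit-normalized, the score is a mean of cosines and always falls in [-1, 1].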