Learning Semantic Similarity for Very Short Texts (arXiv, submitted on 2 Dec 2015)

To pair short text fragments, each treated as a concatenation of separate words, an adequate
distributed sentence representation is needed.

Main contribution: a first step towards a hybrid method that
combines the strength of dense distributed representations—
as opposed to sparse term matching—with the strength of
tf-idf based methods. The combination of word embeddings and tf-idf
information might lead to a better model for semantic content
within very short text fragments.
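
To make the idea of combining word embeddings with tf-idf information concrete, here is a minimal sketch, not the paper's actual method. It assumes the simplest possible combination: each word vector is weighted by a smoothed idf score before averaging, and fragments are compared by cosine similarity. The toy vectors, the toy corpus, and the function names (idf_table, embed, cosine) are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def idf_table(corpus):
    """Smoothed inverse document frequency for each token in a tokenized corpus."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for doc in corpus for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}


def embed(tokens, vectors, idf):
    """idf-weighted mean of the word vectors of all known tokens."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in vectors:
            continue  # skip out-of-vocabulary words
        w = idf.get(tok, 1.0)  # assumption: unseen tokens get a neutral weight
        for i, x in enumerate(vectors[tok]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no known tokens: zero vector
    return [x / weight_sum for x in total]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical 2-dimensional "embeddings"; real ones would come from a trained model.
VECTORS = {
    "the": [0.5, 0.5],
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "stock": [0.0, 1.0],
    "market": [0.1, 0.9],
}
CORPUS = [["the", "cat"], ["the", "kitten"], ["the", "stock", "market"]]

IDF = idf_table(CORPUS)
cat_vec = embed(["the", "cat"], VECTORS, IDF)
kitten_vec = embed(["the", "kitten"], VECTORS, IDF)
market_vec = embed(["the", "stock", "market"], VECTORS, IDF)

sim_close = cosine(cat_vec, kitten_vec)
sim_far = cosine(cat_vec, market_vec)
print(f"cat vs kitten: {sim_close:.3f}, cat vs stock market: {sim_far:.3f}")
```

Because "the" occurs in every toy document, it receives the lowest idf weight, so the fragment vectors are pulled towards the rarer, more informative words. A fixed idf weighting like this is only one possible design; weights could also be learned from data.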