SIF embeddings
"Smoothed Inverse Frequency": a linear representation of a sentence which is better than the simple average of the embeddings of its words 2 ideas: - assign to each word a weighting that depends on the frequency of the word it the corpus (reminiscent of TF-IDF) - some denoising (removing the component from the top singular direction) Todo (?): check implementation as a [sklearn Vectorizer](https://github.com/ChristophAlt/embedding_vectorizer)