A Simple but Tough-to-Beat Baseline for Sentence Embeddings (2017)(About) > Use word embeddings computed using one of the popular methods on unlabeled corpus like Wikipedia, represent the sentence by a weighted average of the word vectors, and then modify them a bit using PCA/SVD
See also [youtube: Sanjeev Arora on "A theoretical approach to semantic representations"](https://www.youtube.com/watch?v=KR46z_V0BVw)
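The weighted-average-plus-PCA recipe quoted above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the paper's reference implementation: it assumes pretrained word vectors and unigram probabilities are supplied as dicts, uses the smooth-inverse-frequency weight `a / (a + p(w))`, and removes the projection on the first singular vector of the sentence matrix; all names (`sif_embeddings`, `word_vecs`, `word_probs`) are made up for the example.

```python
import numpy as np

def sif_embeddings(sentences, word_vecs, word_probs, a=1e-3):
    """Sketch of the SIF baseline: weighted average of word vectors,
    then remove the common component found via SVD."""
    embs = []
    for sent in sentences:
        # Weight each word vector by a / (a + p(w)), then average.
        vecs = np.array([word_vecs[w] * (a / (a + word_probs[w])) for w in sent])
        embs.append(vecs.mean(axis=0))
    X = np.array(embs)
    # First right singular vector of the sentence matrix = common component.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    u = vt[0]
    # Subtract each sentence's projection onto that component.
    return X - X @ np.outer(u, u)
```

The interesting design point is the last step: the first singular vector tends to capture syntax/frequency effects shared by all sentences, so removing it sharpens the semantic signal.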
An overview of word embeddings and their connection to distributional semantic models - AYLIEN (2016)(About) > While on the surface DSMs and word embedding models use varying algorithms to learn word representations – the former count, the latter predict – both types of model fundamentally act on the same underlying statistics of the data, i.e. the co-occurrence counts between words...
> These results are in contrast to the general consensus that word embeddings are superior to traditional methods. Rather, they indicate that it typically makes no difference whatsoever whether word embeddings or distributional methods are used. What really matters is that your hyperparameters are tuned and that you utilize the appropriate pre-processing and post-processing steps.
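The claim that count models and predict models act on the same underlying statistics is easy to see from the count side: a classic DSM pipeline is just co-occurrence counts, a PPMI reweighting, and a truncated SVD. Below is a minimal, illustrative sketch of that pipeline (not any particular library's API; the function name and parameters are invented for the example).

```python
import numpy as np

def ppmi_svd_vectors(corpus, window=2, dim=2):
    """Count-based DSM sketch: co-occurrence counts -> PPMI -> truncated SVD."""
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    # Symmetric co-occurrence counts within a fixed window.
    C = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if i != j:
                    C[idx[w], idx[sent[j]]] += 1
    total = C.sum()
    pw = C.sum(axis=1) / total   # marginal word probabilities
    pc = C.sum(axis=0) / total   # marginal context probabilities
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log((C / total) / np.outer(pw, pc))
    # Positive PMI: clip negatives and undefined entries to zero.
    ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)
    # Truncated SVD gives dense word vectors, like an embedding matrix.
    U, S, _ = np.linalg.svd(ppmi)
    return {w: (U[:, :dim] * S[:dim])[idx[w]] for w in vocab}
```

The window size, the PPMI clipping, and the number of SVD dimensions are exactly the sort of hyperparameters the quote says matter more than the count-vs-predict choice.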