An overview of word embeddings and their connection to distributional semantic models - AYLIEN (2016) > While on the surface DSMs and word embedding models use varying algorithms to learn word representations – the former count, the latter predict – both types of model fundamentally act on the same underlying statistics of the data, i.e. the co-occurrence counts between words...
> These results are in contrast to the general consensus that word embeddings are superior to traditional methods. Rather, they indicate that it typically makes no difference whatsoever whether word embeddings or distributional methods are used. What really matters is that your hyperparameters are tuned and that you utilize the appropriate pre-processing and post-processing steps.
Text Classification With Word2Vec - DS lore Overall, we won’t be throwing away our SVMs any time soon in favor of word2vec but it has it’s place in text classification.
> 1. SVM’s are pretty great at text classification tasks
> 2. Models based on simple averaging of word-vectors can be surprisingly good too (given how much information is lost in taking the average)
> 3. but they only seem to have a clear advantage when there is ridiculously little labeled training data
> Update 2017: actually, the best way to utilise the pretrained embeddings would probably be this [using keras](https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html)