Word2Bits - Quantized Word Vectors (2018) (About) > We show that high quality quantized word vectors using 1-2 bits per parameter can be learned by introducing a quantization function into Word2Vec. We furthermore show that training with the quantization function acts as a regularizer.
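To make the idea concrete: a 1-bit quantization function maps each parameter onto one of two values. A minimal NumPy sketch of such a function (the quantization levels and the training details here are assumptions for illustration, not the paper's exact recipe):

```python
import numpy as np

def quantize_1bit(v, scale=1/3):
    """Map every parameter to +scale or -scale, so each one can be
    stored in a single bit. The scale value is an illustrative choice,
    not necessarily the one used in the paper."""
    return np.where(v >= 0, scale, -scale)

# In training, the quantized vectors would be used in the forward pass
# while full-precision copies receive the gradient updates; this sketch
# shows only the quantization mapping itself.
w = np.array([0.12, -0.40, 0.03, -0.07])
print(quantize_1bit(w))  # -> [ 0.333... -0.333...  0.333... -0.333...]
```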
An overview of word embeddings and their connection to distributional semantic models - AYLIEN (2016) (About) > While on the surface DSMs and word embedding models use varying algorithms to learn word representations – the former count, the latter predict – both types of model fundamentally act on the same underlying statistics of the data, i.e. the co-occurrence counts between words...
> These results are in contrast to the general consensus that word embeddings are superior to traditional methods. Rather, they indicate that it typically makes no difference whatsoever whether word embeddings or distributional methods are used. What really matters is that your hyperparameters are tuned and that you utilize the appropriate pre-processing and post-processing steps.
Text Classification With Word2Vec - DS lore (About) > Overall, we won’t be throwing away our SVMs any time soon in favor of word2vec, but it has its place in text classification.
> 1. SVM’s are pretty great at text classification tasks
> 2. Models based on simple averaging of word-vectors can be surprisingly good too (given how much information is lost in taking the average)
> 3. But they only seem to have a clear advantage when there is ridiculously little labeled training data
> Update 2017: actually, the best way to utilise the pretrained embeddings would probably be this [using keras](https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html)
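The linked Keras post boils down to loading the pretrained vectors into a frozen `Embedding` layer. A minimal sketch of that pattern (the sizes and the toy `word_index`/`pretrained` dicts are placeholders, not the post's exact code):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, initializers

vocab_size, embedding_dim, max_len = 20000, 100, 200

# Placeholder vocabulary and pretrained vectors; in practice these come
# from your tokenizer and a pretrained file (word2vec, GloVe, ...).
word_index = {"good": 1, "bad": 2}
pretrained = {w: np.random.rand(embedding_dim) for w in word_index}

# Copy each pretrained vector into the row given by its token id.
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, i in word_index.items():
    if i < vocab_size and word in pretrained:
        embedding_matrix[i] = pretrained[word]

model = keras.Sequential([
    keras.Input(shape=(max_len,)),
    # Frozen pretrained embeddings acting as fixed features.
    layers.Embedding(vocab_size, embedding_dim,
                     embeddings_initializer=initializers.Constant(embedding_matrix),
                     trainable=False),
    layers.GlobalAveragePooling1D(),  # simple word-vector averaging
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```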
Sample code to benchmark a few text categorization models and test whether word embeddings like word2vec can improve text classification accuracy.
The sample code (based on scikit-learn) includes an embedding vectorizer that is given an embedding dataset and vectorizes texts by taking the mean of all the vectors corresponding to individual words.
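A minimal sketch of such a mean-embedding vectorizer as a scikit-learn transformer (the class name, whitespace tokenization, and zero-vector fallback are illustrative choices, not necessarily the sample code's exact implementation):

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class MeanEmbeddingVectorizer(BaseEstimator, TransformerMixin):
    """Vectorizes a text as the mean of the embeddings of its words.

    `word2vec` is any dict-like mapping word -> 1-D numpy array.
    """

    def __init__(self, word2vec):
        self.word2vec = word2vec
        self.dim = len(next(iter(word2vec.values())))

    def fit(self, X, y=None):
        return self  # nothing to learn: the embeddings are pretrained

    def transform(self, X):
        # Average the vectors of in-vocabulary words; fall back to a
        # zero vector for texts with no known words.
        return np.array([
            np.mean([self.word2vec[w] for w in text.split()
                     if w in self.word2vec] or [np.zeros(self.dim)],
                    axis=0)
            for text in X
        ])

# Usage: plug the vectorizer into a standard scikit-learn pipeline
# (toy embeddings and labels here, just to show the interface).
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression

word2vec = {"good": np.random.rand(100), "bad": np.random.rand(100)}
clf = make_pipeline(MeanEmbeddingVectorizer(word2vec), LogisticRegression())
clf.fit(["good movie", "bad movie"], [1, 0])
```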