- Subword-level embeddings: several methods:
> Word embeddings have been augmented with subword-level information for many applications such as named entity recognition, POS tagging, ..., and language modeling.
> Most of these models employ a CNN or a BiLSTM that takes as input the characters of a word and outputs a character-based word representation.
> For incorporating character information into pre-trained embeddings, however, **character n-gram features** have been shown to be more powerful. [#FastText] (See the n-gram sketch after this list.)
> Subword units based on **byte-pair encoding** have been found to be particularly useful for machine translation, where they have replaced words as the standard input units. (See the BPE sketch after this list.)
- Out-of-vocabulary (OOV) words: subword-level models can compose a vector for an unseen word from its character n-grams, as the first sketch below shows.
- Polysemy: multi-sense embeddings learn a separate vector for each sense of a word.
- [Towards a Seamless Integration of Word Senses into Downstream NLP Applications](/doc/?uri=https%3A%2F%2Farxiv.org%2Fabs%2F1710.06632)
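A minimal sketch of the FastText-style character n-gram idea referenced above: a word is wrapped in boundary markers, its n-grams (lengths 3 to 6, as in the FastText paper) are extracted, and its vector is the sum of its n-gram vectors, which is also how an OOV word still gets a representation. The function names and the `ngram_vectors` lookup table are illustrative, not FastText's actual API.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Extract character n-grams, FastText-style.

    The word is wrapped in boundary markers '<' and '>' so that
    prefixes and suffixes are distinguishable from word-internal
    n-grams.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams

def word_vector(word, ngram_vectors, dim=100):
    """Compose a word vector as the sum of its n-gram vectors.

    Because the vector is built from subword units, an OOV word
    still gets a representation as long as some of its n-grams were
    seen in training. `ngram_vectors` is an illustrative dict
    mapping n-gram -> np.ndarray.
    """
    vec = np.zeros(dim)
    for gram in char_ngrams(word):
        if gram in ngram_vectors:
            vec += ngram_vectors[gram]
    return vec

print(char_ngrams("where", n_min=3, n_max=3))
# ['<wh', 'whe', 'her', 'ere', 're>']
```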
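And a toy byte-pair encoding learner, adapted from the algorithm published in the BPE-for-NMT paper (Sennrich et al., 2016): repeatedly count adjacent symbol pairs across the vocabulary and merge the most frequent pair into a new symbol. The corpus and the number of merges here are illustrative; `</w>` marks the end of a word.

```python
import re
from collections import Counter

def get_pair_stats(vocab):
    """Count frequencies of adjacent symbol pairs in the vocabulary."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[(symbols[i], symbols[i + 1])] += freq
    return pairs

def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    bigram = re.escape(' '.join(pair))
    # Lookarounds ensure we only match whole symbols, not substrings.
    pattern = re.compile(r'(?<!\S)' + bigram + r'(?!\S)')
    for word, freq in vocab.items():
        merged[pattern.sub(''.join(pair), word)] = freq
    return merged

# Toy corpus: each word is a sequence of space-separated symbols.
vocab = {'l o w </w>': 5, 'l o w e r </w>': 2,
         'n e w e s t </w>': 6, 'w i d e s t </w>': 3}

for _ in range(10):  # the number of merges is a hyperparameter
    pairs = get_pair_stats(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_pair(best, vocab)
    print(best)
```

After enough merges, frequent words collapse into single symbols while rare words stay split into smaller subword units, which is how BPE provides an open vocabulary with a fixed symbol inventory.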