About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Alejandro Moreo
- sl:arxiv_num : 1911.11506
- sl:arxiv_published : 2019-11-26T13:11:00Z
- sl:arxiv_summary : Pre-trained word embeddings encode general word semantics and lexical
regularities of natural language, and have proven useful across many NLP tasks,
including word sense disambiguation, machine translation, and sentiment
analysis, to name a few. In supervised tasks such as multiclass text
classification (the focus of this article) it seems appealing to enhance word
representations with ad-hoc embeddings that encode task-specific information.
We propose (supervised) word-class embeddings (WCEs), and show that, when
concatenated to (unsupervised) pre-trained word embeddings, they substantially
facilitate the training of deep-learning models in multiclass classification by
topic. We show empirical evidence that WCEs yield a consistent improvement in
multiclass classification accuracy, using four popular neural architectures and
six widely used and publicly available datasets for multiclass text
classification. Our code that implements WCEs is publicly available at
https://github.com/AlexMoreo/word-class-embeddings
- sl:arxiv_title : Word-Class Embeddings for Multiclass Text Classification
- sl:arxiv_updated : 2019-11-26T13:11:00Z
- sl:bookmarkOf : https://arxiv.org/abs/1911.11506
- sl:creationDate : 2020-10-11
- sl:creationTime : 2020-10-11T19:29:28Z
- sl:relatedDoc : http://www.semanlink.net/doc/2020/02/joint_embedding_of_words_and_la
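The abstract's core idea, supervised word-class embeddings (WCEs) concatenated to unsupervised pretrained word embeddings, can be sketched as follows. This is a minimal illustration, not the paper's exact method: the term-class correlation chosen here (normalized co-occurrence counts) and all names are assumptions for the sake of the example.

```python
import numpy as np

def word_class_embeddings(X, y, n_classes):
    """Hypothetical sketch: derive one |C|-dimensional supervised
    embedding per vocabulary term from a term-document matrix X
    (shape: n_docs x n_terms) and integer document labels y."""
    n_docs = X.shape[0]
    # One-hot document-class indicator matrix (n_docs x n_classes)
    Y = np.zeros((n_docs, n_classes))
    Y[np.arange(n_docs), y] = 1.0
    # Term-class co-occurrence: how strongly each term associates
    # with each class across the labeled training set
    W = X.T @ Y                                   # (n_terms x n_classes)
    # Row-normalize so each term gets a distribution over classes
    W = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    return W

# Per the abstract, the supervised vectors are concatenated to the
# pretrained ones, e.g.:
#   E = np.hstack([pretrained_embeddings, word_class_embeddings(X, y, C)])
```

The resulting matrix has one row per vocabulary term and one column per class, so concatenation extends each word vector by |C| task-specific dimensions.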