Semanlink - Word embeddings

Parents:

Wikipedia Word embeddings

[Best presentation](doc:2020/06/on_word_embeddings) about word embeddings, by [Sebastian Ruder](tag:sebastian_ruder)

Capture the idea that one can express “meaning” of words using a vector, so that the cosine of the angle between the vectors captures semantic similarity.

A set of language modeling and feature learning techniques where words from the vocabulary (and possibly phrases thereof) are mapped to vectors of real numbers in a low dimensional space, relative to the vocabulary size.

~ Context-predicting models

~ Latent feature representations of words

Paramaterized function mapping words in some language to vectors (perhaps 200 to 500 dimensions). Conceptually it involves a mathematical embedding from a space with one dimension per word to a continuous vector space with much lower dimension.

"Plongement lexical" in French

Word embedding of a word: a succinct representation of the distribution of other words around this word.

Methods to generate the mapping include neural networks, dimensionality reduction on the word co-occurrence matrix, probabilistic models, and explicit representation in terms of the context in which words appear.

In the new generation of models, the vector estimation problem is handled as a supervised task, where the weights in a word vector are set to maximize the probability of the contexts in which the word is observed in the corpus

The mapping may be generated training a neural network on a large corpus to predict a word given a context (Continuous Bag Of Words model) or to predict the context given a word (skip gram model). The context is a window of surrounding words.

The most known software to produce word embeddings is Tomas Mikolov's Word2vec. Pre-trained word embeddings are also available in the word2vec code.google page.

Applications:

- search document ranking
- boost the performance in NLP tasks such as syntactic parsing and sentiment analysis.

Related Tags:

Descendants

67 Documents (Long List)

I.A.B sur Twitter : "When we "know the meaning" of a word, what is it that we know? For example, what does knowing the words "dolphin" and "tiger" entail?..."

Tags:

2022-04-15 About

Pablo Castro sur Twitter : "Random finding of the day for word embeddings: vec("apple")-vec("apples") yields a vector close to ipad, ipod, etc. (apples removes the "fruitness" from apple)

Tags:

2020-12-18 About

BERT Word Embeddings Tutorial · Chris McCormick

Tags:

2020-07-06 About

[1910.00163] Specializing Word Embeddings (for Parsing) by Information Bottleneck

Tags:

2020-06-29 About

On word embeddings

Tags:

2020-06-05 About

Approximating the Softmax for Learning Word Embeddings

Tags:

2020-06-04 About

[1709.03933] Hash Embeddings for Efficient Word Representations

Tags:

2020-05-19 About

[1810.04882] Towards Understanding Linear Word Analogies

Tags:

2019-06-24 About

Kawin Ethayarajh sur Twitter : "When and why does king - man + woman = queen?"

Tags:

Word embeddings

2019-06-24 About

Word Embeddings: 6 Years Later

Tags:

Word embeddings

2019-06-03 About

Word Embeddings: Explaining their properties – Off the convex path (2016)

Tags:

2019-03-20 About

Word Mover's Embedding: From Word2Vec to Document Embedding (2018)

Tags:

2018-11-10 About

Can Global Semantic Context Improve Neural Language Models? - Apple (2018)

Tags:

2018-09-27 About

Linear algebraic structure of word meanings – Off the convex path

Tags:

2018-09-20 About

Simple and efficient semantic embeddings for rare words, n-grams, and language features – Off the convex path

Tags:

2018-09-18 About

A Latent Variable Model Approach to PMI-based Word Embeddings (2016)

Tags:

2018-08-28 About

[1601.03764] Linear Algebraic Structure of Word Senses, with Applications to Polysemy

Tags:

> Here it is shown that multiple word senses reside
in linear superposition within the word
embedding and simple sparse coding can recover
vectors that approximately capture the
senses

> Each extracted word sense is accompanied by one of about  2000 “discourse atoms” that gives a succinct description of which other words co-occur with that word sense.

> The success of the approach is mathematically explained using a variant of
the random walk on discourses model

("random walk": a generative model for language). Under the assumptions of this model,  there
exists a linear relationship between the vector of a
word w and the vectors of the words in its contexts (It is not the average of the words in w's context, but in a given corpus the matrix of the linear relationship does not depend on w. It can be estimated, and so we can compute the embedding of a word from the contexts it belongs to)

[Related blog post](/doc/?uri=https%3A%2F%2Fwww.offconvex.org%2F2016%2F07%2F10%2Fembeddingspolysemy%2F)

2018-08-28 About

zalandoresearch/flair: A very simple framework for state-of-the-art NLP

Tags:

2018-08-24 About

Contextual String Embeddings for Sequence Labeling (2018)

Tags:

Word embeddings

2018-08-24 About

what are the pros and cons of the various unsupervised word and sentence/ document embedding models? - Quora

Tags:

2018-08-19 About

Sanjeev Arora on "A theoretical approach to semantic representations" - YouTube (2016)

Tags:

2018-06-10 About

A Word Embedding Approach to Predicting the Compositionality of Multiword Expressions (2015)

Tags:

2018-06-08 About

The Current Best of Universal Word Embeddings and Sentence Embeddings (2018)

Tags:

2018-05-30 About

A Theoretical Approach to Semantic Coding and Hashing | Simons Institute for the Theory of Computing (2016)

Tags:

2018-05-26 About

A simple spell checker built from word vectors – Ed Rushton – Medium

Tags:

2018-05-25 About

NLP: Any libraries/dictionaries out there for fixing common spelling errors? - Part 2 & Alumni - Deep Learning Course Forums

Tags:

2018-05-18 About

NLP using Word Vectors with Spacy - CLDSPN | Kaggle

Tags:

2018-05-11 About

Dict2vec : Learning Word Embeddings using Lexical Dictionaries

Tags:

2018-05-11 About

Improving Word Embedding Compositionality using Lexicographic Definitions

Tags:

2018-05-10 About

[1803.05651] Word2Bits - Quantized Word Vectors

Tags:

2018-03-20 About

Deep learning with word embeddings improves biomedical named entity recognition | Bioinformatics | Oxford Academic (2017)

Tags:

2018-03-05 About

Improving the Compositionality of Word Embeddings (2017)

Tags:

(MS thesis, a [paper at TheWebConf 2018](/doc/?uri=https%3A%2F%2Fdoi.org%2F10.1145%2F3178876.3186007))

> This thesis explores a method to find better encodings of meaning a computer can work with. We specifically want to combine encodings of word meanings in such a way that a good encoding of their joint meaning is created. The act of combining multiple representations of meaning into a new representation of meaning is called semantic composition.

Analysis of four word embeddings (Word2Vec, GloVe, fastText and Paragram)  in terms of their semantic compositionality. A method to tune these embeddings towards better compositionality, using a simple neural network architecture with definitions and lemmas from WordNet.

> Since dictionary definitions are semantically similar to their associated lemmas, they are the ideal candidate for our tuning method, as well as evaluating for compositionality. Our architecture allows for the embeddings to be composed using simple arithmetic operations, which makes these embeddings specifically suitable for production applications such as web search and data mining. We also explore more elaborate and involved compositional models, such as recurrent composition and convolutional composition.

2018-02-13 About

[1412.6623] Word Representations via Gaussian Embedding

Tags:

2018-01-28 About

RESEARCH TRACK: Web Content Analysis, Semantics and Knowledge

Tags:

[CFP](https://www2018.thewebconf.org/call-for-papers/research-tracks-cfp/web-content-analysis/)

> In previous years, ‘content analysis’ and ‘semantic and knowledge’ were in separate track. This year, we combined these tracks to emphasize the close relationship between these topics; **the use of content to curate knowledge and the use of knowledge to guide content analysis and intelligent usage**.

Some of the accepted papers:
### [Large-Scale Hierarchical Text Classification with Recursively Regularized Deep Graph-CNN](https://doi.org/10.1145/3178876.3186005)

[Hierarchical Text Classification](/tag/nlp_hierarchical_text_classification): Text classification to a hierarchical taxonomy of topics, using graph representation of text, and CNN over this graph

Renvoie à ce qui a été vu dans le tutorial "Graph-based Text Representations"

from the abstract:

> a graph-CNN based deep learning model to first convert texts to graph-of-words, and then use graph convolution operations to convolve the word graph. Graph-of-words representation of texts has the advantage of capturing non-consecutive and long-distance semantics. CNN models have the advantage of learning different level of semantics. To further leverage the hierarchy of labels, we regularize the deep architecture with the dependency among labels

Conversion of text to graph: potentially given a single document

### [Weakly-supervised Relation Extraction by Pattern-enhanced Embedding Learning](https://doi.org/10.1145/3178876.3186024 )

Extraction de relations de corpus de textes de façon semi-supervisée, dans un contexte où on a peu de données labellisées décrivant les relations.

Par exemple, des données labellisées indique que le texte "Beijing, capital of China" correspond à la relation entre entités : ("Beijing", "Capital Of", "China), et on voudrait pouvoir extraire les entités et relations pertinentes à partir de texte tel que "Paris, France's capital,..."

Le papier décrit une méthode qui combine deux modules, l'un basé sur l'extraction automatique de patterns (par ex "[Head], Capital Of [Tail]") et l'autre sur la "sémantique distributionnelle" (du type "word embeddings"). Ces deux modules collaborent, le premier permettant de créer des instances de relations augmentant la base de connaissance sur lequel entrainer le second, et le second aidant le premier à déterminer des patterns informatifs ("co-entrainement")

### [Scalable Instance Reconstruction in Knowledge Bases via Relatedness Affiliated Embedding](https://doi.org/10.1145/3178876.3186017)

Knowledge base completion problem: usually, it is formulated as a link prediction problem, but not here. A novel knowledge embedding model ("Joint Modelling and Learning of Relatedness and Embedding")

### [Improving Word Embedding Compositionality using Lexicographic Definitions](https://doi.org/10.1145/3178876.3186007)

comment obtenir les meilleures représentations de texte à partir de représentations de mots (word embeddings) ? L'auteur utilise des ressources lexicographiques (wordnet) pour ses tests : l'embedding obtenu pour la définition d'un mot est-il proche de celui du mot ?

Le papier s'appuie sur une [thèse du même auteur](https://esc.fnwi.uva.nl/thesis/centraal/files/f1554608041.pdf), claire et bien écrite.

### [CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information](https://doi.org/10.1145/3178876.3186030)

Amélioration de l'extraction de triplets (nom phrase, property, nom phrase) à partir de texte en calculant des embeddings pour les "nom phrases" (~entités)

### [Short-Text Topic Modeling via Non-negative Matrix Factorization Enriched with Local Word-Context Correlations](https://doi.org/10.1145/3178876.3186009)

Topic modeling for short texts, leveraging the word-context semantic correlations in the training

### [Towards Annotating Relational Data on the Web with Language Models](https://doi.org/10.1145/3178876.3186029)

### A paper by [David Blei](/tag/david_blei): (Dynamic Embeddings for Language Evolution)

2018-01-27 About

[1712.09405] Advances in Pre-Training Distributed Word Representations

Tags: