Semanlink - [1909.04164] Knowledge Enhanced Contextual Word Representations

Tags:

About This Document

sl:arxiv_author :
sl:arxiv_firstAuthor : Matthew E. Peters
sl:arxiv_num : 1909.04164
sl:arxiv_published : 2019-09-09T21:18:50Z
sl:arxiv_summary : Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities. We propose a general method to embed multiple knowledge bases (KBs) into large scale models, and thereby enhance their representations with structured, human-curated knowledge. For each KB, we first use an integrated entity linker to retrieve relevant entity embeddings, then update contextual word representations via a form of word-to-entity attention. In contrast to previous approaches, the entity linkers and self-supervised language modeling objective are jointly trained end-to-end in a multitask setting that combines a small amount of entity linking supervision with a large amount of raw text. After integrating WordNet and a subset of Wikipedia into BERT, the knowledge enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task and downstream performance on relationship extraction, entity typing, and word sense disambiguation. KnowBert's runtime is comparable to BERT's and it scales to large KBs.@en
sl:arxiv_title : Knowledge Enhanced Contextual Word Representations@en
sl:arxiv_updated : 2019-10-31T00:14:48Z
sl:bookmarkOf :
sl:creationDate : 2020-05-13
sl:creationTime : 2020-05-13T01:44:51Z

File info

Bookmark of: https://www.aclweb.org/anthology/D19-1005.pdf

Linked From

[2110.08151] mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models

Tags:

[Ikuya Yamada on Twitter: "Is entity representation effective to improve multilingual language models?..."](doc:2022/04/ikuya_yamada_sur_twitter_is_)

> Recent studies have shown that multilingual pretrained language models can be effectively improved with cross-lingual alignment information from Wikipedia entities. However, **existing methods only exploit entity information in pretraining and do not explicitly use entities in downstream tasks**. In this study, we explore the **effectiveness of leveraging entity representations for downstream cross-lingual tasks**.
>
> the key insight is that incorporating entity representations into the input allows us to extract more language-agnostic features.

[Github](https://github.com/studio-ousia/luke)

> Entity representations are known to enhance language models in mono-lingual settings (Zhang et al., 2019: [ERNIE](tag:ernie.html); Peters et al., 2019: [[1909.04164] Knowledge Enhanced Contextual Word Representations](doc:2020/05/1909_04164_knowledge_enhanced); Wang et al., 2021: [[1911.06136] KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation](doc:2020/11/1911_06136_kepler_a_unified_); Xiong et al., 2020; Yamada et al., 2020: [[2010.01057] LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention](doc:2020/11/2010_01057_luke_deep_context)) presumably by introducing real-world knowledge. We show that using entity representations facilitates cross-lingual transfer by providing language-independent features.
>
> Multilingual extension of LUKE. The model is trained with the multilingual masked language modeling (MLM) task as well as the masked entity prediction (MEP) task with Wikipedia entity embeddings.

> We investigate two ways of using the entity representations in cross-lingual transfer tasks:
> 1. perform entity linking for the input text, and append the detected entity tokens to the input sequence. The entity tokens are expected to provide language-independent features to the model
> 2. use the entity [MASK] token from the MEP task as a language-independent feature extractor.
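
The first option in the quote above (append detected entity tokens to the input) can be sketched roughly as follows. This is a toy illustration, not the mLUKE implementation: the dictionary lookup stands in for a real entity linker, and `MENTION_TO_ENTITY`, `link_entities`, `build_input`, and the `[ENT:...]` token format are all hypothetical names.

```python
from typing import Dict, List

# Hypothetical mention dictionary standing in for a real entity linker.
# Mentions in different languages map to the same entity identifier.
MENTION_TO_ENTITY: Dict[str, str] = {
    "tokyo": "Tokyo",
    "東京": "Tokyo",
    "japan": "Japan",
    "日本": "Japan",
}


def link_entities(words: List[str]) -> List[str]:
    """Return the entities detected in `words`, in order, without duplicates."""
    entities: List[str] = []
    for word in words:
        entity = MENTION_TO_ENTITY.get(word.lower())
        if entity is not None and entity not in entities:
            entities.append(entity)
    return entities


def build_input(words: List[str]) -> List[str]:
    """Append one entity token per detected entity to the word sequence."""
    return words + [f"[ENT:{e}]" for e in link_entities(words)]


english = build_input(["Tokyo", "is", "in", "Japan"])
japanese = build_input(["東京", "は", "日本", "に", "ある"])
```

Both inputs end with the same two entity tokens, which is the sense in which the appended tokens are expected to act as language-independent features.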

2022-04-17 About

[2102.07043] Reasoning Over Virtual Knowledge Bases With Open Predicate Relations

Tags:

> a method for constructing **a virtual KB (VKB) trained entirely from text**

Open Predicate Query Language (OPQL): a method for constructing a virtual knowledge base (VKB) from text alone. The VKB supports KB reasoning and open-domain QA, and addresses the incompleteness of structured knowledge bases.

> OPQL constructs a VKB by **encoding and indexing a set of relation mentions** in a way that naturally enables reasoning and can be trained without any structured supervision.

> can be used as an **external memory integrated into a language model**

Cf. this earlier paper: [[2002.10640] Differentiable Reasoning over a Virtual Knowledge Base](doc:2020/07/2002_10640_differentiable_rea). Unlike that work, OPQL does not require an initial structured KB for distant supervision.

> The key idea in constructing the OPQL VKB is to use a dual-encoder pre-training process, similar to [[1906.03158] Matching the Blanks: Distributional Similarity for Relation Learning](doc:2021/05/1906_03158_matching_the_blank)

Related work section refers to [[1909.04164] Knowledge Enhanced Contextual Word Representations](doc:2020/05/1909_04164_knowledge_enhanced). Also refers to [[2007.00849] Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge](doc:2020/07/2007_00849_facts_as_experts_) (some authors in common)

2021-06-20 About

[1911.03681] E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT

Tags:

> way of **injecting factual knowledge about entities into the pretrained BERT model**.

(Feeding entity vectors into BERT as if they were wordpiece vectors, without additional encoder pretraining)

> **We align [Wikipedia2Vec](tag:wikipedia2vec) entity vectors (Yamada et al., 2016) with BERT's native wordpiece vector space and use the aligned entity vectors as if they were wordpiece vectors**. The resulting entity-enhanced version of BERT (called E-BERT) is similar in spirit to [ERNIE](tag:ernie) (Zhang et al., 2019) and [KnowBert](tag:knowbert) (Peters et al., 2019), but it **requires no expensive further pretraining of the BERT encoder**.
>
> Our vector space alignment strategy is inspired by cross-lingual word vector alignment
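
To make the alignment idea concrete, here is a minimal sketch. It assumes a plain least-squares linear map fitted on vocabulary items that have a vector in both spaces, which is a common recipe in cross-lingual word vector alignment; the quoted text does not spell out the fitting procedure, so treat that choice as an assumption. All data here is synthetic, and `fit_alignment` and the variable names are hypothetical.

```python
import numpy as np


def fit_alignment(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Fit W minimising ||src @ W - tgt||^2 over the shared vocabulary rows."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W


rng = np.random.default_rng(0)
n_shared, d_src, d_tgt = 200, 8, 6

# Synthetic stand-ins: rows are words present in both vocabularies.
src_words = rng.normal(size=(n_shared, d_src))   # source-space word vectors
true_map = rng.normal(size=(d_src, d_tgt))       # unknown in practice
tgt_words = src_words @ true_map                 # target-space wordpiece vectors

W = fit_alignment(src_words, tgt_words)

# An entity vector from the source space, mapped into the target space so it
# could be fed to the encoder in place of a wordpiece vector.
entity_vec = rng.normal(size=(1, d_src))
aligned_entity = entity_vec @ W
```

The map is fitted on words only and then applied to entity vectors, which relies on words and entities sharing one source space. The encoder itself is left untouched, which matches the note that no further pretraining is required.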

Related work on Entity-enhanced BERT:

> ([ERNIE](doc:2019/08/_1905_07129_ernie_enhanced_la) and [KnowBert](doc:2020/05/1909_04164_knowledge_enhanced)) are based on the design principle that BERT be adapted to entity vectors. They introduce new encoder layers to feed pretrained entity vectors into the Transformer, and they require additional pretraining to integrate the new parameters. In contrast, E-BERT’s design principle is that entity vectors be adapted to BERT.
>
> Two other knowledge-enhanced MLMs are [KEPLER](doc:2020/11/1911_06136_kepler_a_unified_) (Wang et al., 2019c) and K-Adapter (Wang et al., 2020)... Their factual knowledge does not stem from entity vectors – instead, they are trained in a multi-task setting on relation classification and knowledge base completion.

Not to be confused with [[2009.02835] E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce](doc:2020/12/2009_02835_e_bert_a_phrase_a)

2021-01-12 About

Documents with similar tags (experimental)

[1902.06006] Contextual Word Representations: A Contextual Introduction

Tags:

2022-07-08 About

[1911.06136] KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation

Tags:

2020-11-03 About

[1906.07241] Barack's Wife Hillary: Using Knowledge-Graphs for Fact-Aware Language Modeling

Tags:

2020-05-11 About