High-depth African genomes inform human migration and health | Nature (2020)
2021-05-08T14:02:59Z

[1909.10506] Learning Dense Representations for Entity Retrieval
Daniel Gillick, Sayali Kulkarni, Larry Lansing, Alessandro Presta, Jason Baldridge, Eugene Ie, Diego Garcia-Olano
2019-09-23T17:52:34Z 2021-05-01T09:11:15Z

We show that it is feasible to perform entity linking by training a dual encoder (two-tower) model that encodes mentions and entities in the same dense vector space, where candidate entities are retrieved by approximate nearest neighbor search. Unlike prior work, this setup does not rely on an alias table followed by a re-ranker, and is thus the first fully learned entity retrieval model. We show that our dual encoder, trained using only anchor-text links in Wikipedia, outperforms discrete alias table and BM25 baselines, and is competitive with the best comparable results on the standard TACKBP-2010 dataset. In addition, it can retrieve candidates extremely fast, and generalizes well to a new dataset derived from Wikinews. On the modeling side, we demonstrate the dramatic value of an unsupervised negative mining algorithm for this task.

> We show that it is feasible to perform **entity linking by training a dual encoder (two-tower) model that encodes mentions and entities in the same dense vector space**, where candidate entities are retrieved by approximate nearest neighbor search. Unlike prior work, **this setup does not rely on an alias table followed by a re-ranker, and is thus the first fully learned entity retrieval model**. Contributions:
> - a dual encoder architecture for learning entity and mention encodings suitable for retrieval. A key feature of the architecture is that it employs a modular **hierarchy of sub-encoders that capture different aspects of mentions and entities**
> - a simple, fully unsupervised **hard negative mining** strategy that produces massive gains in retrieval performance, compared to using only random negatives
> - retrieval of high quality candidate entities very efficiently using approximate nearest neighbor search
> - outperforms discrete retrieval baselines like an alias table or BM25
> - strong retrieval performance across all 5.7 million Wikipedia entities in around 3 ms per mention
>
> Since we are using a two-tower or dual encoder architecture, **our model cannot use any kind of attention over both mentions and entities at once**, nor feature-wise comparisons as done by Francis-Landau et al. (2016). This is a fairly severe constraint – for example, **we cannot directly compare the mention span to the entity title** – but it permits retrieval with nearest neighbor search for the entire context against a single, all-encompassing representation for each entity.
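A minimal sketch of the dual-encoder retrieval setup described in the entry above: two towers map mention contexts and entity descriptions into the same vector space, training uses in-batch negatives, and retrieval is nearest neighbor search over precomputed entity vectors. The toy featurization, encoder sizes, and data below are assumptions made for the sketch; the paper's hierarchy of sub-encoders and its hard-negative mining are not reproduced here.

```python
# Sketch of a dual encoder (two-tower) for entity retrieval; the hashed
# vocabulary, tiny encoders, and toy data are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM = 10_000, 64

def hash_tokens(text):
    # Toy featurization: hash whitespace tokens into a fixed vocabulary.
    return torch.tensor([hash(t) % VOCAB for t in text.lower().split()])

class TowerEncoder(nn.Module):
    """Averages token embeddings and projects into the shared vector space."""
    def __init__(self):
        super().__init__()
        self.emb = nn.EmbeddingBag(VOCAB, DIM)
        self.proj = nn.Linear(DIM, DIM)

    def forward(self, texts):
        ids = [hash_tokens(t) for t in texts]
        flat = torch.cat(ids)
        offsets = torch.tensor([0] + [len(i) for i in ids[:-1]]).cumsum(0)
        return F.normalize(self.proj(self.emb(flat, offsets)), dim=-1)

mention_enc, entity_enc = TowerEncoder(), TowerEncoder()

# Tiny linked-text training pairs: (mention in context, gold entity description).
pairs = [
    ("he moved to [Paris] in 1920", "Paris, capital of France"),
    ("the [Paris] of Greek myth", "Paris, prince of Troy"),
    ("[Java] runs on a virtual machine", "Java, programming language"),
]

opt = torch.optim.Adam(list(mention_enc.parameters()) + list(entity_enc.parameters()), lr=1e-3)
for _ in range(200):
    m = mention_enc([p[0] for p in pairs])   # [batch, DIM] mention vectors
    e = entity_enc([p[1] for p in pairs])    # [batch, DIM] entity vectors
    scores = m @ e.T * 10.0                  # scaled dot product, in-batch negatives
    loss = F.cross_entropy(scores, torch.arange(len(pairs)))
    opt.zero_grad(); loss.backward(); opt.step()

# Retrieval: pre-compute entity vectors once, then nearest neighbor search per
# mention; brute-force dot products stand in for the ANN index here.
with torch.no_grad():
    entity_index = entity_enc([p[1] for p in pairs])
    query = mention_enc(["she flew to [Paris] for the summit"])
    best = (query @ entity_index.T).argmax(-1).item()
    print(pairs[best][1])
```

In a full system the brute-force dot product would be replaced by an approximate nearest neighbor index, which is what makes retrieval over all 5.7 million entities feasible in a few milliseconds per mention.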
[2104.14690] Entailment as Few-Shot Learner
Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
2021-04-29T22:52:26Z 2021-05-03T23:05:39Z

Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners. However, their success hinges largely on scaling model parameters to a degree that makes it challenging to train and serve. In this paper, we propose a new approach, named EFL, that can turn small LMs into better few-shot learners. The key idea of this approach is to reformulate a potential NLP task into an entailment one, and then fine-tune the model with as little as 8 examples. We further demonstrate our proposed method can be: (i) naturally combined with an unsupervised contrastive learning-based data augmentation method; (ii) easily extended to multilingual few-shot learning. A systematic evaluation on 18 standard NLP tasks demonstrates that this approach improves the various existing SOTA few-shot learning methods by 12%, and yields competitive few-shot performance with 500 times larger models, such as GPT-3.

> a new approach, named EFL, that can turn small LMs into better few-shot learners. The key idea of this approach is to reformulate a potential NLP task into an entailment one, and then fine-tune the model with as little as 8 examples
>
> For instance, we can reformulate a sentiment classification task as a textual entailment one with an input sentence S1 as x_in = [CLS] S1 [SEP] S2 [EOS], where S2 = "This indicates positive user sentiment", and let the language model M determine whether the input sentence S1 entails the label description S2

(A code sketch of this entailment reformulation appears below, after the Matching the Blanks entry.)

The History of West Africa at a Glance
2021-05-03T00:42:29Z
Maps, timeline, thematic bibliographies.

Inria Paris NLP (ALMAnaCH team) on Twitter: "#PAGnol, a new, free, GPT-3-like generative LM for French"
2021-05-04T23:23:44Z

Cites [Matching the Blanks: Distributional Similarity for Relation Learning](doc:2021/05/1906_03158_matching_the_blank)

[1906.03158] Matching the Blanks: Distributional Similarity for Relation Learning
Livio Baldini Soares, Nicholas FitzGerald, Jeffrey Ling, Tom Kwiatkowski
2019-06-07T15:26:50Z 2021-05-13T00:39:03Z

General purpose relation extractors, which can model arbitrary relations, are a core aspiration in information extraction. Efforts have been made to build general purpose extractors that represent relations with their surface forms, or which jointly embed surface forms with relations from an existing knowledge graph. However, both of these approaches are limited in their ability to generalize. In this paper, we build on extensions of Harris' distributional hypothesis to relations, as well as recent advances in learning text representations (specifically, BERT), to build task agnostic relation representations solely from entity-linked text. We show that these representations significantly outperform previous work on exemplar based relation extraction (FewRel) even without using any of that task's training data. We also show that models initialized with our task agnostic representations, and then tuned on supervised relation extraction datasets, significantly outperform the previous methods on SemEval 2010 Task 8, KBP37, and TACRED.

> a new method of learning relation representations directly from text
>
> First, we study the **ability of the Transformer neural network architecture (Vaswani et al., 2017) to encode relations between entity pairs**, and we identify a method of representation that outperforms previous work in supervised relation extraction. Then, we present a method of training this relation representation **without any supervision from a knowledge graph or human annotators**, from widely available distant supervision in the form of entity-linked text.
>
> **we assume** access to a corpus of text in which entities have been linked to unique identifiers, and we define a relation statement to be a block of text containing two marked entities.
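A small sketch of the relation-statement representation discussed in the Matching the Blanks entry above, assuming the HuggingFace `transformers` library: the two entities are wrapped in marker tokens, the statement is encoded with BERT, and the hidden states at the two entity-start markers are concatenated. The marker token names, the pooling choice, and the [BLANK] substitution example are illustrative assumptions; the markers added here are untrained, so the printed similarity only becomes meaningful after training with the paper's matching-the-blanks objective.

```python
# Sketch of an "entity markers" relation representation; marker names and the
# concatenation of the two start-marker states are assumptions for this demo.
import torch
from transformers import BertTokenizerFast, BertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
markers = ["[E1]", "[/E1]", "[E2]", "[/E2]", "[BLANK]"]
tokenizer.add_special_tokens({"additional_special_tokens": markers})

model = BertModel.from_pretrained("bert-base-uncased")
model.resize_token_embeddings(len(tokenizer))  # make room for the new markers

def relation_representation(text):
    """Encode a relation statement whose two entities are already marked."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]   # [seq_len, 768]
    ids = enc["input_ids"][0].tolist()
    e1 = ids.index(tokenizer.convert_tokens_to_ids("[E1]"))
    e2 = ids.index(tokenizer.convert_tokens_to_ids("[E2]"))
    return torch.cat([hidden[e1], hidden[e2]])       # [1536] relation vector

# Statements about the same entity pair should end up close together; the
# matching-the-blanks objective replaces entity mentions with [BLANK] during
# training so the model cannot simply memorize the surface names.
r1 = relation_representation("[E1] Mozart [/E1] was born in [E2] Salzburg [/E2] .")
r2 = relation_representation("[E1] [BLANK] [/E1] spent his childhood in [E2] Salzburg [/E2] .")
print(torch.cosine_similarity(r1, r2, dim=0).item())
```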
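And, referring back to the Entailment as Few-Shot Learner entry above, a sketch of the entailment reformulation itself: each class gets a label description S2, and the model scores whether the input sentence S1 entails it. An off-the-shelf NLI model (roberta-large-mnli) stands in for the paper's few-shot fine-tuned model, and the label descriptions are assumptions; only the input reformulation and inference side is shown, not the 8-example fine-tuning.

```python
# Classification recast as entailment: "does S1 entail the label description S2?"
# The NLI checkpoint and label descriptions are stand-ins, not the EFL setup.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
entail_id = model.config.label2id.get("ENTAILMENT", 2)  # index of the entailment class

label_descriptions = {
    "positive": "This indicates positive user sentiment.",
    "negative": "This indicates negative user sentiment.",
}

def classify(sentence):
    """Return the label whose description is most strongly entailed by the sentence."""
    scores = {}
    for label, description in label_descriptions.items():
        enc = tokenizer(sentence, description, return_tensors="pt")  # S1 [SEP] S2 pair input
        with torch.no_grad():
            logits = model(**enc).logits[0]
        scores[label] = logits.softmax(-1)[entail_id].item()
    return max(scores, key=scores.get)

print(classify("The battery life is fantastic and the screen is gorgeous."))
```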
CTLR@WiC-TSV: Target Sense Verification using Marked Inputs and Pre-trained Models (2021)
2021-05-13T00:29:13Z

Is Word Sense Disambiguation outdated? | by Anna Breit | May, 2021 | Medium
2021-05-13T00:27:16Z
[Refers to](doc:2021/05/ctlr_wic_tsv_target_sense_veri)

Heinrich Barth and the Western Sudan
2021-05-05T10:30:25Z

Yann LeCun on Twitter: "Barlow Twins: a new super-simple self-supervised method to train joint-embedding architectures (aka Siamese nets) non contrastively."
2021-05-09T23:49:08Z

fastai v2 cheat sheets
2021-05-10T08:19:14Z

DIRT - Discovery of inference rules from text (2001)
2021-05-13T00:56:25Z
[Cited by](doc:2021/05/1906_03158_matching_the_blank)

> unsupervised method for discovering inference rules from text, such as "X is author of Y ≈ X wrote Y", "X solved Y ≈ X found a solution to Y", and "X caused Y ≈ Y is triggered by X".
>
> Our algorithm is based on an **extended version of Harris' Distributional Hypothesis**, which states that words that occurred in the same contexts tend to be similar. Instead of using this hypothesis on words, we apply it to paths in the dependency trees of a parsed corpus.

(A toy sketch of this path-based similarity appears at the end of these entries.)

How can synaptic plasticity lead to meaningful learning?
2021-05-14T10:08:29Z

Alex Russell on Twitter: "If you install Firefox on Windows, MacOS, Linux, ChromeOS, or Android you get *real* Firefox, complete with the Gecko engine. But not on iOS. Apple cripples engine competition in silent, deeply impactful ways." / Twitter
2021-05-10T23:30:43Z
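Finally, a toy sketch of the DIRT idea quoted in the entry above: dependency paths play the role of words, their slot fillers play the role of contexts, and two paths are deemed similar when their fillers overlap. Real DIRT computes mutual-information-weighted slot similarities over a parsed corpus; the hand-written triples and the Jaccard/geometric-mean measure below are simplifying assumptions.

```python
# Simplified DIRT-style path similarity; triples and the similarity measure
# are assumptions standing in for parsed-corpus statistics and MI weighting.
from collections import defaultdict

# (dependency path, X filler, Y filler) triples, as if extracted from parses.
triples = [
    ("X wrote Y", "tolkien", "the hobbit"),
    ("X wrote Y", "austen", "emma"),
    ("X is author of Y", "tolkien", "the hobbit"),
    ("X is author of Y", "austen", "emma"),
    ("X reviewed Y", "a critic", "emma"),
]

# Collect the filler sets seen in each slot of each path.
slots = defaultdict(lambda: {"X": set(), "Y": set()})
for path, x, y in triples:
    slots[path]["X"].add(x)
    slots[path]["Y"].add(y)

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def path_similarity(p, q):
    """Geometric mean of X-slot and Y-slot filler overlap between two paths."""
    sx = jaccard(slots[p]["X"], slots[q]["X"])
    sy = jaccard(slots[p]["Y"], slots[q]["Y"])
    return (sx * sy) ** 0.5

print(path_similarity("X wrote Y", "X is author of Y"))  # shared fillers: high
print(path_similarity("X wrote Y", "X reviewed Y"))      # little overlap: low
```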