L'hydrogène tiendra-t-il ses promesses ? | CNRS Le journal (2020-12-17)

Exploitation des données, manipulation de l'opinion, culte du secret… La trahison des GAFA (2020-12-19)

> After the announcement of Palantir's departure, Stanford students celebrated their victory on Instagram, Facebook or TikTok. Then, that same evening, some of them probably watched a Netflix series, ordered a meal on Uber Eats or treated themselves to a gift on Amazon. It had to be celebrated, after all.

INITIATION AUX ÉTUDES HISTORIQUES (2020-12-30)

How to improve Elasticsearch search relevance with boolean queries | Elastic Blog (2020-12-02)

FP Servant on Twitter: "constructing a personal knowledge graph as a support for learning (and a metaphor of the learning experience)..." (2020-12-03)

> constructing a personal knowledge graph as a support for learning (and a metaphor of the learning experience). From googling, browsing wikipedia/KBs, discovering new words and concepts to organizing all of this into your own concept graph = acquiring knowledge. Semanlink: my digital twin?

google/tapas-base-finetuned-wtq · Hugging Face (2020-12-17)

> a BERT-like transformers model pretrained on a large corpus of English data from Wikipedia in a self-supervised fashion

Vaccins : « La France doit d'urgence donner à sa recherche les moyens de ses ambitions » (2020-12-17)

> The country of Pasteur saw itself as top of the class -- LOL

L'exponentielle, c'est plus fort que toi (2020-12-05)

> "Most politicians do not understand that, in this kind of dynamic, it is when you go from 2 cases to 4 cases that you have to react"

> "The handling of Covid-19 looks like that of global warming: the same procrastination by those in power in the face of certain disaster"

[2012.04584] Distilling Knowledge from Reader to Retriever for Question Answering (2020-12-11)

Gautier Izacard, Edouard Grave — arXiv:2012.04584 (2020-12-08)

> a method to train an information retrieval module for downstream tasks, **without using pairs of queries and documents as annotations**.

Uses two models (the standard pipeline for open-domain QA):

- the first one retrieves documents from a large source of knowledge (the retriever)
- the second one processes the support documents to solve the task (the reader).

> First the retriever selects support passages in a large knowledge source. Then these passages are processed by the reader, along with the question, to generate an answer.

Inspired by knowledge distillation: the reader model is the teacher and the retriever is the student.

> More precisely, we use a sequence-to-sequence model as the reader, and use the attention activations over the input documents as synthetic labels to train the retriever.

(**train the retriever by learning to approximate the attention scores of the reader**)

Refers to:

- [REALM: Retrieval-Augmented Language Model Pre-Training](doc:2020/12/2002_08909_realm_retrieval_a)
- [Dehghani: Neural Ranking Models with Weak Supervision](doc:?uri=https%3A%2F%2Farxiv.org%2Fabs%2F1704.08803)

Abstract: The task of information retrieval is an important component of many natural language processing systems, such as open domain question answering. While traditional methods were based on hand-crafted features, continuous representations based on neural networks recently obtained competitive results. A challenge of using such methods is to obtain supervised data to train the retriever model, corresponding to pairs of query and support documents. In this paper, we propose a technique to learn retriever models for downstream tasks, inspired by knowledge distillation, and which does not require annotated pairs of query and documents. Our approach leverages attention scores of a reader model, used to solve the task based on retrieved documents, to obtain synthetic labels for the retriever. We evaluate our method on question answering, obtaining state-of-the-art results.
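Below is a toy, self-contained sketch (my own illustration, not the authors' code) of the distillation signal described in this entry: a bi-encoder retriever scores passages by dot product and is trained with a KL-divergence loss to match a target relevance distribution over the retrieved passages. In the paper that target comes from the reader's cross-attention scores aggregated over layers, heads and tokens; here it is simply faked with random numbers.

```python
# Toy sketch (not the authors' code): train a bi-encoder retriever to match
# per-passage relevance scores derived from a reader's cross-attention.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim, n_passages = 32, 8

# Stand-ins for the query/passage encoders of a dense retriever.
query_encoder = torch.nn.Linear(dim, dim)
passage_encoder = torch.nn.Linear(dim, dim)
optimizer = torch.optim.Adam(
    list(query_encoder.parameters()) + list(passage_encoder.parameters()), lr=1e-3
)

# Pretend inputs: one query and its retrieved passages (random features here).
query_feats = torch.randn(1, dim)
passage_feats = torch.randn(n_passages, dim)

# Synthetic labels: in the paper, the reader's cross-attention scores over each
# input passage, aggregated across heads, layers and tokens. Faked here.
reader_attention = torch.rand(n_passages)
target = F.softmax(reader_attention, dim=-1)            # teacher distribution

for step in range(100):
    q = query_encoder(query_feats)                      # (1, dim)
    p = passage_encoder(passage_feats)                  # (n_passages, dim)
    retriever_scores = (q @ p.T).squeeze(0)             # dot-product relevance
    log_probs = F.log_softmax(retriever_scores, dim=-1) # student distribution
    loss = F.kl_div(log_probs, target, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("final KL loss:", loss.item())
```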
[2002.08909] REALM: Retrieval-Augmented Language Model Pre-Training (2020-12-12)

Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang — arXiv:2002.08909 (2020-02-10)

**Augment language model pre-training with a retriever module**, which is trained using the masked language modeling objective.

> To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. **For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner**, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents.

Hmm, #TODO: parallel to be drawn with techniques in [KG-augmented Language Models](tag:knowledge_graph_augmented_language_models), which focus "on the problem of capturing declarative knowledge in the learned parameters of a language model."

- [Google AI Blog Post](doc:2020/08/google_ai_blog_realm_integrat)
- [Summary](https://joeddav.github.io/blog/2020/03/03/REALM.html) for the [Hugging Face awesome-papers reading group](doc:2021/03/huggingface_awesome_papers_pap)

Abstract: Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA). We compare against state-of-the-art models for both explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as interpretability and modularity.
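For intuition, here is a toy illustration (not the REALM implementation) of the retrieve-then-predict structure the entry describes: p(z|x) is a softmax over inner products between a query embedding and a document index, and the prediction marginalizes over the top-k retrieved documents, p(y|x) = Σ_z p(y|x,z) p(z|x). Random vectors stand in for the learned encoders.

```python
# Toy sketch of REALM's retrieve-then-predict structure (illustration only).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim, n_docs, n_answers, k = 64, 1000, 5, 8

doc_index = torch.randn(n_docs, dim)   # dense embeddings of the corpus
query_embedding = torch.randn(dim)     # embedding of the masked sentence / question

# Retrieval: relevance score = inner product; keep the top-k documents.
scores = doc_index @ query_embedding                     # (n_docs,)
topk_scores, topk_ids = scores.topk(k)
p_z_given_x = F.softmax(topk_scores, dim=-1)             # p(z|x) over top-k docs

# Reader: for each retrieved doc, a distribution over answers.
# Random numbers stand in for the knowledge-augmented encoder's predictions.
p_y_given_xz = F.softmax(torch.randn(k, n_answers), dim=-1)

# Marginalize over the latent retrieved document.
p_y_given_x = (p_z_given_x.unsqueeze(1) * p_y_given_xz).sum(dim=0)

print("retrieved doc ids:", topk_ids.tolist())
print("p(y|x):", p_y_given_x.tolist())
```

Because p(z|x) is differentiable in the encoder parameters, the masked-language-modeling loss can be backpropagated through the retrieval step, which is how REALM trains the retriever without retrieval supervision.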
[2011.06993] FLERT: Document-Level Features for Named Entity Recognition (2020-12-01)

Stefan Schweter, Alan Akbik — arXiv:2011.06993 (2020-11-13)

> Current state-of-the-art approaches for named entity recognition (NER) using BERT-style transformers typically use one of two different approaches:
>
> 1. The first fine-tunes the transformer itself on the NER task and adds only a simple linear layer for word-level predictions.
> 2. The second uses the transformer only to provide features to a standard LSTM-CRF sequence labeling architecture and thus performs no fine-tuning.
>
> In this paper, we perform a comparative analysis of both approaches

Conclusion:

> We recommend the combination of document-level features and fine-tuning for NER.

Abstract: Current state-of-the-art approaches for named entity recognition (NER) using BERT-style transformers typically use one of two different approaches: (1) The first fine-tunes the transformer itself on the NER task and adds only a simple linear layer for word-level predictions. (2) The second uses the transformer only to provide features to a standard LSTM-CRF sequence labeling architecture and thus performs no fine-tuning. In this paper, we perform a comparative analysis of both approaches in a variety of settings currently considered in the literature. In particular, we evaluate how well they work when document-level features are leveraged. Our evaluation on the classic CoNLL benchmark datasets for 4 languages shows that document-level features significantly improve NER quality and that fine-tuning generally outperforms the feature-based approaches. We present recommendations for parameters as well as several new state-of-the-art numbers. Our approach is integrated into the Flair framework to facilitate reproduction of our experiments.

AutoPhrase: Automated Phrase Mining from Massive Text Corpora (2020-12-14)

[2009.02835] E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce (2020-12-14)

Denghui Zhang, Zixuan Yuan, Yanchi Liu, Zuohui Fu, Haifeng Chen, Pengyang Wang, Fuzhen Zhuang, Hui Xiong — arXiv:2009.02835 (2020-09-07)

E-BERT, a pre-training framework for product data:

1. to benefit from phrase-level knowledge: Adaptive Hybrid Masking, a new masking strategy, which allows the model to adaptively switch from learning preliminary word knowledge to learning complex phrases
2. to leverage product-level knowledge: training E-BERT to predict a product's associated neighbors (product association)

Resources used:

- descriptions of millions of products from the Amazon dataset (title, description, reviews)
- e-commerce phrases: extracted from the above dataset using [AutoPhrase](doc:2020/12/autophrase_automated_phrase_mi)
- product association graph: pairs of substitutable and complementary products extracted from the Amazon dataset

Not to be confused with [[1911.03681] E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT](doc:2021/01/1911_03681_e_bert_efficient_)

Abstract: Pre-trained language models such as BERT have achieved great success in a broad range of natural language processing tasks. However, BERT cannot well support E-commerce related tasks due to the lack of two levels of domain knowledge, i.e., phrase-level and product-level. On one hand, many E-commerce tasks require an accurate understanding of domain phrases, whereas such fine-grained phrase-level knowledge is not explicitly modeled by BERT's training objective. On the other hand, product-level knowledge like product associations can enhance the language modeling of E-commerce, but they are not factual knowledge thus using them indiscriminately may introduce noise. To tackle the problem, we propose a unified pre-training framework, namely, E-BERT. Specifically, to preserve phrase-level knowledge, we introduce Adaptive Hybrid Masking, which allows the model to adaptively switch from learning preliminary word knowledge to learning complex phrases, based on the fitting progress of two modes. To utilize product-level knowledge, we introduce Neighbor Product Reconstruction, which trains E-BERT to predict a product's associated neighbors with a denoising cross attention layer. Our investigation reveals promising results in four downstream tasks, i.e., review-based question answering, aspect extraction, aspect sentiment classification, and product classification.
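As a rough illustration of the phrase-level side of this idea (this is not E-BERT's Adaptive Hybrid Masking, which additionally switches adaptively between word- and phrase-level masking), the sketch below masks whole phrase units when building MLM inputs; a hand-written phrase list stands in for the e-commerce phrases the paper mines with AutoPhrase.

```python
# Toy sketch of phrase-level masking for MLM (illustration only).
import random
from transformers import AutoTokenizer

random.seed(0)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
phrases = ["battery life", "memory card"]   # assumed, illustrative phrase list

text = "great camera but the battery life is short and the memory card is slow"
words = text.split()

# Greedily group words into units: a known phrase, or a single word.
units, i = [], 0
while i < len(words):
    match = next((p.split() for p in phrases
                  if words[i:i + len(p.split())] == p.split()), None)
    step = match or [words[i]]
    units.append(" ".join(step))
    i += len(step)

# Mask whole units, so a phrase is always masked as one block of wordpieces.
tokens, labels = ["[CLS]"], [-100]
for unit in units:
    pieces = tokenizer.tokenize(unit)
    if random.random() < 0.30:                            # toy masking probability
        tokens += ["[MASK]"] * len(pieces)
        labels += tokenizer.convert_tokens_to_ids(pieces)  # predict original pieces
    else:
        tokens += pieces
        labels += [-100] * len(pieces)                     # -100 = ignored by MLM loss
tokens, labels = tokens + ["[SEP]"], labels + [-100]

input_ids = tokenizer.convert_tokens_to_ids(tokens)
print(tokens)
print(labels)
```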
Economic Sanctions as Collective Punishment: The Case of Venezuela (2020-12-22)

> This paper looks at some of the most important impacts of the economic sanctions imposed on Venezuela by the US government since August of 2017. It finds that most of the impact of these sanctions has not been on the government but on the civilian population.

Drew Tada on Twitter: "Officially launching giantgra.ph A search engine for knowledge graphs..." (2020-12-10)

Built on Wikipedia's link structure, it returns a subgraph of connected pages.

- <https://giantgra.ph>
- [Blog post](https://kcollective.substack.com/p/exploration-engines)

Exploration Engines - the koodos collective (2020-12-10)

Wikigraph uses the simplest possible algorithm to generate graphs, which is particularly good for making unexpected connection discoveries.

[2004.10964] Don't Stop Pretraining: Adapt Language Models to Domains and Tasks (2020-12-01)

Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith — arXiv:2004.10964 (2020-04-23)

> a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains, **under both high- and low-resource settings**. Moreover, **adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining**.

Abstract: Language models pretrained on text from a wide variety of sources form the foundation of today's NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target task. We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains, under both high- and low-resource settings. Moreover, adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining. Finally, we show that adapting to a task corpus augmented using simple data selection strategies is an effective alternative, especially when resources for domain-adaptive pretraining might be unavailable. Overall, we consistently find that multi-phase adaptive pretraining offers large gains in task performance.
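The domain-adaptive pretraining (DAPT) recipe is easy to reproduce at small scale with Hugging Face tooling: continue masked-language-model pretraining of an already-pretrained checkpoint on in-domain text, then fine-tune that checkpoint on the task. A minimal sketch, assuming a local file `domain_corpus.txt` with one in-domain paragraph per line (hyperparameters are placeholders, not the paper's setup):

```python
# Minimal DAPT sketch: continue MLM pretraining on in-domain text.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# One in-domain paragraph per line (assumed local file).
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True, remove_columns=["text"],
)

# Dynamic masking for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt-roberta", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
trainer.save_model("dapt-roberta")   # then fine-tune this checkpoint on the task
```

Task-adaptive pretraining (TAPT) is the same recipe run on the task's own unlabeled text.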
xkcd: Git (2020-12-19)

Le Niger appelé à élire le successeur de Mahamadou Issoufou (2020-12-27)

> **It is a fact that ought to be electoral normality, yet it sets Niger apart from the other countries of the region.** On Sunday, December 27, the outgoing president will not be a candidate for re-election to a third term at the head of the country. The Constitution did not allow it. Mahamadou Issoufou – who turns 69 on January 1 – did not try to rewrite it in order to cling to power.

GitHub - explosion/sense2vec: Contextually-keyed word vectors (2020-12-31)

Supporting content decision makers with machine learning | Dec, 2020 | Netflix TechBlog (2020-12-11)

Julie Grollier, a (bio)inspired researcher | CNRS News (2020-12-10)

Yrjänä Rankka @ghard@mastodon.social on Twitter: "Facebook must be razed to the ground..." (2020-12-13)

> Facebook must be razed to the ground leaving no stone on top of another. Spread salt on the ruins so nothing shall grow where it once was. Erase the very memory of it. Carthago delenda est

Michel Zecler au « Monde » : « Il fallait que ces trois policiers se sentent en confiance pour aller aussi loin dans leurs actes » (2020-12-12)

elvis on Twitter: "Today I kept thinking about the machine learning / NLP / deep learning related blog posts (not papers) that have been transformational for me..." (2020-12-22)

Douglas Kennedy : « A l'ère de la "cancel culture" – où un simple bon mot peut chambouler votre carrière –, surveiller ce qu'on dit en public est devenu crucial » (2020-12-26)

> "Brazilians will only be free when the last Bolsonaro is hanged with the guts of the last pastor of the Universal Church."

Digital Billboards Are Tracking You - Consumer Reports (2020-12-06)

[Tweet](https://twitter.com/thomasgermain/status/1197201725708476422)

> the out-of-home advertising business is adopting the model that runs ads on the web

> Today, the internet is altering the way we experience the physical world so it should be hackable

Do You Love Me? - Boston Dynamics video (2020-12-30)

would you like to dance with me?

Event Extraction by Answering (Almost) Natural Questions (2020-12-17)

The event extraction task formulated as a [Question Answering](tag:question_answering)/machine reading comprehension task.

> Existing work in event argument extraction typically relies heavily on entity recognition as a preprocessing/concurrent step, causing the well-known problem of error propagation. To avoid this issue, we introduce a new paradigm for event extraction by formulating it as a question answering (QA) task that extracts the event arguments in an end-to-end manner

[GitHub](https://github.com/xinyadu/eeqa)

Related to [[1902.10909] BERT for Joint Intent Classification and Slot Filling](doc:2020/01/_1902_10909_bert_for_joint_int)
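A quick way to get a feel for the "event extraction as QA" formulation is to point an off-the-shelf extractive QA model at a passage and ask one question per argument role. The question templates below are made up for illustration; the paper derives its own templates and trains dedicated models (see the linked GitHub repo).

```python
# Toy sketch: extract event arguments by asking questions with an extractive QA model.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

passage = ("Protesters gathered in front of the parliament on Tuesday, "
           "where police detained two organizers of the march.")

questions = {                      # argument role -> assumed question template
    "agent":  "Who carried out the detention?",
    "person": "Who was detained?",
    "place":  "Where did the detention take place?",
    "time":   "When did the detention take place?",
}

for role, question in questions.items():
    answer = qa(question=question, context=passage)
    print(f"{role:7s} -> {answer['answer']!r} (score {answer['score']:.2f})")
```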
'The platypuses were glowing': the secret light of Australia's marsupials | Science | The Guardian (2020-12-19)

Retour sur Terre de Chang'e-5, une sonde spatiale chinoise transportant des échantillons lunaires (2020-12-16)

TextGraphs 2020 (2020-12-13)

« Le devoir de toute société humaine est de se protéger contre les déviances de ceux qui détruisent la planète » (2020-12-14)

> **Business and political leaders who, faced with the climate crisis, make decisions that run counter to the general interest must be held accountable**, argue investor Bertrand Badré, writer Erik Orsenna, and psychiatrist and entrepreneur Bertrand Piccard.

Un même gène a permis « d'inventer » l'hémoglobine plusieurs fois | CNRS (2020-12-30)

Pablo Castro on Twitter: "Knowledge mining using the knowledge store feature of #AzureSearch" (2020-12-19)

Domain-Specific BERT Models · Chris McCormick (2020-12-01)

Chances are you won't be able to pre-train BERT on your own dataset, for the following reasons:

1. Pre-training BERT requires a huge corpus
2. Huge Model + Huge Corpus = Lots of GPUs

Salmon Run: Word Sense Disambiguation using BERT as a Language Model (2020-12-01)

Keyword Extraction with BERT | Towards Data Science (2020-12-06)

A minimal method for extracting keywords and keyphrases. [GitHub](https://github.com/MaartenGr/KeyBERT/)

> uses BERT-embeddings and simple cosine similarity to find the sub-phrases in a document that are the most similar to the document itself.

(a from-scratch sketch of this idea appears at the end of these notes)

Serranía de La Lindosa (2020-12-28)

"the Sistine Chapel of the Amazon"

Knowledge Base Embedding By Cooperative Knowledge Distillation - ACL Anthology (2020-12-05)

Google AI Blog: Reformer: The Efficient Transformer (2020-12-09)

Katalin Kariko (2020-12-16)

Pablo Castro on Twitter: "Random finding of the day for word embeddings: vec("apple")-vec("apples") yields a vector close to ipad, ipod, etc. (apples removes the 'fruitness' from apple)" (2020-12-18)

La sonde japonaise Hayabusa-2 a rapporté des échantillons d'astéroïde sur Terre (2020-12-06)

Understanding Graph Embeddings | by Dan McCreary | Nov, 2020 | Medium (2020-12-06)

> Graph embeddings are data structures used for fast-comparison of similar data structures

Entre colère et culpabilité, ces Français qui renoncent à manifester par peur des violences (2020-12-19)

> "You should have just stayed home!" the police replied to Cécile, who was trying to get out of a police kettle.

pemistahl/lingua: natural language detection library for Java suitable for long and short text alike (2020-12-12)
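To make the "Keyword Extraction with BERT" entry above concrete, here is a from-scratch sketch of the same idea: embed the document and candidate n-grams with the same sentence-embedding model, then keep the candidates most cosine-similar to the document. The model name and n-gram range are arbitrary choices rather than KeyBERT's defaults, and `get_feature_names_out` assumes a recent scikit-learn; for real use, the KeyBERT package wraps all of this.

```python
# From-scratch sketch of KeyBERT-style keyword extraction (illustration only).
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

doc = ("Supervised learning is the machine learning task of learning a function "
       "that maps an input to an output based on example input-output pairs.")

# Candidate keyphrases: uni- and bi-grams taken from the document itself.
vectorizer = CountVectorizer(ngram_range=(1, 2), stop_words="english").fit([doc])
candidates = vectorizer.get_feature_names_out()

# Embed document and candidates with the same model, then rank by cosine similarity.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_embedding = model.encode([doc])
candidate_embeddings = model.encode(list(candidates))

similarities = cosine_similarity(candidate_embeddings, doc_embedding).ravel()
top = similarities.argsort()[-5:][::-1]
print([(candidates[i], round(float(similarities[i]), 3)) for i in top])
```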