Creating Interactive Timelines with JavaScript | by Shachee Swadia | Nightingale | Medium
2021-09-05T03:21:04Z

[2106.04647] Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder (arXiv 2106.04647, submitted 2021-06-08; bookmarked 2021-09-29T02:05:29Z)

Adapting large-scale pretrained language models to downstream tasks via fine-tuning is the standard method for achieving state-of-the-art performance on NLP benchmarks. However, fine-tuning all weights of models with millions or billions of parameters is sample-inefficient, unstable in low-resource settings, and wasteful, as it requires storing a separate copy of the model for each task. Recent work has developed parameter-efficient fine-tuning methods, but these approaches either still require a relatively large number of parameters or underperform standard fine-tuning. In this work, we propose Compacter, a method for fine-tuning large-scale language models with a better trade-off between task performance and the number of trainable parameters than prior work. Compacter accomplishes this by building on ideas from adapters, low-rank optimization, and parameterized hypercomplex multiplication layers. Specifically, Compacter inserts task-specific weight matrices into a pretrained model's weights, which are computed efficiently as a sum of Kronecker products between shared "slow" weights and "fast" rank-one matrices defined per Compacter layer. By training only 0.047% of a pretrained model's parameters, Compacter performs on par with standard fine-tuning on GLUE and outperforms fine-tuning in low-resource settings. Our code is publicly available at https://github.com/rabeehk/compacter/

> Compacter (Compact Adapter) layers, a method to adapt large-scale language models, which trains only around 0.05% of a model's parameters and performs on par with fine-tuning. [twitter](https://twitter.com/KarimiRabeeh/status/1404774464441794560)
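The "sum of Kronecker products between shared slow weights and fast rank-one matrices" in the abstract can be made concrete in a few lines of PyTorch. The sketch below is illustrative only (shapes and names are mine, not the code from the linked repository); it shows how a full adapter projection is assembled from a handful of small trainable factors:

```python
import torch

def compacter_weight(A, s, t):
    """Build an adapter weight as a sum of Kronecker products.

    A: (n, k, k)   shared "slow" matrices (reused across adapter layers)
    s: (n, d // k) per-layer "fast" factors
    t: (n, m // k) per-layer "fast" factors
    Returns a (d, m) weight without ever storing d*m free parameters.
    """
    n = A.shape[0]
    W = 0
    for i in range(n):
        B_i = torch.outer(s[i], t[i])      # rank-one "fast" matrix
        W = W + torch.kron(A[i], B_i)      # Kronecker product -> (d, m) block
    return W

# toy shapes: down-projection of a 768-dim hidden state to a 48-dim bottleneck
n, k, d, m = 4, 4, 768, 48
A = torch.randn(n, k, k)                   # shared across all adapter layers
s = torch.randn(n, d // k)
t = torch.randn(n, m // k)
W_down = compacter_weight(A, s, t)
print(W_down.shape)                        # torch.Size([768, 48])
```

The trainable parameter count is n*k*k (shared) plus n*(d/k + m/k) per layer, which is how the method reaches the ~0.05% figure quoted above.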
www.ingall-niger.org
2021-09-05T17:30:02Z
A history of the Ighazer and of its capital, the small town of In Gall, seat of the Cure Salée, the largest transhumance gathering in West Africa.

Has AI found a new Foundation?
2021-09-12T23:23:12Z
> The report says, unironically, "we do not fully understand the nature or quality of the foundation that foundation models provide", but then why grandiosely call them foundation models at all?

[2104.06979] TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning
Kexin Wang, Nils Reimers, Iryna Gurevych (arXiv 2104.06979, submitted 2021-04-14; bookmarked 2021-09-01)

Learning sentence embeddings often requires a large amount of labeled data. However, for most tasks and domains, labeled data is seldom available and creating it is expensive. In this work, we present a new state-of-the-art unsupervised method based on pre-trained Transformers and Sequential Denoising Auto-Encoder (TSDAE) which outperforms previous approaches by up to 6.4 points. It can achieve up to 93.1% of the performance of in-domain supervised approaches. Further, we show that TSDAE is a strong domain adaptation and pre-training method for sentence embeddings, significantly outperforming other approaches like Masked Language Model. A crucial shortcoming of previous studies is their narrow evaluation: most work evaluates mainly on the single task of Semantic Textual Similarity (STS), which does not require any domain knowledge, so it is unclear whether these methods generalize to other domains and tasks. We fill this gap and evaluate TSDAE and other recent approaches on four different datasets from heterogeneous domains.

> The most successful previous approaches like InferSent (Conneau et al., 2017), Universal Sentence Encoder (USE) (Cer et al., 2018) and SBERT (Reimers and Gurevych, 2019) heavily relied on labeled data to train sentence embedding models.
>
> TSDAE can achieve up to 93.1% of the performance of in-domain supervised approaches. Further, we show that TSDAE is **a strong domain adaptation and pre-training method for sentence embeddings**, significantly outperforming other approaches like Masked Language Model.
>
> During training, TSDAE encodes corrupted sentences into fixed-sized vectors and requires the decoder to reconstruct the original sentences from this sentence embedding.

- <https://www.sbert.net/examples/unsupervised_learning/TSDAE/README.html>
- [github](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/TSDAE)
- [UKPLab/sentence-transformers: Sentence Embeddings with BERT & XLNet](doc:2020/07/ukplab_sentence_transformers_s)
- [twitter](https://twitter.com/KexinWang2049/status/1433361957579538432):

> **TSDAE can learn domain-specific sentence embeddings with unlabeled sentences.**
>
> Most importantly, instead of STS (Semantic Textual Similarity), **we suggest evaluating unsupervised sentence embeddings on domain-specific tasks and datasets, which is the real use case for them**. In fact, STS scores do not correlate with performance on specific tasks.
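A minimal training sketch, assuming the TSDAE utilities shipped with sentence-transformers as described in the linked README (the placeholder sentences and base model are mine, not from the paper):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses
from sentence_transformers.datasets import DenoisingAutoEncoderDataset

# unlabeled, in-domain sentences; in practice you would load thousands of them
train_sentences = [
    "An unlabeled sentence from the target domain.",
    "Another sentence, no labels needed.",
    "TSDAE only requires raw text like this.",
]

# encoder: pretrained transformer + CLS pooling to get one fixed-size vector
word_embedding_model = models.Transformer("bert-base-uncased")
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), "cls")
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# the dataset corrupts each sentence (token deletion); the loss attaches a decoder
# to the sentence embedding and asks it to reconstruct the original sentence
train_dataset = DenoisingAutoEncoderDataset(train_sentences)
train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True)
train_loss = losses.DenoisingAutoEncoderLoss(model, tie_encoder_decoder=True)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, show_progress_bar=True)
model.save("output/tsdae-model")
```

The corruption plus reconstruction is what forces the pooled sentence vector to carry enough information to rebuild the original sentence, which is why no labels are needed.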
Modeling AI on the Language of Brain Circuits and Architecture | Wu Tsai Neurosciences Institute
2021-09-12T23:47:24Z

Nils Reimers on Twitter: "Introduction - Neural Search"
2021-09-20T16:25:18Z

Link Prediction with Graph Neural Networks and Knowledge Extraction
2021-09-03T01:36:16Z
> Many GNN layers can be applied to the link prediction task directly. But due to limitations of the graph structure and of graph neural networks, the performance of neural link prediction is sometimes negatively affected. To address these issues, we propose a novel approach that implicitly guides the GNN with extracted knowledge.

Contextualized Topic Models
2021-09-20T23:12:26Z
> a family of topic models that use pre-trained representations of language (e.g., BERT) to support topic modeling.

Structuring Your Project — The Hitchhiker's Guide to Python
2021-09-23T12:58:28Z

Paleo Data Search | Search | National Centers for Environmental Information (NCEI)
2021-09-26T13:00:21Z

(((ل()(ل() 'yoav))))👾 on Twitter: "Text-based NP Enrichment"
2021-09-28T08:17:14Z
New NLP task: for every pair of base NPs (noun phrases) in a text, decide whether they can be related by a preposition and, if so, by which one.

Dosso - TOUBAL N 06
2021-09-17T14:08:09Z

Cory Doctorow on Twitter: "#Facebook is a rotten company, rotten from the top down, its founder, board and top execs are sociopaths..."
2021-09-22T23:53:51Z

NMT Training through the Lens of SMT
2021-09-07T00:53:42Z
[twitter](https://twitter.com/lena_voita/status/1434891467600941056)

The discovery of 23,000-year-old human footprints rewrites the history of the peopling of the Americas
2021-09-24T12:28:21Z

"Degrowth is neither a program nor even a theory, but an aspiration"
2021-09-26T16:04:25Z

Haystack (deepset)
2021-09-20T17:03:13Z
[deepset](doc:2021/09/nlp_solutions_to_streamline_neu)
> Haystack is an **open-source framework** for building search systems that work intelligently over large document collections. Recent advances in NLP have enabled the application of question answering, retrieval and summarization to real-world settings, and Haystack is designed to be the bridge between research and industry.

Build NLP features into your product | deepset
2021-09-20T17:00:13Z

[2010.12566] DICT-MLM: Improved Multilingual Pre-Training using Bilingual Dictionaries
Aditi Chaudhary, Karthik Raman, Krishna Srinivasan, Jiecao Chen (arXiv 2010.12566, submitted 2020-10-23; bookmarked 2021-09-06T18:27:44Z)

Pre-trained multilingual language models such as mBERT have shown immense gains for several natural language processing (NLP) tasks, especially in the zero-shot cross-lingual setting. Most, if not all, of these pre-trained models rely on the masked-language modeling (MLM) objective as the key language learning objective. The principle behind these approaches is that predicting the masked words with the help of the surrounding text helps learn potent contextualized representations. Despite the strong representation learning capability enabled by MLM, we demonstrate an inherent limitation of MLM for multilingual representation learning. In particular, by requiring the model to predict the language-specific token, the MLM objective disincentivizes learning a language-agnostic representation -- which is a key goal of multilingual pre-training. Therefore, to encourage better cross-lingual representation learning, we propose the DICT-MLM method. DICT-MLM works by incentivizing the model to predict not just the original masked word, but potentially any of its cross-lingual synonyms as well. Our empirical analysis on multiple downstream tasks spanning 30+ languages demonstrates the efficacy of the proposed approach and its ability to learn better multilingual representations.

> Despite the strong representation learning capability enabled by MLM, we demonstrate an inherent limitation of MLM for multilingual representation learning. In particular, by requiring the model to predict the language-specific token, the MLM objective disincentivizes learning a language-agnostic representation -- which is a key goal of multilingual pre-training.
>
> DICT-MLM works by incentivizing the model to predict not just the original masked word, but potentially any of its cross-lingual synonyms as well.
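A conceptual sketch of that multi-target masked-LM objective (not the authors' implementation; the paper's exact weighting of synonyms may differ). Each masked position accepts the original token or any of its dictionary translations, so the model gets credit for putting probability mass on any acceptable cross-lingual synonym:

```python
import torch
import torch.nn.functional as F

def dict_mlm_loss(logits, target_sets):
    """Masked-LM loss where each masked position may be filled by the original
    token *or* any of its cross-lingual synonyms (illustrative sketch).

    logits:      (num_masked, vocab_size) scores at the masked positions
    target_sets: list of lists of acceptable token ids per masked position,
                 e.g. obtained by looking up the original word in a bilingual dictionary
    """
    log_probs = F.log_softmax(logits, dim=-1)
    losses = []
    for i, targets in enumerate(target_sets):
        ids = torch.tensor(targets)
        # reward probability mass placed on any acceptable translation
        losses.append(-torch.logsumexp(log_probs[i, ids], dim=0))
    return torch.stack(losses).mean()

# toy example: 2 masked positions over a vocab of 10; position 0 accepts tokens {3, 7}
logits = torch.randn(2, 10)
print(dict_mlm_loss(logits, [[3, 7], [5]]))
```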
A Gentle Introduction to Graph Neural Networks
2021-09-03T01:09:03Z

"The environmental issue is now at the heart of a rupture of the democratic pact"
2021-09-19T10:20:13Z
> The time has come... for a "fundamental change". Fair enough, France replies in substance... but we nevertheless wish to be able to keep breaking the law so that a fraction of a percent of our fellow citizens can indulge in the pleasure of killing tens of thousands of birds whose populations are in decline.

stanfordnlp/stanza: Official Stanford NLP Python Library for Many Human Languages
2021-09-20T16:54:01Z

Song of Lawino
2021-09-25T16:52:04Z

princeton-nlp/DensePhrases
2021-09-30T14:52:17Z
> DensePhrases is a text retrieval model that can return phrases, sentences, passages, or documents for your natural language inputs. Using billions of dense phrase vectors from the entire Wikipedia, DensePhrases searches phrase-level answers to your questions in real time or retrieves passages for downstream tasks.

cf.:
- ACL 2021: Learning Dense Representations of Phrases at Scale
- EMNLP 2021: [Phrase Retrieval Learns Passage Retrieval, Too](doc:2021/09/2109_08133_phrase_retrieval_l)

[2109.08133] Phrase Retrieval Learns Passage Retrieval, Too
Jinhyuk Lee, Alexander Wettig, Danqi Chen (arXiv 2109.08133, submitted 2021-09-16; bookmarked 2021-09-30T14:50:09Z)
[Github](doc:2021/09/princeton_nlp_densephrases_acl)

Dense retrieval methods have shown great promise over sparse retrieval methods in a range of NLP problems. Among them, dense phrase retrieval, the most fine-grained retrieval unit, is appealing because phrases can be directly used as the output for question answering and slot filling tasks. In this work, we follow the intuition that retrieving phrases naturally entails retrieving larger text blocks and study whether phrase retrieval can serve as the basis for coarse-level retrieval, including passages and documents. We first observe that a dense phrase-retrieval system, without any retraining, already achieves better passage retrieval accuracy (+3-5% in top-5 accuracy) compared to passage retrievers, which also helps achieve superior end-to-end QA performance with fewer passages. Then, we provide an interpretation for why phrase-level supervision helps learn better fine-grained entailment compared to passage-level supervision, and also show that phrase retrieval can be improved to achieve competitive performance in document-retrieval tasks such as entity linking and knowledge-grounded dialogue. Finally, we demonstrate how phrase filtering and vector quantization can reduce the size of our index by 4-10x, making dense phrase retrieval a practical and versatile solution in multi-granularity retrieval.

[tweet](https://twitter.com/leejnhk/status/1441445536515584004)
> Do we always need sentence vectors for sentence retrieval and passage vectors for passage retrieval? Our EMNLP 2021 paper suggests that phrase vectors can serve as a basic building block for "multi-granularity" retrieval!
>
> Phrases can be directly used as the output for question answering and slot filling tasks.
>
> the **intuition that retrieving phrases naturally entails retrieving larger text blocks**
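A toy sketch of the idea that phrase retrieval subsumes passage retrieval: score every phrase against the query, then rank passages by the best-scoring phrase they contain. The paper's actual aggregation and indexing (MIPS over billions of vectors, filtering, quantization) differ; names and data below are hypothetical:

```python
import numpy as np

def retrieve_passages(query_vec, phrase_vecs, phrase_to_passage, top_k=5):
    """Rank passages using only phrase-level vectors (conceptual sketch).

    query_vec:         (d,) dense query embedding
    phrase_vecs:       (num_phrases, d) dense phrase embeddings
    phrase_to_passage: (num_phrases,) id of the passage each phrase comes from
    A passage scores as the best score of any phrase it contains, so a phrase
    index doubles as a passage retriever without a separate passage encoder.
    """
    phrase_scores = phrase_vecs @ query_vec                  # inner-product search
    passage_scores = {}
    for score, pid in zip(phrase_scores, phrase_to_passage):
        passage_scores[pid] = max(score, passage_scores.get(pid, -np.inf))
    ranked = sorted(passage_scores, key=passage_scores.get, reverse=True)
    return ranked[:top_k]

# toy data: 6 phrases spread over 3 passages, 4-dim embeddings (illustrative only)
rng = np.random.default_rng(0)
phrase_vecs = rng.normal(size=(6, 4))
phrase_to_passage = np.array([0, 0, 1, 1, 2, 2])
print(retrieve_passages(rng.normal(size=4), phrase_vecs, phrase_to_passage, top_k=2))
```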
[2109.04513] Filling the Gaps in Ancient Akkadian Texts: A Masked Language Modelling Approach
2021-09-23T10:56:10Z
[tweet](doc:2021/09/koren_lazar_sur_twitter_m)
> Akkadian language, the lingua franca of the time.
>
> despite data scarcity (1M tokens) we can achieve state-of-the-art performance on missing-token prediction (89% hit@5) using a greedy decoding scheme and **pretraining on data from other languages and different time periods**.

Koren Lazar on Twitter: "...Modern pre-trained language models are applicable even in extreme low-resource settings, as in the case of the ancient Akkadian language."
2021-09-23T10:42:17Z
[[2109.04513] Filling the Gaps in Ancient Akkadian Texts: A Masked Language Modelling Approach](doc:2021/09/2109_04513_filling_the_gaps_i)

Building a sentence embedding index with fastText and BM25 | by David Mezzetti | Towards Data Science
2021-09-30T14:39:57Z
> This [article](https://towardsdatascience.com/building-a-sentence-embedding-index-with-fasttext-and-bm25-f07e7148d240) covers sentence embeddings and how codequestion built **a fastText + BM25 embeddings search**. Source code can be found on GitHub.

Same people as [neuml/txtai: Build AI-powered semantic search applications](doc:2021/09/neuml_txtai_build_ai_powered_s). A rough sketch of the fastText + BM25 weighting idea follows at the end of this section.

neuml/txtai: Build AI-powered semantic search applications
2021-09-30T14:45:22Z
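As I read the codequestion article bookmarked above, sentence embeddings are built by averaging fastText word vectors weighted by BM25 term weights. A rough sketch of that idea with hypothetical lookups (`word_vectors`, `idf`), not the article's actual source code:

```python
import numpy as np

def bm25_weight(tf, idf, doc_len, avg_len, k1=1.2, b=0.75):
    """Standard BM25 term weight for a term with frequency tf in a document."""
    return idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * doc_len / avg_len))

def sentence_embedding(tokens, word_vectors, idf, avg_len):
    """BM25-weighted average of fastText word vectors (illustrative sketch;
    word_vectors and idf are assumed to be precomputed lookups)."""
    weights, vectors = [], []
    for tok in set(tokens):
        if tok in word_vectors:
            w = bm25_weight(tokens.count(tok), idf.get(tok, 1.0), len(tokens), avg_len)
            weights.append(w)
            vectors.append(word_vectors[tok])
    if not vectors:
        return None
    weights = np.asarray(weights)[:, None]
    return (weights * np.asarray(vectors)).sum(axis=0) / weights.sum()

# toy lookups; in practice these come from fastText vectors and corpus IDF statistics
word_vectors = {"search": np.ones(3), "index": np.full(3, 2.0)}
idf = {"search": 1.5, "index": 2.0}
print(sentence_embedding(["search", "the", "index"], word_vectors, idf, avg_len=10))
```

The BM25 weighting downplays frequent, uninformative tokens, which is what keeps a plain word-vector average from being dominated by stopwords.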