Top 6 Open Source Pretrained Models for Text Classification you should use (2020-10-11)

Liebreich: Separating Hype from Hydrogen – Part Two: The Demand Side | BloombergNEF (2020-10-16)
> Part II of my hydrogen deep dive: the demand side. TLDR: Hydrogen will play a vital role as chemical feedstock, including for shipping and aviation fuels, and as guarantor of resilience in a renewables-based power system. EVERYTHING else goes electric.

Guillaume Lample on Twitter: "Last year, we showed that you can outperform a 24-layer transformer in language modeling with just..." (2020-10-10)
[This](doc:2019/07/_1907_05242_large_memory_layer) was last year.

Wikifier (2020-10-11)
> Semantic Annotation Service for 100 Languages

Quand l'histoire fait dates - arte.tv (2020-10-24)

html - JavaScript network visualization? - Stack Overflow (2020-10-18)

Top Trends of Graph Machine Learning in 2020 | by Sergei Ivanov | Towards Data Science (2020-10-15)

NASA’s OSIRIS-REx Spacecraft Collects Significant Amount of Asteroid | NASA (2020-10-26)

Le prix Nobel de chimie décerné à la Française Emmanuelle Charpentier et l’Américaine Jennifer Doudna pour les « ciseaux moléculaires » (2020-10-07)

[1904.09078] EmbraceNet: A robust deep learning architecture for multimodal classification (arXiv 1904.09078, 2019-04-19; bookmarked 2020-10-14)
Authors: Jun-Ho Choi, Jong-Seok Lee
Classification using multimodal data arises in many machine learning applications. It is crucial not only to model cross-modal relationship effectively but also to ensure robustness against loss of part of data or modalities. In this paper, we propose a novel deep learning-based multimodal fusion architecture for classification tasks, which guarantees compatibility with any kind of learning models, deals with cross-modal information carefully, and prevents performance degradation due to partial absence of data. We employ two datasets for multimodal classification tasks, build models based on our architecture and other state-of-the-art models, and analyze their performance on various situations. The results show that our architecture outperforms the other multimodal fusion architectures when some parts of data are not available.

Will a Half-Step by Macron Be Enough to Blunt France’s Second Wave? - The New York Times (2020-10-15)

TAGME: on-the-fly annotation of short text fragments! (2020-10-11)
> TAGME is a powerful tool that is able to identify on-the-fly meaningful short-phrases (called "spots") in an unstructured text and link them to a pertinent Wikipedia page in a fast and effective way.

Which flavor of BERT should you use for your QA task? | by Olesya Bondarenko | Towards Data Science (2020-10-04)
A guide to choosing and benchmarking BERT models for question answering.

« La dispersion des graines a permis à Dame Nature de parfaire ses qualités d’ingénieur aéronautique » (2020-10-15)

Towards Unsupervised Text Classification Leveraging Experts and Word Embeddings (ACL 2019) (2020-10-05)
Unsupervised approach to classify documents into categories simply described by a label.
> The proposed method... draws on textual similarity between the most relevant words in each document and a dictionary of keywords for each category reflecting its semantics and lexical field. The novelty of our method hinges on the enrichment of the category labels through a combination of human expertise and language models, both generic and domain specific.

> models the task as a **text similarity problem between two sets of words: one containing the most relevant words in the document and another containing keywords derived from the label of the target category**. While the key advantage of this approach is its simplicity, its success hinges on the good definition of a dictionary of words for each category.
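To make the idea concrete, a minimal sketch of this similarity-based assignment, assuming a pre-trained word-embedding lookup `emb` (e.g. loaded from fastText or GloVe) and illustrative keyword dictionaries; this is not the paper's exact label-enrichment procedure:

```python
# Assign each document to the category whose keyword dictionary is most similar
# to the document's most relevant words (both sides represented as mean vectors).
import numpy as np

def mean_vector(words, emb):
    # assumes at least one word is in the embedding vocabulary
    vecs = [emb[w] for w in words if w in emb]
    return np.mean(vecs, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify(doc_keywords, category_keywords, emb):
    d = mean_vector(doc_keywords, emb)
    scores = {label: cosine(d, mean_vector(words, emb))
              for label, words in category_keywords.items()}
    return max(scores, key=scores.get), scores

# Hypothetical usage: doc_keywords could come from tf-idf; each category dictionary
# combines the label itself with expert- and language-model-suggested terms.
# label, scores = classify(
#     ["match", "goal", "referee"],
#     {"sports": ["sport", "game", "team"],
#      "politics": ["government", "election", "minister"]},
#     emb)
```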
Unsupervised text classification with word embeddings - Max Halford (2020-10-05)
Title was "Classifying documents without any training data". Mentions this [paper](doc:2020/10/towards_unsupervised_text_class).

En Californie, la « gig economy » soumise à référendum (2020-10-14)

La sonde américaine Osiris-Rex a réussi sa manœuvre sur l’astéroïde Bénou (2020-10-21)

Construire un bon analyzer français pour Elasticsearch (2020-10-23)

Clarifying exceptions and visualizing tensor operations in deep learning code (2020-10-07)

A Trick Of The Light - Villagers | Berlin Live – ARTE Concert (2019) (2020-10-10)

[2001.08053] Contextualized Embeddings in Named-Entity Recognition: An Empirical Study on Generalization (arXiv 2001.08053, 2020-01-22; bookmarked 2020-10-01)
Authors: Bruno Taillé, Vincent Guigue, Patrick Gallinari
Contextualized embeddings use unsupervised language model pretraining to compute word representations depending on their context. This is intuitively useful for generalization, especially in Named-Entity Recognition where it is crucial to detect mentions never seen during training. However, standard English benchmarks overestimate the importance of lexical over contextual features because of an unrealistic lexical overlap between train and test mentions. In this paper, we perform an empirical analysis of the generalization capabilities of state-of-the-art contextualized embeddings by separating mentions by novelty and with out-of-domain evaluation. We show that they are particularly beneficial for unseen mentions detection, especially out-of-domain. For models trained on CoNLL03, language model contextualization leads to a +1.2% maximal relative micro-F1 score increase in-domain against +13% out-of-domain on the WNUT dataset.
> In this paper, we quantify the impact of ELMo, Flair and BERT representations on generalization to unseen mentions and new domains in NER.

Representation learning of knowledge graphs with entity descriptions (AAAI 2016) (2020-10-02)
"Description-Embodied Knowledge Representation Learning" (DKRL)
> In most knowledge graphs there are usually concise descriptions for entities, which cannot be well utilized by existing methods... Experimental results on real-world datasets show that, our method outperforms other baselines on the knowledge graph completion and entity classification tasks, especially under the zero-shot setting, which indicates that **our method is capable of building representations for novel entities according to their descriptions**.

[Source code on github](https://github.com/xrb92/DKRL). For fact triples: TransE. Meanwhile, given an entity, the model also learns to maximize the likelihood of predicting its description (using either a CBOW or a CNN encoder); head + relation = tail also holds in "text space". Two types of representations for entities: structure-based representations and description-based representations. They are learned simultaneously into the same vector space but not forced to be unified, **so that novel entities with only descriptions can be represented**.
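A rough sketch of combining structure-based and description-based entity representations under a TransE-style energy, assuming a CBOW description encoder; dimensions, the margin-based loss and negative sampling are omitted, and the names are illustrative rather than DKRL's actual code:

```python
import torch

def cbow_encode(description_ids, word_emb):
    # Description-based entity representation: mean of the description's word embeddings.
    return word_emb(description_ids).mean(dim=0)

def transe_energy(h, r, t):
    # TransE: a true triple should have head + relation close to tail.
    return torch.norm(h + r - t, p=2)

def joint_energy(h_s, t_s, h_d, t_d, r):
    # Sum the energy over all combinations of structure-based (s) and
    # description-based (d) representations, so either kind can stand in
    # for an entity (which is what allows representing unseen entities
    # from their descriptions alone).
    return (transe_energy(h_s, r, t_s) + transe_energy(h_d, r, t_d)
            + transe_energy(h_s, r, t_d) + transe_energy(h_d, r, t_s))

# e.g. word_emb = torch.nn.Embedding(vocab_size, dim); entity and relation vectors
# come from separate embedding tables, trained with a margin-based ranking loss.
```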
SSP: Semantic Space Projection for Knowledge Graph Embedding with Text Descriptions (AAAI 2017) (2020-10-02)
> jointly learns from the symbolic triples and textual descriptions

> The data involved in our model are the knowledge triples and the textual descriptions of entities. In experiments, we adopt the “entity descriptions” of Freebase and the textual definitions of Wordnet as textual information.

Obvious but very good remark about link prediction in a facts-only KG:
> the triple (Anna Roosevelt, Parents, Franklin Roosevelt), indicates “Franklin Roosevelt” is the parent of “Anna Roosevelt”. However, it’s quite difficult to infer this fact merely from other symbolic triples.

Philippe Aghion — Wikipédia (2020-10-14)

anvaka/VivaGraphJS: Graph drawing library for JavaScript (2020-10-18)

How to extract text from PDF files - dida Machine Learning (2020-10-05)

Graph visualization library in JavaScript - Stack Overflow (2020-10-17)

Chris Mungall on Twitter: "Reading: OWL2Vec*: Embedding of OWL Ontologies" (2020-10-07)

[1911.11506] Word-Class Embeddings for Multiclass Text Classification (arXiv 1911.11506, 2019-11-26; bookmarked 2020-10-11)
Authors: Alejandro Moreo, Andrea Esuli, Fabrizio Sebastiani
Pre-trained word embeddings encode general word semantics and lexical regularities of natural language, and have proven useful across many NLP tasks, including word sense disambiguation, machine translation, and sentiment analysis, to name a few. In supervised tasks such as multiclass text classification (the focus of this article) it seems appealing to enhance word representations with ad-hoc embeddings that encode task-specific information. We propose (supervised) word-class embeddings (WCEs), and show that, when concatenated to (unsupervised) pre-trained word embeddings, they substantially facilitate the training of deep-learning models in multiclass classification by topic. We show empirical evidence that WCEs yield a consistent improvement in multiclass classification accuracy, using four popular neural architectures and six widely used and publicly available datasets for multiclass text classification. Our code that implements WCEs is publicly available at https://github.com/AlexMoreo/word-class-embeddings
> In supervised tasks such as multiclass text classification (the focus of this article) it seems appealing to enhance word representations with ad-hoc embeddings that encode task-specific information. We propose (supervised) word-class embeddings (WCEs), and show that, when concatenated to (unsupervised) pre-trained word embeddings, they substantially facilitate the training of deep-learning models in multiclass classification by topic.
>
> A differentiating aspect of our method is that it keeps the modelling of word-class interactions separate from the original word embedding. Word-class correlations are confined in a dedicated vector space, whose vectors enhance (by concatenation) the unsupervised representations. The net effect is an embedding matrix that is better suited to classification, and imposes no restriction to the network architecture using it.

[github](https://github.com/AlexMoreo/word-class-embeddings). Refers to [LEAM](doc:2020/02/joint_embedding_of_words_and_la):
> [in LEAM] Once words and labels are embedded in a common vector space, word-label compatibility is measured via cosine similarity. Our method instead models these compatibilities directly, without generating intermediate embeddings for words or labels.
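A small sketch of building word-class embeddings, assuming one simple choice of word-class correlation (normalized class-conditional counts; the paper studies several measures). Here `docs` are tokenized documents, `labels` integer class ids, and `vocab` a token-to-index dict:

```python
import numpy as np

def word_class_embeddings(docs, labels, vocab, n_classes):
    # One row per word, one column per class: how often the word occurs in each class.
    wce = np.zeros((len(vocab), n_classes))
    for tokens, y in zip(docs, labels):
        for tok in tokens:
            if tok in vocab:
                wce[vocab[tok], y] += 1.0
    # Normalize each row so it reflects the word's distribution over classes.
    row_sums = wce.sum(axis=1, keepdims=True)
    return np.divide(wce, row_sums, out=np.zeros_like(wce), where=row_sums > 0)

# Concatenate with the pre-trained embedding matrix to get the enriched matrix
# used to initialize the network's embedding layer (no architectural change needed):
# enriched = np.hstack([pretrained_matrix, word_class_embeddings(docs, labels, vocab, n_classes)])
```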
Site archéologique de Bura - UNESCO World Heritage Centre (2020-10-08)

[2010.00402] From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering (arXiv 2010.00402, 2020-10-01; bookmarked 2020-10-03)
Authors: Ines Chami, Albert Gu, Vaggos Chatziafratis, Christopher Ré
Similarity-based Hierarchical Clustering (HC) is a classical unsupervised machine learning algorithm that has traditionally been solved with heuristic algorithms like Average-Linkage. Recently, Dasgupta reframed HC as a discrete optimization problem by introducing a global cost function measuring the quality of a given tree. In this work, we provide the first continuous relaxation of Dasgupta's discrete optimization problem with provable quality guarantees. The key idea of our method, HypHC, is showing a direct correspondence from discrete trees to continuous representations (via the hyperbolic embeddings of their leaf nodes) and back (via a decoding algorithm that maps leaf embeddings to a dendrogram), allowing us to search the space of discrete binary trees with continuous optimization. Building on analogies between trees and hyperbolic space, we derive a continuous analogue for the notion of lowest common ancestor, which leads to a continuous relaxation of Dasgupta's discrete objective. We can show that after decoding, the global minimizer of our continuous relaxation yields a discrete tree with a (1 + epsilon)-factor approximation for Dasgupta's optimal tree, where epsilon can be made arbitrarily small and controls optimization challenges. We experimentally evaluate HypHC on a variety of HC benchmarks and find that even approximate solutions found with gradient descent have superior clustering quality than agglomerative heuristics or other gradient based algorithms. Finally, we highlight the flexibility of HypHC using end-to-end training in a downstream classification task.
> The key idea of our method, HypHC, is showing a direct correspondence from discrete trees to continuous representations (via the hyperbolic embeddings of their leaf nodes) and back (via a decoding algorithm that maps leaf embeddings to a dendrogram), **allowing us to search the space of discrete binary trees with continuous optimization**.

Cites [Dasgupta: A cost function for similarity-based hierarchical clustering](https://arxiv.org/abs/1510.05043).
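For reference, a minimal implementation of Dasgupta's discrete cost (the objective HypHC relaxes), with binary trees given as nested tuples of leaf indices; this is just the textbook definition, not HypHC's hyperbolic relaxation:

```python
def leaves(tree):
    return [tree] if isinstance(tree, int) else leaves(tree[0]) + leaves(tree[1])

def dasgupta_cost(tree, w):
    """Sum over leaf pairs (i, j) of w[i][j] * size of the subtree rooted at lca(i, j).

    For a binary tree this decomposes as: (size of tree) * (similarity crossing the
    root split) + cost of the left subtree + cost of the right subtree.
    """
    if isinstance(tree, int):
        return 0.0
    left, right = leaves(tree[0]), leaves(tree[1])
    crossing = sum(w[i][j] for i in left for j in right)
    return (len(left) + len(right)) * crossing + dasgupta_cost(tree[0], w) + dasgupta_cost(tree[1], w)

# Example: points 0 and 1 are similar, 2 is not; merging 0 and 1 first gives the lower cost.
w = [[0, 1.0, 0.1],
     [1.0, 0, 0.1],
     [0.1, 0.1, 0]]
print(dasgupta_cost(((0, 1), 2), w), dasgupta_cost((0, (1, 2)), w))  # 2.6 vs 3.5
```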
[2004.03705] Deep Learning Based Text Classification: A Comprehensive Review (arXiv 2004.03705, 2020-04-06; bookmarked 2020-10-11)
Authors: Shervin Minaee, Nal Kalchbrenner, Erik Cambria, Narjes Nikzad, Meysam Chenaghlu, Jianfeng Gao
Deep learning based models have surpassed classical machine learning based approaches in various text classification tasks, including sentiment analysis, news categorization, question answering, and natural language inference. In this work, we provide a detailed review of more than 150 deep learning based models for text classification developed in recent years, and discuss their technical contributions, similarities, and strengths. We also provide a summary of more than 40 popular datasets widely used for text classification. Finally, we provide a quantitative analysis of the performance of different deep learning models on popular benchmarks, and discuss future research directions.

[2010.05234] A Practical Guide to Graph Neural Networks (arXiv 2010.05234, 2020-10-11; bookmarked 2020-10-15)
Authors: Isaac Ronald Ward, Jack Joyner, Casey Lickfold, Stash Rowe, Yulan Guo, Mohammed Bennamoun
Graph neural networks (GNNs) have recently grown in popularity in the field of artificial intelligence due to their unique ability to ingest relatively unstructured data types as input data. Although some elements of the GNN architecture are conceptually similar in operation to traditional neural networks (and neural network variants), other elements represent a departure from traditional deep learning techniques. This tutorial exposes the power and novelty of GNNs to the average deep learning enthusiast by collating and presenting details on the motivations, concepts, mathematics, and applications of the most common types of GNNs. Importantly, we present this tutorial concisely, alongside worked code examples, and at an introductory pace, thus providing a practical and accessible guide to understanding and using GNNs.

Sylvain Gugger on Twitter: "Training a transformer model for text classification..." (2020-10-19)

Knowledge Graphs: An Information Retrieval Perspective (2020-10-30)

[2010.11882] Learning Invariances in Neural Networks (arXiv 2010.11882, 2020-10-22; bookmarked 2020-10-25)
Authors: Gregory Benton, Marc Finzi, Pavel Izmailov, Andrew Gordon Wilson
How to *learn* symmetries -- rotations, translations, scalings, shears -- from training data alone.
Invariances to translations have imbued convolutional neural networks with powerful generalization properties. However, we often do not know a priori what invariances are present in the data, or to what extent a model should be invariant to a given symmetry group. We show how to *learn* invariances and equivariances by parameterizing a distribution over augmentations and optimizing the training loss simultaneously with respect to the network parameters and augmentation parameters. With this simple procedure we can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations, on training data alone.
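A toy sketch of the recipe (parameterize a distribution over augmentations and optimize the training loss jointly over network and augmentation parameters), restricted here to 2D rotations of point-cloud inputs with a learnable angular width; the data, architecture and the penalty nudging the range wider are placeholders, not the authors' released implementation:

```python
import torch
import torch.nn as nn

class LearnedRotation(nn.Module):
    def __init__(self):
        super().__init__()
        self.width = nn.Parameter(torch.tensor(0.1))  # learnable rotation range (radians)

    def forward(self, x):                              # x: (batch, n_points, 2)
        u = torch.rand(x.shape[0], device=x.device) * 2 - 1
        theta = u * self.width                         # reparameterized sample, keeps gradient to width
        c, s = torch.cos(theta), torch.sin(theta)
        rot = torch.stack([torch.stack([c, -s], -1), torch.stack([s, c], -1)], -2)
        return x @ rot.transpose(-1, -2)

class AugAveragedModel(nn.Module):
    def __init__(self, net, n_samples=4):
        super().__init__()
        self.aug, self.net, self.n = LearnedRotation(), net, n_samples

    def forward(self, x):
        # Average predictions over sampled augmentations -> approximate invariance.
        return torch.stack([self.net(self.aug(x).flatten(1)) for _ in range(self.n)]).mean(0)

net = nn.Sequential(nn.Linear(2 * 8, 32), nn.ReLU(), nn.Linear(32, 3))
model = AugAveragedModel(net)
x, y = torch.randn(16, 8, 2), torch.randint(0, 3, (16,))
# Simple penalty favouring wider augmentation ranges (the paper uses a regularizer in this spirit).
loss = nn.functional.cross_entropy(model(x), y) - 0.01 * model.aug.width
loss.backward()   # gradients flow to both the network weights and the rotation width
```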
AI has cracked a key mathematical puzzle for understanding our world | MIT Technology Review (2020-10-31)
[paper](https://arxiv.org/abs/2010.08895)

[1802.05930] Learning beyond datasets: Knowledge Graph Augmented Neural Networks for Natural language Processing (arXiv 1802.05930, 2018-02-16, updated 2018-05-21; bookmarked 2020-10-02)
Authors: K M Annervaz, Somnath Basu Roy Chowdhury, Ambedkar Dukkipati
Machine Learning has been the quintessential solution for many AI problems, but learning is still heavily dependent on the specific training data. Some learning models can be incorporated with a prior knowledge in the Bayesian set up, but these learning models do not have the ability to access any organised world knowledge on demand. In this work, we propose to enhance learning models with world knowledge in the form of Knowledge Graph (KG) fact triples for Natural Language Processing (NLP) tasks. Our aim is to develop a deep learning model that can extract relevant prior support facts from knowledge graphs depending on the task using attention mechanism. We introduce a convolution-based model for learning representations of knowledge graph entity and relation clusters in order to reduce the attention space. We show that the proposed method is highly scalable to the amount of prior information that has to be processed and can be applied to any generic NLP task. Using this method we show significant improvement in performance for text classification with News20, DBPedia datasets and natural language inference with Stanford Natural Language Inference (SNLI) dataset. We also demonstrate that a deep learning model can be trained well with substantially less amount of labeled training data, when it has access to organised world knowledge in the form of knowledge graph.
> we propose to enhance learning models with world knowledge in the form of **Knowledge Graph fact triples for NLP tasks**. Our aim is to develop a deep learning model that can extract relevant prior support facts from knowledge graphs depending on the task using attention mechanism.

Related [blog post](https://medium.com/@anshumanmourya/learning-beyond-datasets-knowledge-graph-augmented-neural-networks-for-natural-language-b937ba49f2e5). A simplified sketch of the attention-over-facts step appears after the entries below.

« Nous avons assisté à l’effondrement de l’Etat » : des maires de grandes villes racontent les premiers mois de la pandémie - par Vanessa Schneider (Le Monde) (2020-10-16)
> "The government was lost, overwhelmed by uncertainty; nothing was working."

> Several elected officials clashed with the national authorities over this: they wanted to test asymptomatic staff in order to protect residents, but the State refused. "The fight lasted a fortnight," recounts Anne Hidalgo. "It is the worst memory of my political life, an exhausting standoff. **We ran up against a delusional bureaucracy which, in a fanatical way, sees its role as that of a producer of norms.**"

librairies indépendantes (2020-10-31)

Evolution of a crop’s wild relative into a weed that includes an herbicide resistance gene (2020-10-16)
Teosinte, a wild relative of maize originating from Mexico, recently emerged as an invasive weed in Europe.
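As announced in the "Learning beyond datasets" entry above, a simplified sketch of attending over retrieved fact-triple embeddings and concatenating the resulting knowledge vector with the sentence representation; the retrieval step and the convolution-based clustering the paper uses to shrink the attention space are not shown, and all names are illustrative:

```python
import torch
import torch.nn as nn

class FactAttention(nn.Module):
    def __init__(self, sent_dim, fact_dim):
        super().__init__()
        self.proj = nn.Linear(sent_dim, fact_dim, bias=False)

    def forward(self, sent_vec, fact_embs):
        # sent_vec: (batch, sent_dim); fact_embs: (batch, n_facts, fact_dim)
        scores = torch.bmm(fact_embs, self.proj(sent_vec).unsqueeze(-1)).squeeze(-1)
        alpha = torch.softmax(scores, dim=-1)                    # attention over candidate facts
        knowledge = torch.bmm(alpha.unsqueeze(1), fact_embs).squeeze(1)
        return torch.cat([sent_vec, knowledge], dim=-1)          # knowledge-augmented representation

# e.g. feed the concatenated vector to a classifier head for text classification or NLI.
fused = FactAttention(128, 64)(torch.randn(4, 128), torch.randn(4, 20, 64))
```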
[2005.03675] Machine Learning on Graphs: A Model and Comprehensive Taxonomy (arXiv 2005.03675, 2020-05-07; bookmarked 2020-10-03)
Authors: Ines Chami, Sami Abu-El-Haija, Bryan Perozzi, Christopher Ré, Kevin Murphy
There has been a surge of recent interest in learning representations for graph-structured data. Graph representation learning methods have generally fallen into three main categories, based on the availability of labeled data. The first, network embedding (such as shallow graph embedding or graph auto-encoders), focuses on learning unsupervised representations of relational structure. The second, graph regularized neural networks, leverages graphs to augment neural network losses with a regularization objective for semi-supervised learning. The third, graph neural networks, aims to learn differentiable functions over discrete topologies with arbitrary structure. However, despite the popularity of these areas there has been surprisingly little work on unifying the three paradigms. Here, we aim to bridge the gap between graph neural networks, network embedding and graph regularization models. We propose a comprehensive taxonomy of representation learning methods for graph-structured data, aiming to unify several disparate bodies of work. Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which generalizes popular algorithms for semi-supervised learning on graphs (e.g. GraphSage, Graph Convolutional Networks, Graph Attention Networks), and unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc) into a single consistent approach. To illustrate the generality of this approach, we fit over thirty existing methods into this framework. We believe that this unifying view both provides a solid foundation for understanding the intuition behind these methods, and enables future research in the area.
> we aim to **bridge the gap between graph neural networks, network embedding and graph regularization models**. We propose a comprehensive taxonomy of representation learning methods for graph-structured data, aiming to unify several disparate bodies of work. Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which generalizes popular algorithms for semi-supervised learning on graphs (e.g. GraphSage, Graph Convolutional Networks, Graph Attention Networks), and unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc) into a single consistent approach.

[2010.11967] Language Models are Open Knowledge Graphs (arXiv 2010.11967, 2020-10-22; bookmarked 2020-10-26)
Authors: Chenguang Wang, Xiao Liu, Dawn Song
This paper shows how to construct knowledge graphs (KGs) from pre-trained language models (e.g., BERT, GPT-2/3), without human supervision. Popular KGs (e.g, Wikidata, NELL) are built in either a supervised or semi-supervised manner, requiring humans to create knowledge. Recent deep language models automatically acquire knowledge from large-scale corpora via pre-training. The stored knowledge has enabled the language models to improve downstream NLP tasks, e.g., answering questions, and writing code and articles. In this paper, we propose an unsupervised method to cast the knowledge contained within language models into KGs. We show that KGs are constructed with a single forward pass of the pre-trained language models (without fine-tuning) over the corpora. We demonstrate the quality of the constructed KGs by comparing to two KGs (Wikidata, TAC KBP) created by humans. Our KGs also provide open factual knowledge that is new in the existing KGs. Our code and KGs will be made publicly available.
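A toy illustration of the general idea of mining candidate triples from a pre-trained LM's attention maps, not the paper's actual procedure: the entity pair is hard-coded, attention is simply averaged over all layers and heads, and the "relation" is a single intermediate token:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "Paris is the capital of France"
head_word, tail_word = "paris", "france"      # candidate (head, tail) entity pair

enc = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    out = model(**enc)

# Average attention over layers and heads -> one (seq_len, seq_len) matrix.
att = torch.stack(out.attentions).mean(dim=(0, 2))[0]

tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
h, t = tokens.index(head_word), tokens.index(tail_word)

# Pick the intermediate token most strongly connected to both entities
# as a crude one-word relation candidate.
best = max(range(min(h, t) + 1, max(h, t)), key=lambda k: float(att[h, k] + att[k, t]))
print((head_word, tokens[best], tail_word))   # prints the candidate (head, relation word, tail)
```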
Thomas Piketty : « Que faire de la dette Covid-19 ? » (2020-10-10)

Building a Faster and Accurate Search Engine on Custom Dataset with Transformers 🤗 | by Shivanand Roy | Analytics Vidhya | Sep, 2020 | Medium (2020-10-22)