[Seminar] Deep Latent Variable Models of Natural Language
Both GANs and VAEs have been remarkably effective at modeling images, and the learned latent representations often correspond to interesting, semantically meaningful representations of the observed data. In contrast, GANs and VAEs have been less successful at modeling natural language, but for different reasons.
- GANs have difficulty dealing with discrete output spaces (such as natural language) as the resulting objective is no longer differentiable with respect to the generator.
- VAEs can deal with discrete output spaces, but when a powerful model (e.g. LSTM) is used as a generator, the model learns to ignore the latent variable and simply becomes a language model.
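Both failure modes can be illustrated with a small numerical sketch. This is not taken from the seminar: the three-token vocabulary, the `reward` table standing in for a discriminator, and all function names are assumptions made for illustration. The first part shows that a score computed on a hard (argmax) token does not change under a small perturbation of the generator's logits, so its gradient is zero, while the expected score under the softmax does change. The second part computes the KL term of a Gaussian VAE, which is exactly zero when the approximate posterior equals the prior, i.e. when the latent variable carries no information about the input.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_token(logits):
    """Discrete output: index of the highest-scoring token (argmax)."""
    return max(range(len(logits)), key=lambda i: logits[i])

def finite_diff(f, logits, i, eps=1e-4):
    """Numerical derivative of f with respect to logits[i]."""
    bumped = list(logits)
    bumped[i] += eps
    return (f(bumped) - f(logits)) / eps

# Hypothetical "discriminator": a fixed score per token id.
reward = [0.1, 0.9, 0.4]
logits = [2.0, 0.5, -1.0]

def score_hard(l):
    # Score of the single discrete token the generator emits.
    return reward[hard_token(l)]

def score_soft(l):
    # Expected score under the generator's token distribution.
    return sum(p * r for p, r in zip(softmax(l), reward))

grad_hard = [finite_diff(score_hard, logits, i) for i in range(3)]
grad_soft = [finite_diff(score_soft, logits, i) for i in range(3)]

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

kl_collapsed = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])   # posterior == prior
kl_informative = kl_to_standard_normal([1.0, -0.5], [0.0, 0.0])
```

Here `grad_hard` is all zeros because the argmax does not move under a small perturbation, whereas `grad_soft` is non-zero. A zero `kl_collapsed` is the signature of a model that ignores the latent variable.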
4 Approaches To Natural Language Processing & Understanding
The antithesis of grounded language is inferred language. Inferred language derives meaning from words themselves rather than what they represent. When trained only on large corpora of text, but not on real-world representations, statistical methods for NLP and NLU lack true understanding of what words mean.
Natural Language Processing (almost) from Scratch - Collobert and Weston (2011)
Seminal work.
> a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
Towards a Seamless Integration of Word Senses into Downstream NLP Applications (2017)
By incorporating a novel disambiguation algorithm into a state-of-the-art classification model, we create a pipeline to integrate sense-level information into downstream NLP applications. We show that a simple disambiguation of the input text can lead to consistent performance improvement on multiple topic categorization and polarity detection datasets, particularly when the fine granularity of the underlying sense inventory is reduced and the document is sufficiently large.
Our results suggest that research in sense representation should put special emphasis on real-world evaluations on benchmarks for downstream applications, rather than on artificial tasks such as word similarity. In fact, research has previously shown that **word similarity might not constitute a reliable proxy to measure the performance of word embeddings in downstream applications**.
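For context, a word-similarity benchmark typically scores embeddings by ranking word pairs by cosine similarity and correlating that ranking with human judgments. The sketch below shows that procedure under stated assumptions: the three-dimensional vectors and the human ratings are made up for illustration, and the Spearman implementation assumes there are no ties. It is not the evaluation code of the paper above.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ranks(values):
    """Rank 1 = smallest value. Assumes no ties, for brevity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(xs, ys):
    """Spearman rank correlation for tie-free data."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Toy embeddings (hypothetical values, not from any trained model).
emb = {
    "cat":    [0.9, 0.1, 0.0],
    "dog":    [0.9, 0.2, 0.0],
    "car":    [0.1, 0.9, 0.2],
    "truck":  [0.0, 0.8, 0.3],
    "banana": [0.0, 0.1, 0.9],
}

# (word1, word2, human similarity rating); ratings are invented.
pairs = [
    ("cat", "dog", 9.0),
    ("car", "truck", 8.5),
    ("cat", "car", 2.0),
    ("car", "banana", 1.5),
    ("dog", "banana", 1.0),
]

model_scores = [cosine(emb[a], emb[b]) for a, b, _ in pairs]
human_scores = [rating for _, _, rating in pairs]
rho = spearman(model_scores, human_scores)
```

A high `rho` only says that the embedding geometry agrees with human similarity ratings on these pairs. It says nothing directly about accuracy on a task such as topic categorization, which is why the authors argue for downstream evaluation.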
Learned in translation: contextualized word vectors (Salesforce Research)
Models that use pretrained word vectors must learn how to use them. Our work picks up where word vectors left off: rather than contextualizing word vectors with randomly initialized methods, we first train on an intermediate task. We teach a neural network how to understand words in context by first teaching it how to translate English to German.
Recurrent Memory Network for Language Modeling
Recurrent Neural Networks (RNN) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge.
In this paper, we propose Recurrent Memory Network (RMN), a novel RNN architecture, that not only amplifies the power of RNN but also facilitates our understanding of its internal functioning and allows us to discover underlying patterns in data.
We demonstrate the power of RMN on language modeling and sentence completion tasks.
On language modeling, RMN outperforms the Long Short-Term Memory (LSTM) network on three large German, Italian, and English datasets. Additionally, we perform in-depth analysis of various linguistic dimensions that RMN captures. On the Sentence Completion Challenge, for which it is essential to capture sentence coherence, our RMN obtains 69.2% accuracy, surpassing the previous state-of-the-art by a large margin.
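A common way to use a language model for sentence completion is to fill the blank with each candidate word, score the resulting sentence, and keep the highest-scoring candidate. The sketch below assumes that setup and substitutes a toy add-one-smoothed bigram model for the RMN; the corpus, the `___` blank marker, and all function names are hypothetical, and the abstract above does not describe the paper's actual scoring procedure.

```python
import math
from collections import Counter

# Tiny training corpus (made up for illustration).
corpus = [
    "the dog barked at the cat",
    "the cat sat on the mat",
    "the dog sat on the rug",
]

def train_bigram(sentences):
    """Count bigrams and their left contexts, with sentence boundary tokens."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        vocab.update(toks)
        contexts.update(toks[:-1])
        bigrams.update(zip(toks, toks[1:]))
    return contexts, bigrams, len(vocab)

def log_prob(sentence, model):
    """Sentence log-probability under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    toks = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(toks, toks[1:])
    )

def complete(template, candidates, model):
    """Return the candidate whose filled-in sentence scores highest."""
    return max(candidates, key=lambda w: log_prob(template.replace("___", w), model))

model = train_bigram(corpus)
best = complete("the cat ___ on the mat", ["sat", "barked", "rug"], model)
```

A bigram model only sees adjacent words, so it cannot capture the sentence-level coherence the challenge is designed to test. Swapping in a stronger scorer changes only `log_prob`.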
Semantic Search Arrives at the Web
There are two approaches toward semantic search, and both have received attention in the past months. The first approach builds on the automatic analysis of text using Natural Language Processing (NLP). The second approach uses semantic web technologies, which aim to make the web more easily searchable by allowing publishers to expose their (meta)data.