"Long short-term memory": recurrent neural network architecture well-suited for **time series with long time lags between important events**.
(cf. the problem of long-range dependencies, e.g. predicting the next word in "I grew up in France… I speak fluent [?]").
A solution to the vanishing gradient problem in RNNs
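To make the gating idea concrete, here is a minimal NumPy sketch of a single LSTM cell step (the weight layout, names, and sizes are illustrative, not taken from any particular library): the forget gate controls an *additive* update of the cell state, which is what lets gradients flow across long time lags.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps [h_prev; x] to the four gate pre-activations,
    stacked as [input gate, forget gate, output gate, candidate]."""
    z = W @ np.concatenate([h_prev, x]) + b
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])        # input gate: how much new information to write
    f = sigmoid(z[H:2*H])      # forget gate: how much old cell state to keep
    o = sigmoid(z[2*H:3*H])    # output gate: how much of the cell to expose
    g = np.tanh(z[3*H:4*H])    # candidate cell update
    c = f * c_prev + i * g     # additive cell update: gradients flow through f
    h = o * np.tanh(c)         # hidden state passed to the next step
    return h, c

# Tiny usage example with random weights (hidden size 4, input size 3).
rng = np.random.default_rng(0)
H, D = 4, 3
W = rng.normal(scale=0.1, size=(4 * H, H + D))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(10, D)):   # run over a 10-step input sequence
    h, c = lstm_step(x, h, c, W, b)
```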
The Unreasonable Effectiveness of Recurrent Neural Networks: what character-level language models based on RNNs are capable of.
> The core reason that recurrent nets are more exciting is that they allow us to operate over sequences of vectors: Sequences in the input, the output, or in the most general case both.
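A minimal, untrained sketch of what "operating over a sequence of vectors" means for a character-level language model: each character becomes a one-hot vector, the hidden state is carried across steps, and every step emits a distribution over the next character. All names and sizes here are illustrative, and this is a forward pass only, not a training loop.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

text = "hello world"
chars = sorted(set(text))
char_to_ix = {ch: i for i, ch in enumerate(chars)}
V, H = len(chars), 16                      # vocabulary size, hidden size

rng = np.random.default_rng(0)
Wxh = rng.normal(scale=0.01, size=(H, V))  # input -> hidden
Whh = rng.normal(scale=0.01, size=(H, H))  # hidden -> hidden (the recurrence)
Why = rng.normal(scale=0.01, size=(V, H))  # hidden -> next-character logits

h = np.zeros(H)
loss = 0.0
for cur, nxt in zip(text[:-1], text[1:]):
    x = np.zeros(V); x[char_to_ix[cur]] = 1.0   # one-hot current character
    h = np.tanh(Wxh @ x + Whh @ h)              # recurrent hidden state
    p = softmax(Why @ h)                        # distribution over next character
    loss += -np.log(p[char_to_ix[nxt]])         # cross-entropy for this step
print(f"total cross-entropy over {len(text) - 1} steps: {loss:.2f}")
```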
Recurrent Memory Network for Language Modeling: Recurrent Neural Networks (RNNs) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge.
In this paper, we propose the Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our understanding of their internal functioning and allows us to discover underlying patterns in data.
We demonstrate the power of RMN on language modeling and sentence completion tasks.
On language modeling, RMN outperforms the Long Short-Term Memory (LSTM) network on three large German, Italian, and English datasets. Additionally, we perform an in-depth analysis of the various linguistic dimensions that RMN captures. On the Sentence Completion Challenge, for which it is essential to capture sentence coherence, our RMN obtains 69.2% accuracy, surpassing the previous state of the art by a large margin.
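The memory block at the core of RMN is, in essence, attention over the most recent words. The snippet below is only a hedged sketch of that general idea, not the paper's architecture (the actual RMN uses separate input/output memory embeddings, temporal features, and combines the memory output with an LSTM state); it illustrates why the attention weights make the model's predictions interpretable.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Illustrative only: attend over the embeddings of the last n words,
# conditioned on the current hidden state (the "query").
rng = np.random.default_rng(0)
n, d = 5, 8                          # memory size (recent words), embedding dim
memory = rng.normal(size=(n, d))     # embeddings of the last n words
h = rng.normal(size=d)               # current hidden state

scores = memory @ h                  # one relevance score per memorized word
weights = softmax(scores)            # attention distribution over the memory
context = weights @ memory           # weighted summary of the recent history

# The attention weights show which recent words the prediction attends to,
# which is the kind of interpretability the abstract refers to.
print(np.round(weights, 3))
```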