About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Jack W. Rae
- sl:arxiv_num : 1911.05507
- sl:arxiv_published : 2019-11-13T14:36:01Z
- sl:arxiv_summary : We present the Compressive Transformer, an attentive sequence model which
compresses past memories for long-range sequence learning. We find the
Compressive Transformer obtains state-of-the-art language modelling results in
the WikiText-103 and Enwik8 benchmarks, achieving 17.1 ppl and 0.97 bpc
respectively. We also find it can model high-frequency speech effectively and
can be used as a memory mechanism for RL, demonstrated on an object matching
task. To promote the domain of long-range sequence learning, we propose a new
open-vocabulary language modelling benchmark derived from books, PG-19.@en
- sl:arxiv_title : Compressive Transformers for Long-Range Sequence Modelling@en
- sl:arxiv_updated : 2019-11-13T14:36:01Z
- sl:bookmarkOf : https://arxiv.org/abs/1911.05507
- sl:creationDate : 2020-02-11
- sl:creationTime : 2020-02-11T08:48:20Z
- sl:relatedDoc : http://www.semanlink.net/doc/2020/02/a_new_model_and_dataset_for_lon
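The abstract above only names the core mechanism in passing: rather than discarding old activations once they fall outside the attention window, the Compressive Transformer compresses them into a smaller secondary memory. The sketch below is a minimal illustration of that idea, assuming mean pooling as the compression function (one of the compression functions the paper discusses); the buffer sizes, the `compress` helper, and the update step are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def compress(old_activations: np.ndarray, rate: int) -> np.ndarray:
    """Compress a block of evicted activations by mean pooling.

    old_activations: (seq_len, d_model) block falling out of the ordinary memory.
    rate: compression rate c; every c consecutive vectors are averaged into one.
    Mean pooling is used here purely for illustration.
    """
    seq_len, d_model = old_activations.shape
    assert seq_len % rate == 0, "block length must be divisible by the compression rate"
    return old_activations.reshape(seq_len // rate, rate, d_model).mean(axis=1)

# Toy update step: evict the oldest segment from the ordinary memory and
# append its compressed version to the compressive memory (FIFO on both).
d_model, segment, rate = 16, 8, 4
memory = np.random.randn(32, d_model)          # ordinary (uncompressed) memory
compressive_memory = np.zeros((0, d_model))    # starts empty

evicted, memory = memory[:segment], memory[segment:]
compressive_memory = np.concatenate([compressive_memory, compress(evicted, rate)])
print(memory.shape, compressive_memory.shape)  # (24, 16) (2, 16)
```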