Sequence-to-sequence learning
Training models to convert sequences from one domain (e.g. sentences in English) to sequences in another domain (e.g. the same sentences translated to French). Examples of such transformations: translation from one language to another (text or audio), question answering, parsing sentences into a grammar tree.

The seq2seq model generally uses an encoder-decoder architecture, where both the encoder and the decoder are RNNs:
- the encoder encodes the input as a fixed-length vector (the "context vector")
- the decoder is initialized with the context vector and emits the output

Problems:
- a fixed-length context vector cannot remember long sentences. The [#Attention mechanism](/tag/deep_learning_attention) solves this problem
- since RNN-based seq2seq models are sequential, they cannot be parallelized. [#The Transformer](/tag/attention_is_all_you_need) solves this
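The encoder-decoder idea above can be sketched in a few lines of PyTorch. This is a minimal illustration, not a production model: the vocabulary size, hidden size, and tensor shapes are made-up toy values, and teacher forcing / beam search are omitted. It shows the two key points of the text: the encoder compresses the whole source into one fixed-length context vector, and the decoder's hidden state is initialized from that vector.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids
        _, h = self.rnn(self.embed(src))
        return h  # the fixed-length "context vector": (1, batch, hidden_size)

class Decoder(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tgt, context):
        # the decoder's hidden state is initialized with the context vector
        y, _ = self.rnn(self.embed(tgt), context)
        return self.out(y)  # logits over the target vocabulary

# toy sizes (hypothetical): vocabulary of 100 tokens, hidden size 32
enc, dec = Encoder(100, 32), Decoder(100, 32)
src = torch.randint(0, 100, (4, 7))  # batch of 4 source sentences, length 7
tgt = torch.randint(0, 100, (4, 5))  # corresponding (shifted) targets, length 5
logits = dec(tgt, enc(src))
print(logits.shape)  # (batch, tgt_len, vocab_size) = (4, 5, 100)
```

Note that however long the source sentence is, `enc(src)` is always a single `(1, batch, hidden_size)` tensor, which is exactly the bottleneck that attention later removes.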