Training models to convert sequences from one domain (e.g. sentences in English) to sequences in another domain (e.g. the same sentences translated to French).
Examples of transformations: translation from one language to another (text or audio), question answering, parsing sentences into grammar trees.
The seq2seq model generally uses an encoder-decoder architecture, where both the encoder and the decoder are RNNs:
- the encoder encodes the input as a fixed-length vector (the "context vector"); see the sketch after this list
- the decoder is initialized with the context vector to emit the output
- the fixed-length context vector struggles to represent long sentences. The [#Attention mechanism](/tag/deep_learning_attention) solves this problem
- since RNN-based seq2seq models are sequential, they cannot be parallelized. [#The Transformer](/tag/attention_is_all_you_need) solves this
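A minimal PyTorch sketch of the encoder-decoder idea described above (all sizes and names here are illustrative, not from the original note): the encoder compresses the whole input into one context vector, which then initializes the decoder's hidden state.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids
        _, hidden = self.rnn(self.embed(src))
        return hidden  # (1, batch, hidden_dim): the fixed-length "context vector"

class Decoder(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt, context):
        # tgt: (batch, tgt_len) target ids; the context initializes the hidden state,
        # so everything the decoder knows about the input passes through that one vector
        output, _ = self.rnn(self.embed(tgt), context)
        return self.out(output)  # (batch, tgt_len, vocab_size) logits

# Usage: encode the source, then decode conditioned only on the context vector.
src = torch.randint(0, 1000, (2, 7))  # toy batch of source sequences
tgt = torch.randint(0, 1000, (2, 5))  # toy batch of target sequences
encoder, decoder = Encoder(), Decoder()
logits = decoder(tgt, encoder(src))
```

The bottleneck is visible in the code: `encoder(src)` returns a single `(1, batch, hidden_dim)` tensor regardless of input length, which is exactly why attention (which lets the decoder look back at all encoder states) helps on long sentences.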
Sequence Modeling with CTC: a visual guide to Connectionist Temporal Classification, an algorithm used to train deep neural networks in speech recognition, handwriting recognition, and other sequence problems.
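A minimal sketch of training with CTC using PyTorch's built-in `nn.CTCLoss` (toy shapes, random data; class 0 is reserved for the CTC blank by convention). CTC aligns an unsegmented sequence of frame-level predictions with a shorter label sequence, which is why no frame-by-frame alignment appears below.

```python
import torch
import torch.nn as nn

ctc_loss = nn.CTCLoss(blank=0)  # index 0 reserved for the CTC blank symbol

T, B, C = 50, 4, 20  # frames, batch size, classes (including blank)
# Frame-level log-probabilities, shape (T, B, C), as CTCLoss expects
log_probs = torch.randn(T, B, C, requires_grad=True).log_softmax(2)
# Label sequences (no blanks), padded to a common max length
targets = torch.randint(1, C, (B, 20))
input_lengths = torch.full((B,), T)            # all utterances use T frames
target_lengths = torch.randint(10, 21, (B,))   # true label lengths per sample

loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # gradients flow back to the network that produced log_probs
```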