Language Modeling with ``nn.Transformer`` and torchtext
=======================================================

This is a tutorial on training a model to predict the next word in a sequence using the `nn.Transformer <https://pytorch.org/docs/stable/generated/torch.nn.Transformer.html>`__ module. The PyTorch 1.2 release includes a standard transformer module based on the paper `Attention is All You Need <https://arxiv.org/pdf/1706.03762.pdf>`__. Compared to Recurrent Neural Networks (RNNs), the transformer model has proven to be superior in quality for many sequence-to-sequence tasks while being more parallelizable. The ``nn.Transformer`` module relies entirely on an attention mechanism (implemented as `nn.MultiheadAttention <https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html>`__) to draw global dependencies between input and output. The ``nn.Transformer`` module is highly modularized, so a single component (e.g., `nn.TransformerEncoder <https://pytorch.org/docs/stable/generated/torch.nn.TransformerEncoder.html>`__) can easily be adapted or composed.
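As a minimal sketch of this modularity (the hyperparameters below are illustrative, not from the tutorial), ``nn.TransformerEncoder`` stacks ``nn.TransformerEncoderLayer`` blocks, and a causal mask restricts attention to earlier positions for next-word prediction:

```python
import torch
import torch.nn as nn

# Illustrative hyperparameters (assumed for this sketch).
d_model, nhead, num_layers, seq_len, batch = 64, 4, 2, 10, 3

# nn.TransformerEncoder composes identical encoder layers.
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)

# Input shape: (sequence length, batch size, d_model).
src = torch.rand(seq_len, batch, d_model)

# Causal (square subsequent) mask: -inf above the diagonal blocks
# attention to future positions, as needed for language modeling.
mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)

out = encoder(src, mask=mask)
print(out.shape)  # torch.Size([10, 3, 64])
```

The output keeps the input shape; a language model would project each position's ``d_model``-sized vector to vocabulary logits with a final linear layer.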