Transformer

Introduction

Figure: Transformer Architecture
Encoder: Input Embedding + Positional Encoding → Multi-Head Attention → Add & Norm → Feed Forward → Add & Norm
Decoder: Output Embedding (shifted right) + Positional Encoding → Masked Multi-Head Attention → Add & Norm → Multi-Head Attention → Add & Norm → Feed Forward → Add & Norm
Output: Linear → Softmax
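
The encoder path named in the diagram (Multi-Head Attention, Add & Norm, Feed Forward, Add & Norm) can be illustrated with a minimal NumPy sketch. This is not a reference implementation: it assumes a ReLU feed-forward network, layer normalization without learned scale and shift, no dropout, and random untrained weights. The helper names (`init_params`, `encoder_layer`, and so on) are hypothetical and chosen only for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # Normalize each position's feature vector to zero mean and unit variance.
    # Assumption: learned scale/shift parameters are omitted in this sketch.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(t):
        # (seq, d_model) -> (heads, seq, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, seq, d_head)
    # Concatenate the heads back to (seq, d_model), then project.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o

def feed_forward(x, w1, b1, w2, b2):
    # Position-wise two-layer network with a ReLU in between (assumed).
    return np.maximum(0.0, x @ w1 + b1) @ w2 + b2

def encoder_layer(x, p, num_heads):
    attn = multi_head_attention(x, p["w_q"], p["w_k"], p["w_v"], p["w_o"], num_heads)
    x = layer_norm(x + attn)   # Add & Norm
    ff = feed_forward(x, p["w1"], p["b1"], p["w2"], p["b2"])
    return layer_norm(x + ff)  # Add & Norm

def init_params(d_model, d_ff, seed=0):
    # Hypothetical helper: small random weights, for illustration only.
    rng = np.random.default_rng(seed)

    def w(*shape):
        return rng.normal(scale=0.1, size=shape)

    return {
        "w_q": w(d_model, d_model), "w_k": w(d_model, d_model),
        "w_v": w(d_model, d_model), "w_o": w(d_model, d_model),
        "w1": w(d_model, d_ff), "b1": np.zeros(d_ff),
        "w2": w(d_ff, d_model), "b2": np.zeros(d_model),
    }

# Five positions with 8 features each stand in for
# "Input Embedding + Positional Encoding".
x = np.random.default_rng(1).normal(size=(5, 8))
params = init_params(d_model=8, d_ff=32)
out = encoder_layer(x, params, num_heads=2)
print(out.shape)  # (5, 8)
```

The layer preserves the input shape, which is what allows several such layers to be stacked. A decoder layer would follow the same pattern, with a mask added to the attention scores in the first sublayer and a second attention sublayer that reads the encoder output.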