Transformer (deep learning)

In deep learning, the transformer is a family of artificial neural network architectures based on the multi-head attention mechanism. Text is first converted to numerical representations called tokens, and each token is converted into a vector via lookup from a word embedding table. At each layer, each token is then contextualized with the other (unmasked) tokens in the context window via a parallel multi-head attention mechanism, which amplifies the signal from key tokens and diminishes the signal from less important ones.
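The pipeline described above (tokens, embedding lookup, masked multi-head attention) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a faithful transformer layer: the vocabulary `VOCAB`, the table `EMBEDDINGS`, and all function names are hypothetical, tokens are whole words, and the learned query, key, value, and output projections used in real models are omitted, so each head simply attends over its own slice of the embedding vector.

```python
import math

# Hypothetical toy vocabulary and embedding table (assumptions for illustration).
VOCAB = {"the": 0, "cat": 1, "sat": 2}
EMBEDDINGS = [
    [1.0, 0.0, 0.5, 0.0],  # vector for token id 0
    [0.0, 1.0, 0.0, 0.5],  # vector for token id 1
    [0.5, 0.5, 1.0, 1.0],  # vector for token id 2
]

def embed(text):
    """Map whitespace-separated words to token ids, then look up their vectors."""
    token_ids = [VOCAB[word] for word in text.split()]
    return [EMBEDDINGS[t] for t in token_ids]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(queries, keys, values, causal=True):
    """Scaled dot-product attention for one head.

    With causal=True, position i only sees positions 0..i (later ones are masked).
    Returns the contextualized vectors and the attention weights per position.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        limit = i + 1 if causal else len(keys)
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in range(limit)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        outputs.append([
            sum(weights[j] * values[j][k] for j in range(limit))
            for k in range(len(values[0]))
        ])
        # Masked positions get zero weight.
        all_weights.append(weights + [0.0] * (len(keys) - limit))
    return outputs, all_weights

def multi_head_attention(x, num_heads, causal=True):
    """Split each vector into num_heads slices, attend per head, concatenate.

    Simplification: no learned projections, so queries = keys = values = slice.
    """
    d_model = len(x[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    head_outputs = []
    for h in range(num_heads):
        part = [v[h * d_head:(h + 1) * d_head] for v in x]
        out, _ = attention_head(part, part, part, causal)
        head_outputs.append(out)
    # Concatenate the per-head outputs back into d_model-sized vectors.
    return [
        [value for h in range(num_heads) for value in head_outputs[h][i]]
        for i in range(len(x))
    ]

vectors = embed("the cat sat")
contextualized = multi_head_attention(vectors, num_heads=2)
```

In this sketch the attention weights for each position sum to 1, so tokens with larger query-key dot products contribute more to the output and the rest contribute less. With the causal mask, the first token can only attend to itself, so its output equals its input embedding.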

Source: Wikipedia "Transformer (deep learning)" · CC BY-SA 4.0
