Hasty Briefs

Transformers Are Graph Neural Networks

10 months ago
  • #Transformers
  • #Graph Neural Networks
  • #Machine Learning
  • Transformers can be viewed as message passing Graph Neural Networks (GNNs) operating on fully connected graphs of tokens.
  • Self-attention mechanisms in Transformers capture the relative importance of tokens, while positional encodings provide hints about sequential ordering or structure.
  • Transformers are expressive set-processing networks that learn relationships among input elements without being constrained by a predefined (a priori) graph structure.
  • Despite their mathematical connection to GNNs, Transformers are implemented via dense matrix operations, making them more efficient on modern hardware than sparse message passing.
  • From this perspective, Transformers are GNNs that currently benefit from the 'hardware lottery' thanks to their dense, hardware-friendly implementation.
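The equivalence sketched in these points can be made concrete: self-attention computed as dense matrix products gives the same result as explicit per-token message passing over a fully connected graph. Below is a minimal NumPy sketch (all function and variable names here are illustrative, not from the original post):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_dense(X, Wq, Wk, Wv):
    # Transformer view: one dense matmul chain over all tokens at once.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # n x n attention weights
    return A @ V

def attention_message_passing(X, Wq, Wk, Wv):
    # GNN view: each token i aggregates messages from every token j
    # over a fully connected graph, weighted by attention scores.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    n, d = K.shape
    out = np.zeros_like(V)
    for i in range(n):
        scores = np.array([Q[i] @ K[j] / np.sqrt(d) for j in range(n)])
        w = softmax(scores)
        out[i] = sum(w[j] * V[j] for j in range(n))
    return out

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
assert np.allclose(attention_dense(X, Wq, Wk, Wv),
                   attention_message_passing(X, Wq, Wk, Wv))
```

The two functions compute identical outputs; the dense version simply maps far better onto GPU/TPU matrix units, which is the 'hardware lottery' point above.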