Transformer

The neural network architecture underlying modern LLMs and most other frontier AI systems. Transformers process input by converting it to tokens, then using attention mechanisms to determine which parts of the input are relevant to each other. This architecture enables models to handle long-range dependencies in text, such as understanding that a pronoun in one sentence refers to a noun several paragraphs earlier. "Transformer-based" in vendor materials signals a model with LLM-like capabilities.
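The attention step described above can be sketched minimally as scaled dot-product attention, the core operation of the Transformer. This is an illustrative toy (NumPy, random vectors standing in for token embeddings; the function names are ours), not a full Transformer layer:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scores measure how relevant each token is to each other token,
    # scaled by sqrt(d) to keep values in a stable range.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    # Output is a weighted mix of value vectors: every token's new
    # representation blends information from every other token,
    # which is what lets the model link distant words.
    return weights @ V

# Toy example: 3 tokens, 4-dimensional embeddings.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = attention(Q, K, V)
print(out.shape)  # one contextualized vector per token
```

Because the attention weights span the whole input, a token late in the text can draw directly on a token many positions earlier, which is how the long-range dependencies mentioned above are handled.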

See: Architecture; Attention mechanism; LLM