From Pattern Recognition to Reasoning: A Survey for Transformer Generalization and Recall in Algorithmic Tasks
by D. Verșebeniuc
supervised by A. Härmä
Abstract
This paper surveys recent advances in improving the generalization of Transformer models on algorithmic tasks, marking a shift from pattern recognition to algorithmic reasoning. We introduce a taxonomy that distinguishes length generalization (the ability to extend a fixed computational procedure to longer inputs) from compositional generalization, in which models dynamically assemble learned subroutines to solve more complex problems. Our review covers key techniques that improve Transformer performance on arithmetic tasks such as integer addition: data formatting strategies (reversed formats, index hints, random space augmentation, and zero-padding) and positional encoding methods (e.g., absolute, additive relative, position coupling, and randomized encodings). Finally, we outline the open challenges and highlight possible directions for future research.
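To make the data formatting strategies concrete, the following minimal sketch renders an integer addition sample in plain, reversed, and zero-padded form. The helper name `format_addition` and its parameters are hypothetical and chosen for illustration only. The sketch assumes one common variant in which only the answer is reversed, so that the least significant digit is generated first; other variants reverse the operands as well.

```python
def format_addition(a: int, b: int, reverse: bool = False, pad_to: int = 0) -> str:
    """Render 'a+b=c' as a training string (hypothetical helper, illustration only)."""
    c = a + b
    # Zero-padding (assumed convention): operands are padded to `pad_to` digits
    # and the sum to `pad_to + 1` digits, so every sample has the same layout.
    sa = str(a).zfill(pad_to)
    sb = str(b).zfill(pad_to)
    sc = str(c).zfill(pad_to + 1 if pad_to else 0)
    # Reversed format (assumed variant): emit the answer least significant
    # digit first, matching the order in which carries are resolved.
    if reverse:
        sc = sc[::-1]
    return f"{sa}+{sb}={sc}"


print(format_addition(57, 68))                          # 57+68=125
print(format_addition(57, 68, reverse=True))            # 57+68=521
print(format_addition(57, 68, pad_to=4))                # 0057+0068=00125
print(format_addition(57, 68, reverse=True, pad_to=4))  # 0057+0068=52100
```

Reversing the answer lets an autoregressive model produce each output digit after the carry it depends on is already determined, while zero-padding keeps corresponding digits at fixed offsets across samples.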