刘昊辰 (2024-10-12 10:09):
#paper arXiv:2409.12272v1 [cs.LG] 18 Sep 2024, Mastering Chess with a Transformer Model. This is a research paper on applying transformer models to chess. The paper shows that a transformer's effectiveness at chess depends heavily on the choice of position encoding in the attention mechanism. Based on this observation, the authors adopt the general position encoding scheme of Shaw et al., train models at scale with this technique and other enhancements, and call the resulting architecture ChessFormer. This architecture significantly outperforms prior work in both playing strength and puzzle-solving ability, at a fraction of the computational cost. Download: https://arxiv.org/pdf/2409.12272
arXiv, 2024-09-18T19:05:21Z. DOI: 10.48550/arXiv.2409.12272
Mastering Chess with a Transformer Model
Abstract:
Transformer models have demonstrated impressive capabilities when trained at scale, excelling at difficult cognitive tasks requiring complex reasoning and rational decision-making. In this paper, we explore the application of transformer models to chess, focusing on the critical role of the position encoding within the attention mechanism. We show that in chess, transformers endowed with a sufficiently versatile position encoding can match existing chess-playing models at a fraction of the computational cost. Our architecture significantly outperforms AlphaZero at 8x fewer FLOPS and matches prior grandmaster-level transformer-based agents at 30x fewer FLOPS.
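For readers curious what "the position encoding scheme of Shaw et al." refers to: it replaces absolute position signals with learned embeddings of the clipped relative offset between tokens, injected into the attention keys and values. Below is a minimal NumPy sketch of that mechanism, not the paper's code; the single attention head, the 1-D offset over 64 flattened squares, the clipping distance k, and all dimensions are illustrative assumptions (the paper generalizes the scheme to the board geometry).

import numpy as np

def relative_attention(x, Wq, Wk, Wv, rel_k, rel_v, k=8):
    """Single-head attention with learned relative position embeddings.

    x:         (n, d) input token embeddings (e.g., n=64 board squares)
    Wq/Wk/Wv:  (d, d) projection matrices
    rel_k:     (2k+1, d) relative embeddings added to the key side
    rel_v:     (2k+1, d) relative embeddings added to the value side
    """
    n, d = x.shape
    q, key, v = x @ Wq, x @ Wk, x @ Wv

    # Clipped relative distance index for every (i, j) pair.
    idx = np.arange(n)[:, None] - np.arange(n)[None, :]   # (n, n)
    idx = np.clip(idx, -k, k) + k                         # map to [0, 2k]

    # Logits: content-content term plus content-position term.
    logits = q @ key.T + np.einsum('id,ijd->ij', q, rel_k[idx])
    logits /= np.sqrt(d)

    # Numerically stable softmax over each row.
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)

    # Output: weighted values plus weighted relative-value embeddings.
    return w @ v + np.einsum('ij,ijd->id', w, rel_v[idx])

# Toy usage: 64 "squares", embedding dim 16.
rng = np.random.default_rng(0)
n, d, k = 64, 16, 8
out = relative_attention(
    rng.normal(size=(n, d)),
    *(rng.normal(size=(d, d)) * 0.1 for _ in range(3)),
    rng.normal(size=(2 * k + 1, d)) * 0.1,
    rng.normal(size=(2 * k + 1, d)) * 0.1,
)
print(out.shape)  # (64, 16)

One design point worth noting: clipping the offset keeps the number of learned relative embeddings at 2k+1 regardless of sequence length, and the paper's claim is that choosing a sufficiently versatile encoding of this kind, rather than raw model scale, is what drives the efficiency gains over AlphaZero.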