刘昊辰 (2025-08-19 13:25):
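#paper Search-contempt: a hybrid MCTS algorithm for training AlphaZero-like engines with better computational efficiency. The paper proposes search-contempt, a hybrid MCTS algorithm that combines PUCT with Thompson Sampling (TS) and uses a new parameter, Nscl, to control the distribution of games generated in self-play, favoring "challenging" positions. In standard chess, the training games it generates are of higher quality and raise engine strength by about 70 Elo, while the number of games needed for training drops from tens of millions to hundreds of thousands and the compute cost from tens of millions of dollars to tens of thousands of dollars. In Odds Chess (where one side starts at a disadvantage), strength improves by about 150 Elo and the system's adversarial robustness also improves. The approach may make training from scratch feasible on consumer-grade GPUs. Download: https://arxiv.org/pdf/2504.07757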
arXiv, 10 Apr 2025.
Search-contempt: a hybrid MCTS algorithm for training AlphaZero-like engines with better computational efficiency
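To make the "PUCT plus sampling, switched by Nscl" idea in the post concrete, here is a minimal sketch. The post does not say how the two selection rules are combined, so the switching rule below is an assumption: a node uses PUCT until its total visit count reaches `n_scl`, then picks moves at random in proportion to visit counts, which stands in for the Thompson Sampling step. `Child`, `puct_select`, `sample_by_visits`, `hybrid_select`, and the default `c_puct` are hypothetical names and values, not taken from the paper.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Child:
    prior: float            # policy prior P(s, a)
    visits: int = 0         # visit count N(s, a)
    value_sum: float = 0.0  # sum of backed-up values

    @property
    def q(self) -> float:
        # Mean value; 0.0 for an unvisited child.
        return self.value_sum / self.visits if self.visits else 0.0


def puct_select(children: dict[str, Child], c_puct: float = 1.5) -> str:
    """PUCT: argmax of Q + c * P * sqrt(N_parent) / (1 + N_child).

    The parent count is floored at 1 so priors still break ties
    at a node that has not been visited yet.
    """
    n_parent = max(sum(ch.visits for ch in children.values()), 1)

    def score(ch: Child) -> float:
        return ch.q + c_puct * ch.prior * math.sqrt(n_parent) / (1 + ch.visits)

    return max(children, key=lambda move: score(children[move]))


def sample_by_visits(children: dict[str, Child], rng: random.Random) -> str:
    """Pick a move at random, weighted by visit count."""
    moves = list(children)
    weights = [children[m].visits for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]


def hybrid_select(children: dict[str, Child], n_scl: int,
                  rng: random.Random, c_puct: float = 1.5) -> str:
    """Assumed hybrid rule: PUCT below n_scl visits, sampling at or above it."""
    total = sum(ch.visits for ch in children.values())
    if total == 0 or total < n_scl:
        return puct_select(children, c_puct)
    return sample_by_visits(children, rng)


children = {
    "e4": Child(prior=0.6, visits=10, value_sum=5.0),
    "d4": Child(prior=0.4, visits=0),
}
rng = random.Random(0)
below_threshold = hybrid_select(children, n_scl=100, rng=rng)  # PUCT branch
above_threshold = hybrid_select(children, n_scl=5, rng=rng)    # sampling branch
```

In this toy node, the PUCT branch explores the unvisited move, while the sampling branch can only return the move that already has visits. A fuller version would also have to decide which nodes the rule applies to and whether the sampling distribution is frozen once the threshold is reached; the post does not cover either point.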