Papers from the journal arXiv.
152 paper shares found in total; this page shows items 1 - 20.
1.
Vincent (2025-11-30 21:07):
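#paper https://arxiv.org/abs/2104.09864 arXiv. 2021. RoFormer: Enhanced Transformer with Rotary Position Embedding. This paper proposes RoFormer, which enhances the Transformer with Rotary Position Embedding (RoPE). A conventional Transformer adds an absolute or relative position vector to the token representation; RoPE instead applies a position-dependent rotation to the query and key, so that relative position emerges naturally in the dot product of self-attention. The method is mathematically elegant and lightweight to implement, models long-range dependencies better, and is fully compatible with efficient variants such as linear attention. Experiments show that RoFormer significantly outperforms conventional position encodings on several long-text tasks, giving stronger representations at no extra training cost, which points to broad potential in larger language models and complex sequence tasks.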
arXiv, 2021-04-20T09:54:06Z. DOI: 10.48550/arXiv.2104.09864
Abstract:
Position encoding recently has shown effective in the transformer architecture. It enables valuable supervision for dependency modeling between elements at different positions of the sequence. In this paper, we first investigate various methods to integrate positional information into the learning process of transformer-based language models. Then, we propose a novel method named Rotary Position Embedding (RoPE) to effectively leverage the positional information. Specifically, the proposed RoPE encodes the absolute position with a rotation matrix and meanwhile incorporates the explicit relative position dependency in self-attention formulation. Notably, RoPE enables valuable properties, including the flexibility of sequence length, decaying inter-token dependency with increasing relative distances, and the capability of equipping the linear self-attention with relative position encoding. Finally, we evaluate the enhanced transformer with rotary position embedding, also called RoFormer, on various long text classification benchmark datasets. Our experiments show that it consistently overcomes its alternatives. Furthermore, we provide a theoretical analysis to explain some experimental results. RoFormer is already integrated into Huggingface: https://huggingface.co/docs/transformers/model_doc/roformer.
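The rotation idea in the RoFormer entry can be checked numerically. The following is a minimal NumPy sketch, not the reference implementation: the helper name `rope`, the pairing of consecutive features, and the base of 10000 are assumptions chosen for illustration. It rotates a query and a key by angles proportional to their positions and compares attention scores at two position pairs that share the same offset.

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Rotate consecutive feature pairs of x by the angles pos * theta_i.

    Hypothetical helper for illustration; x is a 1-D vector of even length.
    """
    d = x.shape[-1]
    assert d % 2 == 0, "feature dimension must be even"
    theta = base ** (-np.arange(0, d, 2) / d)  # one frequency per feature pair
    angle = pos * theta
    cos, sin = np.cos(angle), np.sin(angle)
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin  # 2-D rotation applied to each pair
    out[1::2] = x1 * sin + x2 * cos
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=8)
k = rng.normal(size=8)

# Same relative offset (3) at two different absolute positions.
score_a = rope(q, 5) @ rope(k, 2)
score_b = rope(q, 103) @ rope(k, 100)
print(np.isclose(score_a, score_b))
```

Because each pair is rotated by a plain 2-D rotation, the dot product of a rotated query and a rotated key depends only on the difference of their positions, and the vector norms are unchanged.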
2.
符毓 (2025-11-28 00:14):
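#paper doi: 10.48550/arXiv.2511.21366, 2025, Hybrid Control for Robotic Nut Tightening Task. The proposed robotic nut-tightening system has two parts: (1) a motion-primitive-based planning framework that operates in task space, and (2) a hybrid controller that uses sensed interaction forces to execute the contact-rich portions of the planned trajectory more efficiently. Experimental evaluation shows that the system completes the task 14.5% faster than the baseline system and, because the contact forces applied to the manipulator are two orders of magnitude smaller than the baseline's, is also safer and more efficient. Both the planning and control components are computationally cheap; compared with the simulation software running them, their CPU consumption is negligible. The system is highly robust to variation in the initial configuration, and the paper points out directions for further improvement. One current robustness bottleneck is the retraction motion primitive in the planning framework; tighter coupling between planning and control would mitigate it.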
arXiv, 2025/11/26. DOI: 10.48550/arXiv.2511.21366
Abstract:
An autonomous robotic nut tightening system for a serial manipulator equipped with a parallel gripper is proposed. The system features a hierarchical motion-primitive-based planner and a control-switching scheme that alternates between force and position control. Extensive simulations demonstrate the system's robustness to variance in initial conditions. Additionally, the proposed controller tightens threaded screws 14% faster than the baseline while applying 40 times less contact force on manipulands. For the benefit of the research community, the system's implementation is open-sourced.
3.
刘昊辰 (2025-11-01 14:44):
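#paper Generating Creative Chess Puzzles. In October 2025 Google DeepMind proposed a method for generating creative chess puzzles. It first benchmarks several generative AI architectures (such as autoregressive Transformers and latent diffusion models), then introduces a reinforcement learning (RL) framework built on chess engine search statistics, with reward functions designed to improve a puzzle's uniqueness, counter-intuitiveness, diversity, and realism. The RL method raises the rate of counter-intuitive puzzles by 10x, from 0.22% under supervised learning to 2.5%, surpassing existing datasets (2.1%) and the best Lichess-trained model (0.4%). The generated puzzles meet novelty and diversity benchmarks while retaining aesthetic themes; human experts rated them as more creative, enjoyable, and counter-intuitive than book puzzles; and the resulting curated puzzle booklet was acknowledged by three world-renowned experts. Download: https://arxiv.org/pdf/2510.23881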
arXiv, 2025-10-27T21:43:39Z. DOI: 10.48550/arXiv.2510.23881
Abstract:
While Generative AI rapidly advances in various domains, generating truly creative, aesthetic, and counter-intuitive outputs remains a challenge. This paper presents an approach to tackle these difficulties in the domain of chess puzzles. We start by benchmarking Generative AI architectures, and then introduce an RL framework with novel rewards based on chess engine search statistics to overcome some of those shortcomings. The rewards are designed to enhance a puzzle's uniqueness, counter-intuitiveness, diversity, and realism. Our RL approach dramatically increases counter-intuitive puzzle generation by 10x, from 0.22% (supervised) to 2.5%, surpassing existing dataset rates (2.1%) and the best Lichess-trained model (0.4%). Our puzzles meet novelty and diversity benchmarks, retain aesthetic themes, and are rated by human experts as more creative, enjoyable, and counter-intuitive than composed book puzzles, even approaching classic compositions. Our final outcome is a curated booklet of these AI-generated puzzles, which is acknowledged for creativity by three world-renowned experts.
4.
林海onrush (2025-10-31 23:18):
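#paper, PALQO: Physics-informed Model for Accelerating Large-scale Quantum Optimization, DOI: 10.48550/arXiv.2509.20733. This paper proposes PALQO, a new method based on physics-informed neural networks (PINNs) for accelerating the training of large-scale variational quantum algorithms (VQAs). The authors recast the VQA parameter-update process as a nonlinear partial differential equation (PDE) problem and use a PINN to learn the optimization dynamics on a classical computer; only a small amount of quantum measurement data is needed to predict subsequent parameter updates, which greatly reduces calls to the quantum device. Theoretical analysis shows that PALQO generalizes well, with the number of training samples required growing polynomially in the number of parameters. In experiments on the transverse-field Ising model, the Heisenberg model, and molecular systems (such as LiH and BeH₂), PALQO cuts quantum measurement overhead by about 90% and achieves up to a 30x speedup while maintaining energy accuracy (error of about $10^{-3}$). The method scales well in many-body systems and quantum chemistry calculations, offering a new approach to large-scale quantum optimization under limited quantum resources.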
arXiv, 2025-09-25T04:26:02Z. DOI: 10.48550/arXiv.2509.20733
Abstract:
Variational quantum algorithms (VQAs) are leading strategies to reach practical utilities of near-term quantum devices. However, the no-cloning theorem in quantum mechanics precludes standard backpropagation, leading to prohibitive quantum resource costs when applying VQAs to large-scale tasks. To address this challenge, we reformulate the training dynamics of VQAs as a nonlinear partial differential equation and propose a novel protocol that leverages physics-informed neural networks (PINNs) to model this dynamical system efficiently. Given a small amount of training trajectory data collected from quantum devices, our protocol predicts the parameter updates of VQAs over multiple iterations on the classical side, dramatically reducing quantum resource costs. Through systematic numerical experiments, we demonstrate that our method achieves up to a 30x speedup compared to conventional methods and reduces quantum resource costs by as much as 90% for tasks involving up to 40 qubits, including ground state preparation of different quantum systems, while maintaining competitive accuracy. Our approach complements existing techniques aimed at improving the efficiency of VQAs and further strengthens their potential for practical applications.
5.
符毓 (2025-10-31 22:50):
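#paper doi: 10.48550/arXiv.2510.10903, 2025, Towards a Unified Understanding of Robot Manipulation: A Comprehensive Survey. A panoramic survey covering the whole field of robot manipulation. Drawing on more than 1,000 references, it systematically maps the field, covering hardware and control foundations, tasks and data, high-level and low-level control frameworks, and cross-embodiment and cross-modal generalization, and it proposes a unified framework for understanding how robots move from "executing tasks" to "understanding and learning tasks".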
arXiv, 2025-10-13T01:59:27Z. DOI: 10.48550/arXiv.2510.10903
Abstract:
Embodied intelligence has witnessed remarkable progress in recent years, driven by advances in computer vision, natural language processing, and the rise of large-scale multimodal models. Among its core challenges, robot manipulation stands out as a fundamental yet intricate problem, requiring the seamless integration of perception, planning, and control to enable interaction within diverse and unstructured environments. This survey presents a comprehensive overview of robotic manipulation, encompassing foundational background, task-organized benchmarks and datasets, and a unified taxonomy of existing methods. We extend the classical division between high-level planning and low-level control by broadening high-level planning to include language, code, motion, affordance, and 3D representations, while introducing a new taxonomy of low-level learning-based control grounded in training paradigms such as input modeling, latent learning, and policy learning. Furthermore, we provide the first dedicated taxonomy of key bottlenecks, focusing on data collection, utilization, and generalization, and conclude with an extensive review of real-world applications. Compared with prior surveys, our work offers both a broader scope and deeper insight, serving as an accessible roadmap for newcomers and a structured reference for experienced researchers. All related resources, including research papers, open-source datasets, and projects, are curated for the community at https://github.com/BaiShuanghao/Awesome-Robotics-Manipulation.
6.
Vincent (2025-10-31 16:28):
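#paper https://doi.org/10.48550/arXiv.2510.14901 arXiv. 2025. Reasoning with Sampling: Your Base Model is Smarter Than You Think. Large language models (LLMs) combined with reinforcement learning (RL) have shown strong reasoning ability across many domains, and prior work has mostly examined how RL gives a base model capabilities it did not originally have. This paper takes a different route and asks a thought-provoking question: can sampling alone, with no additional training, bring out reasoning ability in a base model comparable to that of an RL-trained policy? Using the model's own likelihoods, the paper proposes a simple iterative sampling method based on Markov chain Monte Carlo (MCMC). Experiments show that across several base models the method matches or even outperforms RL algorithms. More importantly, it avoids the loss of diversity common in RL and needs no extra data or training, which suggests broad applicability across domains.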
arXiv, 2025-10-16T17:18:11Z. DOI: 10.48550/arXiv.2510.14901
Abstract:
Frontier reasoning models have exhibited incredible capabilities across a wide array of disciplines, driven by posttraining large language models (LLMs) with reinforcement learning (RL). However, despite the widespread success of this paradigm, much of the literature has been devoted to disentangling truly novel behaviors that emerge during RL but are not present in the base models. In our work, we approach this question from a different angle, instead asking whether comparable reasoning capabilities can be elicited from base models at inference time by pure sampling, without any additional training. Inspired by Markov chain Monte Carlo (MCMC) techniques for sampling from sharpened distributions, we propose a simple iterative sampling algorithm leveraging the base models' own likelihoods. Over different base models, we show that our algorithm offers substantial boosts in reasoning that nearly match and even outperform those from RL on a wide variety of single-shot tasks, including MATH500, HumanEval, and GPQA. Moreover, our sampler avoids the collapse in diversity over multiple samples that is characteristic of RL-posttraining. Crucially, our method does not require training, curated datasets, or a verifier, suggesting broad applicability beyond easily verifiable domains.
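To make "sampling from a sharpened distribution with the model's own likelihoods" concrete, here is a toy sketch under stated assumptions. It is not the paper's algorithm: the "base model" is a hypothetical 3-token Markov chain (`START`, `TRANS`), the target is the power distribution p(x)^alpha, and the proposal resamples a random suffix from the base model inside a standard Metropolis-Hastings loop. With that proposal, the acceptance ratio reduces to the suffix likelihood ratio raised to the power alpha - 1.

```python
import math
import random

# Hypothetical toy "base model": a first-order Markov chain over 3 tokens.
START = [0.5, 0.3, 0.2]
TRANS = [[0.7, 0.2, 0.1],
         [0.1, 0.6, 0.3],
         [0.3, 0.3, 0.4]]

def next_dist(prefix):
    """Next-token distribution of the toy model given a prefix."""
    return START if not prefix else TRANS[prefix[-1]]

def sample_suffix(prefix, length, rng):
    """Extend prefix to the full length by sampling from the base model."""
    seq = list(prefix)
    while len(seq) < length:
        seq.append(rng.choices(range(3), weights=next_dist(seq))[0])
    return seq

def suffix_logp(seq, start):
    """Base-model log-likelihood of seq[start:] given seq[:start]."""
    return sum(math.log(next_dist(seq[:t])[seq[t]])
               for t in range(start, len(seq)))

def accept_prob(logp_new, logp_old, alpha):
    """Metropolis-Hastings acceptance for target p**alpha, proposal p."""
    d = (alpha - 1.0) * (logp_new - logp_old)
    return 1.0 if d >= 0 else math.exp(d)

def power_sample(length, alpha, steps, rng):
    x = sample_suffix([], length, rng)
    for _ in range(steps):
        t = rng.randrange(length)                     # position to resample from
        proposal = sample_suffix(x[:t], length, rng)  # new suffix from the base model
        if rng.random() < accept_prob(suffix_logp(proposal, t),
                                      suffix_logp(x, t), alpha):
            x = proposal
    return x

sample = power_sample(length=8, alpha=4.0, steps=200, rng=random.Random(0))
print(sample)
```

With alpha = 1 every proposal is accepted and the chain simply samples from the base model; larger alpha makes the chain prefer suffixes the base model itself considers more likely, without any training step.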
7.
刘昊辰 (2025-10-27 14:21):
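#paper Strongly Solving 2048 4×3. Researchers at the University of Tokyo, Japan, have strongly solved the 4×3 variant of 2048 (2048₄ₓ₃). The key technique is to partition the state space by "age", defined as the sum of all tile numbers on the board. A state and the afterstate that follows an action have the same age, and moving from an afterstate to a new state raises the age by 2 or 4 because a new tile (2 or 4) is added; states can therefore be enumerated stage by stage with bounded memory use. Elias-Fano encoding is used to store states compactly, compressing a raw storage requirement of about 4.4 TiB down to 1.4 TiB (the storage dedicated to optimal play needs only 300 GiB). The results show that the expected score of the optimal strategy from the most common initial state (two 2 tiles, age 4) is 50724.26 and that the numbers of reachable states and afterstates are 1.15×10¹² and 7.40×10¹¹ respectively; they also confirm player intuitions such as "difficulty rises sharply when creating large tiles (such as 2048)". Download: https://arxiv.org/pdf/2510.04580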
arXiv, 2025-10-06T08:31:59Z. DOI: 10.48550/arXiv.2510.04580
Abstract:
2048 is a stochastic single-player game involving 16 cells on a 4 by 4 grid, where a player chooses a direction among up, down, left, and right to obtain a score by merging two tiles with the same number located in neighboring cells along the chosen direction. This paper presents that a variant 2048-4x3 12 cells on a 4 by 3 board, one row smaller than the original, has been strongly solved. In this variant, the expected score achieved by an optimal strategy is about $50724.26$ for the most common initial states: ones with two tiles of number 2. The numbers of reachable states and afterstates are identified to be $1,152,817,492,752$ and $739,648,886,170$, respectively. The key technique is to partition state space by the sum of tile numbers on a board, which we call the age of a state. An age is invariant between a state and its successive afterstate after any valid action and is increased two or four by stochastic response from the environment. Therefore, we can partition state space by ages and enumerate all (after)states of an age depending only on states with the recent ages. Similarly, we can identify (after)state values by going along with ages in decreasing order.
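The age invariant described in this entry is easy to verify on a small board. The sketch below is an illustration only: the board layout (3 rows of 4 cells), the helper names, and the example position are assumptions, and only the left move is implemented. A move merges equal neighbors, so the tile sum is unchanged; the random tile that follows adds 2 or 4.

```python
def slide_left(row):
    """Slide and merge one row to the left, as in 2048."""
    tiles = [v for v in row if v != 0]
    merged = []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)  # two equal tiles become one of double value
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged))

def move_left(board):
    """State -> afterstate for the action 'left'."""
    return [slide_left(row) for row in board]

def age(board):
    """Age of a state: the sum of all tile numbers on the board."""
    return sum(sum(row) for row in board)

def spawn(board, r, c, value):
    """Afterstate -> next state: the environment places a 2 or a 4 on an empty cell."""
    assert board[r][c] == 0 and value in (2, 4)
    new = [row[:] for row in board]
    new[r][c] = value
    return new

state = [[2, 2, 4, 0],
         [0, 4, 4, 8],
         [2, 0, 2, 2]]
afterstate = move_left(state)
next_state = spawn(afterstate, 2, 3, 4)

print(age(state), age(afterstate), age(next_state))  # 30 30 34
```

Since the age never decreases and grows only by 2 or 4 per turn, all states of a given age can be enumerated from states of the two nearest smaller ages, which is what allows processing the state space in stages.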
8.
符毓 (2025-09-30 23:42):
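#paper doi: 10.48550/arXiv.2509.13311, 2025, Towards General Agentic Intelligence via Environment Scaling. The main bottleneck in training this kind of "agentic intelligence" has been the lack of high-quality, large-scale, diverse interaction data: human annotation is extremely expensive, while purely model-generated data is often unrealistic or hard to verify. This paper from Alibaba's Tongyi Lab proposes a new path: programmatically and automatically constructing massive, heterogeneous, verifiable simulated environments in which a language model can interact autonomously, gather experience, and learn. The AgentScaler model family trained this way reaches, with only billions of parameters, performance on several authoritative benchmarks comparable to trillion-parameter models or closed-source commercial systems, opening new possibilities for efficient, lightweight agentic intelligence.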
arXiv, 2025-09-16T17:57:20Z. DOI: 10.48550/arXiv.2509.13311
Abstract:
Advanced agentic intelligence is a prerequisite for deploying Large Language Models in practical, real-world applications. Diverse real-world APIs demand precise, robust function-calling intelligence, which needs agents to develop these capabilities through interaction in varied environments. The breadth of function-calling competence is closely tied to the diversity of environments in which agents are trained. In this work, we scale up environments as a step towards advancing general agentic intelligence. This gives rise to two central challenges: (i) how to scale environments in a principled manner, and (ii) how to effectively train agentic capabilities from experiences derived through interactions with these environments. To address these, we design a scalable framework that automatically constructs heterogeneous environments that are fully simulated, systematically broadening the space of function-calling scenarios. We further adapt a two-phase agent fine-tuning strategy: first endowing agents with fundamental agentic capabilities, then specializing them for domain-specific contexts. Extensive experiments on agentic benchmarks, tau-bench, tau2-Bench, and ACEBench, demonstrate that our trained model, AgentScaler, significantly enhances the function-calling capability of models.
9.
尹志 (2025-09-30 22:39):
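#paper Quantum computing and artificial intelligence: status and perspectives. doi: 10.48550/arXiv.2505.23860. A fairly recent survey of quantum AI (QAI). It covers Quantum for AI and AI for Quantum in some detail, along with foundational questions, and closes with the challenges the field currently faces. Two features are worth noting. First, it really is new, and it touches on essentially all the basic QAI problems of the moment. Second, it is a QAI survey written by an all-European team of researchers, and the paper states its own position right at the start, which is rather thought-provoking.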
arXiv, 2025-05-29T08:15:23Z. DOI: 10.48550/arXiv.2505.23860
Abstract:
This white paper discusses and explores the various points of intersection between quantum computing and artificial intelligence (AI). It describes how quantum computing could support the development of innovative AI solutions. It also examines use cases of classical AI that can empower research and development in quantum technologies, with a focus on quantum computing and quantum sensing. The purpose of this white paper is to provide a long-term research agenda aimed at addressing foundational questions about how AI and quantum computing interact and benefit one another. It concludes with a set of recommendations and challenges, including how to orchestrate the proposed theoretical work, align quantum AI developments with quantum hardware roadmaps, estimate both classical and quantum resources - especially with the goal of mitigating and optimizing energy consumption - advance this emerging hybrid software engineering discipline, and enhance European industrial competitiveness while considering societal implications.
10.
刘昊辰 (2025-09-08 15:13):
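#paper Gemini 2.5 Pro Capable of Winning Gold at IMO 2025. The research team built a self-verification pipeline (with steps for initial solving, self-improvement, and verification and error correction) and optimized the prompt design; with it, Google's Gemini 2.5 Pro model solved 5 of the 6 problems of the 2025 International Mathematical Olympiad (IMO 2025). To avoid data contamination, only the newly released IMO 2025 problems were used as the test set. The study also compared solving with hints (such as mathematical induction or analytic geometry) against solving without hints and found that hints mainly improve efficiency rather than create new capability; it also notes that the model failed on Problem 6 because of a wrong assumption. The work shows that a strong LLM combined with a sound strategy can reach high-level mathematical reasoning, close to human gold-medal level. Download: https://arxiv.org/pdf/2507.15855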
arXiv, 2025-07-21T17:59:49Z. DOI: 10.48550/arXiv.2507.15855
Abstract:
The International Mathematical Olympiad (IMO) poses uniquely challenging problems requiring deep insight, creativity, and formal reasoning. While Large Language Models (LLMs) perform well on mathematical benchmarks like AIME, they struggle with Olympiad-level tasks. We use Google's Gemini 2.5 Pro on the newly released IMO 2025 problems, avoiding data contamination. Using a self-verification pipeline with careful prompt design, 5 (out of 6) problems are solved correctly. This result underscores the importance of developing optimal strategies to harness the full potential of powerful LLMs for complex reasoning tasks.
11.
符毓 (2025-08-31 23:27):
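#paper doi: 10.48550/arXiv.2507.21046, 2025, A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence. This survey gives the first systematic and comprehensive review of self-evolving agents, organized around three basic dimensions: what to evolve, when to evolve, and how to evolve. Large language models (LLMs) remain fundamentally static, unable to adjust their internal parameters to new tasks, evolving knowledge domains, or dynamic interaction environments. As LLMs are increasingly deployed in open-ended, interactive environments, this static nature has become a key bottleneck. The paper examines evolution mechanisms across agent components (e.g., models, memory, tools, architecture), categorizes adaptation methods by stage (e.g., intra-test-time, inter-test-time), and analyzes the algorithmic and architectural designs that guide evolutionary adaptation (e.g., scalar rewards, textual feedback, single-agent and multi-agent systems).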
arXiv, 2025/8/1.
Abstract:
Large Language Models (LLMs) have demonstrated strong capabilities but remain fundamentally static, unable to adapt their internal parameters to novel tasks, evolving knowledge domains, or dynamic interaction contexts. As LLMs are increasingly deployed in open-ended, interactive environments, this static nature has become a critical bottleneck, necessitating agents that can adaptively reason, act, and evolve in real time. This paradigm shift -- from scaling static models to developing self-evolving agents -- has sparked growing interest in architectures and methods enabling continual learning and adaptation from data, interactions, and experiences. This survey provides the first systematic and comprehensive review of self-evolving agents, organized around three foundational dimensions -- what to evolve, when to evolve, and how to evolve. We examine evolutionary mechanisms across agent components (e.g., models, memory, tools, architecture), categorize adaptation methods by stages (e.g., intra-test-time, inter-test-time), and analyze the algorithmic and architectural designs that guide evolutionary adaptation (e.g., scalar rewards, textual feedback, single-agent and multi-agent systems). Additionally, we analyze evaluation metrics and benchmarks tailored for self-evolving agents, highlight applications in domains such as coding, education, and healthcare, and identify critical challenges and research directions in safety, scalability, and co-evolutionary dynamics. By providing a structured framework for understanding and designing self-evolving agents, this survey establishes a roadmap for advancing adaptive agentic systems in both research and real-world deployments, ultimately shedding lights to pave the way for the realization of Artificial Super Intelligence (ASI), where agents evolve autonomously, performing at or beyond human-level intelligence across a wide array of tasks.
12.
尹志 (2025-08-31 12:56):
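#paper doi:10.48550/arXiv.2505.13683, ISCA, 2025, Genesis: A Compiler Framework for Hamiltonian Simulation on Hybrid CV-DV Quantum Computers. The authors introduce the first quantum compilation framework for Hamiltonian simulation on hybrid continuous-variable/discrete-variable quantum computing systems, a very interesting piece of work. The framework is split into an initial decomposition of the Hamiltonian followed by mapping and routing, and it is validated on several common physical models. Quantum compilation is an important part of a quantum computer and deserves more attention and technical breakthroughs.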
arXiv, 2025-05-19T19:32:06Z. DOI: 10.48550/arXiv.2505.13683
Abstract:
This paper introduces Genesis, the first compiler designed to support Hamiltonian Simulation on hybrid continuous-variable (CV) and discrete-variable (DV) quantum computing systems. Genesis is a two-level compilation system. At the first level, it decomposes an input Hamiltonian into basis gates using the native instruction set of the target hybrid CV-DV quantum computer. At the second level, it tackles the mapping and routing of qumodes/qubits to implement long-range interactions for the gates decomposed from the first level. Rather than a typical implementation that relies on SWAP primitives similar to qubit-based (or DV-only) systems, we propose an integrated design of connectivity-aware gate synthesis and beamsplitter SWAP insertion tailored for hybrid CV-DV systems. We also introduce an OpenQASM-like domain-specific language (DSL) named CVDV-QASM to represent Hamiltonian in terms of Pauli-exponentials and basic gate sequences from the hybrid CV-DV gate set. Genesis has successfully compiled several important Hamiltonians, including the Bose-Hubbard model, $\mathbb{Z}_2$-Higgs model, Hubbard-Holstein model, Heisenberg model and Electron-vibration coupling Hamiltonians, which are critical in domains like quantum field theory, condensed matter physics, and quantum chemistry. Our implementation is available at Genesis-CVDV-Compiler (https://github.com/ruadapt/Genesis-CVDV-Compiler).
13.
刘昊辰 (2025-08-19 13:25):
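#paper Search-contempt: a hybrid MCTS algorithm for training AlphaZero-like engines with better computational efficiency. The paper proposes search-contempt, a hybrid MCTS algorithm combining PUCT with Thompson Sampling (TS). A new parameter, Nscl, controls the distribution of positions generated in self-play, favoring "challenging" positions. In regular chess, the training games it generates are of higher quality, raising engine strength by about 70 Elo, while the number of training games needed falls from tens of millions to hundreds of thousands and the compute cost from tens of millions of US dollars to tens of thousands. In Odds Chess (one side starts at a disadvantage), strength rises by about 150 Elo and the system's adversarial robustness improves, which may make training from scratch feasible on consumer GPUs. Download: https://arxiv.org/pdf/2504.07757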
14.
尹志 (2025-07-31 23:59):
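#paper doi: 10.48550/arXiv.2507.06216 Unitary designs in nearly optimal depth. The paper designs a new kind of quantum circuit that efficiently constructs unitary k-designs at close to the theoretically optimal depth. If the scheme proves effective enough, it will undoubtedly be a great help in designing later quantum algorithms.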
arXiv, 2025-07-08T17:48:33Z. DOI: 10.48550/arXiv.2507.06216
Abstract:
We construct $\varepsilon$-approximate unitary $k$-designs on $n$ qubits in circuit depth $O(\log k \log \log n k / \varepsilon)$. The depth is exponentially improved over all known results in all three parameters $n$, $k$, $\varepsilon$. We further show that each dependence is optimal up to exponentially smaller factors. Our construction uses $\tilde{O}(nk)$ ancilla qubits and $O(nk)$ bits of randomness, which are also optimal up to $\log(nk)$ factors. An alternative construction achieves a smaller ancilla count $\tilde{O}(n)$ with circuit depth $O(k \log \log nk/\varepsilon)$. To achieve these efficient unitary designs, we introduce a highly-structured random unitary ensemble that leverages long-range two-qubit gates and low-depth implementations of random classical hash functions. We also develop a new analytical framework for bounding errors in quantum experiments involving many queries to random unitaries. As an illustration of this framework's versatility, we provide a succinct alternative proof of the existence of pseudorandom unitaries.
15.
林海onrush (2025-07-31 23:19):
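#paper, "Efficient Qudit Circuit for Quench Dynamics of 2+1D Quantum Link Electrodynamics", 10.48550/arXiv.2507.12589. This work proposes an efficient quantum circuit framework based on multi-level quantum units (qudits) for simulating the quench dynamics of 2+1D U(1) lattice gauge electrodynamics. By using Gauss's law to integrate out the matter fields and keeping only the gauge degrees of freedom, the authors build a compact circuit design that needs no ancillary qubits, and numerical simulations confirm that it still reproduces the dynamics with high coherence under realistic noise. The method greatly reduces quantum resource consumption, applies to arbitrary spin representations and higher-dimensional lattice systems, and scales well. Compared with conventional qubit encodings, the qudit implementation is closer to the hardware and suits current and near-term quantum processors, offering a practical quantum-computing route to simulating nonequilibrium phenomena in high-energy physics.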
arXiv, 2025-07-16T19:16:49Z. DOI: 10.48550/arXiv.2507.12589
Abstract:
A major challenge in the burgeoning field of quantum simulation for high-energy physics is the realization of scalable $2+1$D lattice gauge theories on state-of-the-art quantum hardware, which is an essential step towards the overarching goal of probing $3+1$D quantum chromodynamics on a quantum computer. Despite great progress, current experimental implementations of $2+1$D lattice gauge theories are mostly restricted to relatively small system sizes and two-level representations of the gauge and electric fields. Here, we propose a resource-efficient method for quantum simulating $2+1$D spin-$S$ $\mathrm{U}(1)$ quantum link lattice gauge theories with dynamical matter using qudit-based quantum processors. By integrating out the matter fields through Gauss's law, we reformulate the quantum link model in a purely spin picture compatible with qudit encoding across arbitrary spatial dimensions, eliminating the need for ancillary qubits and reducing resource overhead. Focusing first on the spin-$1/2$ case, we construct explicit circuits for the full Hamiltonian and demonstrate through numerical simulations that the first-order Trotterized circuits accurately capture the quench dynamics even in the presence of realistic noise levels. Additionally, we introduce a general method for constructing coupling-term circuits for higher-spin representations $S>1/2$. Compared to conventional qubit encodings, our framework significantly reduces the number of quantum resources and gate count. Our approach significantly enhances scalability and fidelity for probing nonequilibrium phenomena in higher-dimensional lattice gauge theories, and is readily amenable to implementation on state-of-the-art qudit platforms.
16.
刘昊辰 (2025-07-09 14:59):
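#paper Rapfi: Distilling Efficient Neural Network for the Game of Gomoku. This paper presents Rapfi, an efficient Gomoku agent that outperforms CNN-based agents in limited-computation environments. Rapfi uses a compact neural network with a pattern-based codebook distilled from CNNs, together with an incremental update scheme that minimizes computation when the input changes only slightly. The new network uses orders of magnitude less computation to reach accuracy similar to much larger networks such as ResNet. Thanks to the incremental update scheme, depth-first search methods such as alpha-beta search can be accelerated significantly. With carefully tuned evaluation and search, and under limited computational resources without accelerators such as GPUs, Rapfi surpasses Katagomo, the strongest open-source Gomoku AI based on the AlphaZero algorithm. Rapfi ranked first among 520 Gomoku agents on Botzone and won the GomoCup 2024 championship. Download: https://arxiv.org/pdf/2503.13178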
arXiv, 2025-03-17T13:53:57Z. DOI: 10.48550/arXiv.2503.13178
Abstract:
Games have played a pivotal role in advancing artificial intelligence, with AI agents using sophisticated techniques to compete. Despite the success of neural network based game AIs, their performance often requires significant computational resources. In this paper, we present Rapfi, an efficient Gomoku agent that outperforms CNN-based agents in limited computation environments. Rapfi leverages a compact neural network with a pattern-based codebook distilled from CNNs, and an incremental update scheme that minimizes computation when input changes are minor. This new network uses computation that is orders of magnitude less to reach a similar accuracy of much larger neural networks such as Resnet. Thanks to our incremental update scheme, depth-first search methods such as the alpha-beta search can be significantly accelerated. With a carefully tuned evaluation and search, Rapfi reached strength surpassing Katagomo, the strongest open-source Gomoku AI based on AlphaZero's algorithm, under limited computational resources where accelerators like GPUs are absent. Rapfi ranked first among 520 Gomoku agents on Botzone and won the championship in GomoCup 2024.
17.
尹志 (2025-06-30 23:17):
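#paper arXiv:2411.09131; Artificial Intelligence for Quantum Computing; 2024. A survey led by the prominent researcher Yuri that analyzes and looks ahead at several ways AI is used for quantum computing. It is not especially detailed, but if you hope to see quantum computing demonstrate an advantage on practical problems sooner, you should not miss it.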
arXiv, 2024-11-14T02:11:16Z. DOI: 10.48550/arXiv.2411.09131
Abstract:
Artificial intelligence (AI) advancements over the past few years have had an unprecedented and revolutionary impact across everyday application areas. Its significance also extends to technical challenges within science and engineering, including the nascent field of quantum computing (QC). The counterintuitive nature and high-dimensional mathematics of QC make it a prime candidate for AI's data-driven learning capabilities, and in fact, many of QC's biggest scaling challenges may ultimately rest on developments in AI. However, bringing leading techniques from AI to QC requires drawing on disparate expertise from arguably two of the most advanced and esoteric areas of computer science. Here we aim to encourage this cross-pollination by reviewing how state-of-the-art AI techniques are already advancing challenges across the hardware and software stack needed to develop useful QC - from device design to applications. We then close by examining its future opportunities and obstacles in this space.
18.
刘馨云 (2025-06-30 20:34):
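#paper arXiv:2505.03729; Visual Imitation Enables Contextual Humanoid Control; UC Berkeley, 2025; link: https://videomimic.net. VIDEOMIMIC is a humanoid robot control method that learns context-aware skills from real-world videos. The paper proposes a real-to-sim-to-real training pipeline that, for the first time, trains and deploys a general control policy able to climb and descend stairs, sit down, stand up, and cross obstacles from everyday videos alone, with no task labels, no reward functions, and no MoCap. Core contributions: (1) It is the first to extract 4D human-scene geometry from monocular everyday videos for robot control learning: it jointly reconstructs human motion and environment geometry (mesh) and uses a human-height prior to resolve scale ambiguity, producing environment and motion data usable in physics simulation. (2) It designs a multi-stage RL policy training pipeline that goes from video to a general policy: pretraining on MoCap data; adding a height map as environment input for terrain awareness; and using DAgger distillation to remove the dependence on target joint angles, training a single policy that performs multiple tasks such as sitting, standing, and stair climbing. (3) The learned policy runs on a real robot using only the robot's own state and a LiDAR height map: deployed on a Unitree G1, it acts across a variety of indoor and outdoor stairs, grass, and chair scenes, and in unseen environments, without task labels, "terrain + direction" naturally triggers the corresponding behavior. Compared with baseline methods, VIDEOMIMIC substantially improves reconstruction accuracy and generalization: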
arXiv, 2025-05-06T17:57:12Z. DOI: 10.48550/arXiv.2505.03729
Abstract:
How can we teach humanoids to climb staircases and sit on chairs using the surrounding environment context? Arguably, the simplest way is to just show them - casually capture a human motion video and feed it to humanoids. We introduce VIDEOMIMIC, a real-to-sim-to-real pipeline that mines everyday videos, jointly reconstructs the humans and the environment, and produces whole-body control policies for humanoid robots that perform the corresponding skills. We demonstrate the results of our pipeline on real humanoid robots, showing robust, repeatable contextual control such as staircase ascents and descents, sitting and standing from chairs and benches, as well as other dynamic whole-body skills - all from a single policy, conditioned on the environment and global root commands. VIDEOMIMIC offers a scalable path towards teaching humanoids to operate in diverse real-world environments.
19.
林海onrush (2025-06-07 13:27):
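#paper, Token-Importance Guided Direct Preference Optimization, DOI: https://arxiv.org/abs/2505.19653. Sharing my latest work on fine-tuning algorithms for large models: we propose TI-DPO, a new method for better aligning large language models (LLMs) with human preferences. The widely used DPO (Direct Preference Optimization) dispenses with an explicit reward model and optimizes the model directly on human preference data, but it ignores the fact that different tokens (words/characters) matter to different degrees in the generated content. As a result, the model can err on critical tokens and produce outputs that conflict with human values. TI-DPO addresses this with two innovations: 1. At the token level, it introduces token-importance weights based on gradient attribution, which dynamically identify and prioritize the tokens most critical to human preference; 2. It adds a contrastive-learning-based triplet loss that not only separates "good" from "bad" samples but also brings in an "intermediate" output, making the optimization finer-grained and helping the model generate content closer to human expectations and farther from undesirable responses. Experiments show that TI-DPO performs well on multiple tasks (such as TruthfulQA and IFEval), exceeding DPO and other alignment methods in both accuracy and generative diversity. Ablation studies further confirm that the token-importance mechanism and the triplet loss are both necessary and effective. Theoretical analysis also proves that TI-DPO has a tighter lower bound on the loss, and training is more stable. By focusing closely on critical tokens and combining that with a triplet alignment structure, TI-DPO improves the alignment and output quality of large models and offers a new approach to AI safety and helpfulness in human-AI interaction.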
arXiv, 2025-05-26T08:11:24Z. DOI: 10.48550/arXiv.2505.19653
Abstract:
Ensuring that large language models (LLMs) generate outputs aligned with human preferences is important for safe and effective AI interactions. While Direct Preference Optimization (DPO) employs an implicit reward function to optimize the policy model, however, it and its related variants overlook the differential importance of individual tokens and are sensitive to judgment noise in preference datasets during generation. Although recent methods attempt to assess the important weight of tokens via probability prediction or simplistic weighting schemes, these evaluation methods are prone to biases and still cannot fully address these issues. To solve this problem, we propose the Token-Importance Guided Direct Preference Optimization (TI-DPO), which introduces two key innovations: the gradient-based token-importance weights that dynamically prioritize critical tokens, and a triple loss that explicitly guides model outputs to approach human-preferred responses and stay away from non-preferred responses. Experimental results show that TI-DPO achieves higher accuracy and stronger generative diversity, providing more stable and computationally efficient solutions compared with DPO and other RLHF methods.
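To make the two ingredients named in this entry more concrete, here is a rough sketch of a DPO-style loss in which each token's log-ratio is scaled by an importance weight, together with a generic triplet margin term. It is not the paper's formulation: the weights are taken as given (the gradient-attribution step is not shown), and every function name, the beta value, and the toy numbers are assumptions made for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_log_ratio(policy_logps, ref_logps, weights):
    """Sum over tokens of w_t * (log pi(y_t) - log pi_ref(y_t))."""
    return sum(w * (p - r) for w, p, r in zip(weights, policy_logps, ref_logps))


def token_weighted_dpo_loss(chosen, rejected, beta=0.1):
    """DPO-style loss with per-token weights.

    chosen and rejected are (policy_logps, ref_logps, weights) triples.
    With all weights equal to 1 this reduces to the standard DPO loss.
    """
    margin = beta * (weighted_log_ratio(*chosen) - weighted_log_ratio(*rejected))
    return softplus(-margin)  # equals -log(sigmoid(margin))


def triplet_margin_loss(d_pos, d_neg, margin=1.0):
    """Generic triplet hinge: pull toward the positive, push from the negative."""
    return max(0.0, d_pos - d_neg + margin)


# Toy per-token log-probabilities and hypothetical importance weights.
chosen = ([-1.0, -0.5], [-1.2, -1.0], [0.5, 1.5])
rejected = ([-2.0, -1.0], [-1.5, -1.0], [1.0, 1.0])
loss = token_weighted_dpo_loss(chosen, rejected, beta=0.1)
```

In this toy case the policy already favors the chosen response on the heavily weighted token, so the loss falls below log 2, the value obtained when the policy and the reference model agree everywhere.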
20.
刘昊辰 (2025-06-03 16:34):
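#paper AlphaZero-Edu: Making AlphaZero Accessible to Everyone. AlphaZero-Edu is a lightweight reinforcement learning framework built on the mathematical framework of AlphaZero and designed for educational settings and Gomoku. It features a modular architecture (decoupling Monte Carlo tree search, self-play training, and the policy-value network), resource-efficient training (it runs on a single NVIDIA RTX 3090 GPU), and highly parallel self-play data generation (a 3.2x speedup with 8 processes). Its state features are a 21-layer tensor (the current state plus 20 layers of history), its output consists of a policy probability distribution and a scalar value estimate, and rotation/flip data augmentation is used to improve generalization. Training uses a cyclic learning rate scheduler, with which both the policy loss and the value loss converge, and in matches against 4 human players the framework reached a win rate of up to 100% and at least 60% (including 20% draws). The framework is open source and provides an accessible benchmark for academic research and industrial applications. Download: https://arxiv.org/pdf/2504.14636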
arXiv, 2025-04-20T14:29:39Z. DOI: 10.48550/arXiv.2504.14636
Abstract:
Recent years have witnessed significant progress in reinforcement learning, especially with Zero-like paradigms, which have greatly boosted the generalization and reasoning abilities of large-scale language models. Nevertheless, existing frameworks are often plagued by high implementation complexity and poor reproducibility. To tackle these challenges, we present AlphaZero-Edu, a lightweight, education-focused implementation built upon the mathematical framework of AlphaZero. It boasts a modular architecture that disentangles key components, enabling transparent visualization of the algorithmic processes. Additionally, it is optimized for resource-efficient training on a single NVIDIA RTX 3090 GPU and features highly parallelized self-play data generation, achieving a 3.2-fold speedup with 8 processes. In Gomoku matches, the framework has demonstrated exceptional performance, achieving a consistently high win rate against human opponents. AlphaZero-Edu has been open-sourced at https://github.com/StarLight1212/AlphaZero_Edu, providing an accessible and practical benchmark for both academic research and industrial applications.
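The post for this entry mentions rotation/flip data augmentation. On a square Gomoku board this usually means generating the 8 symmetric copies of each training sample and transforming the policy target in the same way. The sketch below illustrates that in plain Python; it is not taken from the AlphaZero-Edu repository, and the function names, the grid-shaped policy, and the 3x3 toy board are assumptions.

```python
def rotate90(grid):
    """Rotate a square 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip_horizontal(grid):
    """Mirror a 2D list left to right."""
    return [row[::-1] for row in grid]


def symmetries(planes, policy):
    """Return the 8 rotated/flipped copies of one training sample.

    planes: list of N x N grids (the feature layers of the state).
    policy: N x N grid of move probabilities, transformed identically
            so that each probability stays attached to its board cell.
    """
    samples = []
    for flip in (False, True):
        cur_planes = [flip_horizontal(g) if flip else [list(r) for r in g]
                      for g in planes]
        cur_policy = flip_horizontal(policy) if flip else [list(r) for r in policy]
        for _ in range(4):
            samples.append((cur_planes, cur_policy))
            cur_planes = [rotate90(g) for g in cur_planes]
            cur_policy = rotate90(cur_policy)
    return samples


# Toy 3x3 example with a single feature plane (the post describes 21 planes).
board = [[1, 2, 0], [0, 0, 0], [0, 0, 0]]
policy = [[0.7, 0.3, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
augmented = symmetries([board], policy)
```

The scalar value target needs no transformation, since rotating or mirroring the board does not change who is winning.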