Literature shared by user Kunji.
1 paper share found.
1. Kunji (2025-02-28 23:59):
#paper, https://arxiv.org/pdf/2410.05273, HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers. VLA models depend on VLMs with billions of parameters; while this gives them strong generalization, the high computational cost and slow inference limit their use in dynamic tasks. To address these limitations, the paper proposes HiRT (Hierarchical Robot Transformer framework), which draws on the dual-process theory of human cognition and adopts a dual-system architecture with an asynchronous operation mechanism to balance control frequency against performance. Experiments in both simulated and real-world settings show significant improvements: on static tasks HiRT doubles the control frequency while achieving comparable success rates, and on real-world dynamic manipulation tasks that previous VLA models struggled with, it raises the success rate from 48% to 75%.
arXiv, 2024-09-12.
DOI: 10.48550/arXiv.2410.05273
Jianke Zhang, Yanjiang Guo, Xiaoyu Chen, Yen-Jen Wang, Yucheng Hu, Chengming Shi, Jianyu Chen
Abstract:
Large Vision-Language-Action (VLA) models, leveraging powerful pre-trained Vision-Language Models (VLMs) backends, have shown promise in robotic control due to their impressive generalization ability. However, the success comes at a cost. Their reliance on VLM backends with billions of parameters leads to high computational costs and inference latency, limiting the testing scenarios to mainly quasi-static tasks and hindering performance in dynamic tasks requiring rapid interactions. To address these limitations, this paper proposes HiRT, a Hierarchical Robot Transformer framework that enables flexible frequency and performance trade-off. HiRT keeps VLMs running at low frequencies to capture temporarily invariant features while enabling real-time interaction through a high-frequency vision-based policy guided by the slowly updated features. Experiment results in both simulation and real-world settings demonstrate significant improvements over baseline methods. Empirically, in static tasks, we double the control frequency and achieve comparable success rates. Additionally, on novel real-world dynamic manipulation tasks which are challenging for previous VLA models, HiRT improves the success rate from 48% to 75%.
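The slow-fast hierarchy described in the abstract can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: the function names (slow_vlm, fast_policy, control_loop), the stand-in computations inside them, and the refresh period k are all assumptions; in HiRT the slow and fast paths run asynchronously, whereas this sketch approximates that with a periodic refresh inside one loop.

```python
def slow_vlm(image, instruction):
    # Stand-in for a billion-parameter VLM: produces a latent conditioning
    # vector from the current observation and the language instruction.
    return [float(len(instruction)), float(sum(image))]

def fast_policy(image, latent):
    # Stand-in for the lightweight high-frequency visuomotor policy,
    # conditioned on the most recently updated slow latent.
    return [latent[0] + sum(image), latent[1]]

def control_loop(images, instruction, k=5):
    """Run the fast policy at every control step; refresh the slow
    VLM latent only every k steps (low frequency)."""
    actions = []
    latent = None
    for t, img in enumerate(images):
        if t % k == 0:
            latent = slow_vlm(img, instruction)   # slow path, low frequency
        actions.append(fast_policy(img, latent))  # fast path, every step
    return actions
```

The point of the pattern is that the expensive call runs 1/k as often as the cheap one, so overall control frequency is set by the fast policy rather than the VLM.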