Literature Collection and Sharing Platform

林海onrush (2022-08-07 22:47):

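#paper arXiv:2207.03530v1 [cs.RO], 7 Jul 2022. VMAS: A Vectorized Multi-Agent Simulator for Collective Robot Learning, https://deepai.org/publication/vmas-a-vectorized-multi-agent-simulator-for-collective-robot-learning

The University of Cambridge proposes VMAS, a framework for multi-agent reinforcement learning. Although many multi-robot coordination problems can be solved optimally by exact algorithms, those solutions often do not scale with the number of robots. Multi-agent reinforcement learning (MARL) is attracting growing attention in the robotics community as a promising way to tackle such problems, yet tools that can quickly and efficiently find solutions to large-scale collective learning tasks are still lacking. This work introduces VMAS, an open-source framework designed for efficient MARL benchmarking. It consists of a vectorized 2D physics engine written in PyTorch and a set of twelve challenging multi-robot scenarios; additional scenarios can be implemented through a simple modular interface.

The paper shows how vectorization enables parallel simulation on accelerated hardware without added complexity. In a comparison with OpenAI MPE, the current leading framework, VMAS executes 30,000 parallel simulations in under 10 seconds, more than 100 times faster than MPE. Using VMAS's RLlib interface, the authors benchmark the multi-robot scenarios with various MARL algorithms based on Proximal Policy Optimization (PPO); the scenarios prove challenging in orthogonal ways for state-of-the-art MARL algorithms. The VMAS framework is available, and the results can be reproduced, at https://github.com/proroklab/VectorizedMultiAgentSimulator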

arXiv, 2022. DOI: 10.48550/arXiv.2207.03530

VMAS: A Vectorized Multi-Agent Simulator for Collective Robot Learning

Matteo Bettini, Ryan Kortvelesy, Jan Blumenkamp, Amanda Prorok

Abstract:

While many multi-robot coordination problems can be solved optimally by exact algorithms, solutions are often not scalable in the number of robots. Multi-Agent Reinforcement Learning (MARL) is gaining increasing attention in the robotics community as a promising solution to tackle such problems. Nevertheless, we still lack the tools that allow us to quickly and efficiently find solutions to large-scale collective learning tasks. In this work, we introduce the Vectorized Multi-Agent Simulator (VMAS). VMAS is an open-source framework designed for efficient MARL benchmarking. It is comprised of a vectorized 2D physics engine written in PyTorch and a set of twelve challenging multi-robot scenarios. Additional scenarios can be implemented through a simple and modular interface. We demonstrate how vectorization enables parallel simulation on accelerated hardware without added complexity. When comparing VMAS to OpenAI MPE, we show how MPE's execution time increases linearly in the number of simulations while VMAS is able to execute 30,000 parallel simulations in under 10s, proving more than 100x faster. Using VMAS's RLlib interface, we benchmark our multi-robot scenarios using various Proximal Policy Optimization (PPO)-based MARL algorithms. VMAS's scenarios prove challenging in orthogonal ways for state-of-the-art MARL algorithms. The VMAS framework is available at https://github.com/proroklab/VectorizedMultiAgentSimulator. A video of VMAS scenarios and experiments is also available.
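
The vectorization idea in the abstract can be illustrated with a minimal sketch. It is not VMAS's physics engine or API: the function names, the drag term, and the constants are all hypothetical, and it uses NumPy arrays rather than PyTorch tensors to stay dependency-light (the same batched pattern carries over to tensors on accelerated hardware). The point is only that one array operation over a leading "environment" dimension replaces a Python loop whose cost grows linearly with the number of simulations.

```python
import numpy as np


def step_batched(pos, vel, force, mass=1.0, dt=0.1, drag=0.25):
    """Advance all agents in all environments by one Euler step at once.

    pos, vel, force have shape (num_envs, num_agents, 2).
    The dynamics (linear drag, point masses) are an illustrative assumption.
    """
    vel = vel * (1.0 - drag) + (force / mass) * dt
    pos = pos + vel * dt
    return pos, vel


def step_looped(pos, vel, force, mass=1.0, dt=0.1, drag=0.25):
    """Same update, one environment per loop iteration (linear in num_envs)."""
    new_pos = np.empty_like(pos)
    new_vel = np.empty_like(vel)
    for e in range(pos.shape[0]):
        new_vel[e] = vel[e] * (1.0 - drag) + (force[e] / mass) * dt
        new_pos[e] = pos[e] + new_vel[e] * dt
    return new_pos, new_vel


# Hypothetical batch: 1000 environments, 4 agents each, 2D positions.
rng = np.random.default_rng(0)
num_envs, num_agents = 1000, 4
pos = rng.uniform(-1.0, 1.0, size=(num_envs, num_agents, 2))
vel = np.zeros_like(pos)
force = rng.uniform(-1.0, 1.0, size=pos.shape)

pos_b, vel_b = step_batched(pos, vel, force)
pos_l, vel_l = step_looped(pos, vel, force)

# Both versions compute the same states; only the batched one avoids the loop.
results_match = bool(np.allclose(pos_b, pos_l) and np.allclose(vel_b, vel_l))
```

Because the batched version never indexes individual environments, adding more simulations only changes the array shape, not the code. That is the sense in which the abstract says vectorization enables parallel simulation "without added complexity".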
