Literature shared by user 林海onrush.
31 shared papers found in total; this page shows items 21 - 31.
21.
林海onrush
(2023-03-31 23:17):
#paper, BloombergGPT: A Large Language Model for Finance, doi:10.48550/arXiv.2303.17564. The AI wave set off by ChatGPT has spread to the financial world: Bloomberg has released BloombergGPT, a large language model (LLM) built specifically for finance. According to the report Bloomberg published on March 30, the team assembled the largest domain-specific dataset to date and trained a 50-billion-parameter LLM specialized for the financial domain. Drawing on Bloomberg's extensive financial data sources, the model was trained on a 363-billion-token dataset and supports a wide range of tasks across the financial industry. It far outperforms existing models on financial tasks while remaining competitive in general-purpose settings. In the reported tests, BloombergGPT performs best on four of five tasks (ConvFinQA, FiQA SA, FPB, and Headline) and ranks second on NER (Named Entity Recognition), giving it a clear edge.
arXiv,
2023.
DOI: 10.48550/arXiv.2303.17564
Abstract:
>>>
The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general purpose datasets. We validate BloombergGPT on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally, we explain our modeling choices, training process, and evaluation methodology. As a next step, we plan to release training logs (Chronicles) detailing our experience in training BloombergGPT.
<<<
22.
林海onrush
(2023-02-28 21:45):
#paper, doi: https://doi.org/10.1038/s41586-023-05859-2, Continuous Symmetry Breaking in a Two-dimensional Rydberg Array. Spontaneous symmetry breaking underlies the classification of phases of matter and their associated transitions. The nature of the broken symmetry determines many of a phase's qualitative properties, as the contrast between discrete and continuous symmetry breaking illustrates: unlike the discrete case, breaking a continuous symmetry produces gapless Goldstone modes, which, for instance, govern the thermodynamic stability of the ordered phase. Using a programmable Rydberg quantum simulator, the authors realize a two-dimensional dipolar XY model and demonstrate adiabatic preparation of correlated low-temperature states of both the XY ferromagnet and the XY antiferromagnet. In the ferromagnetic case they characterize the presence of long-range XY order. The work contributes to the many-body physics of XY interactions, complementing recent experiments that used the Rydberg-blockade mechanism to realize Ising-type interactions with discrete spin-rotation symmetry. The paper was recently accepted by Nature, attesting to the rigor and novelty of the work.
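For orientation, a minimal sketch of the dipolar XY Hamiltonian realized in such arrays (notation assumed here rather than copied from the paper; the sign of J selects the ferromagnet or antiferromagnet):
>>>
% Dipolar XY model on a 2D array: spin exchange decaying as 1/r^3
H = \frac{J}{2} \sum_{i \neq j} \frac{a^{3}}{r_{ij}^{3}}
    \left( \sigma_{i}^{x}\sigma_{j}^{x} + \sigma_{i}^{y}\sigma_{j}^{y} \right)
<<<
Here r_{ij} is the distance between sites i and j, a is the lattice spacing, and \sigma^{x,y} are Pauli matrices; the slow 1/r^3 tail of the dipolar interaction is what permits long-range XY order in two dimensions.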
Abstract:
>>>
Spontaneous symmetry breaking underlies much of our classification of phases of matter and their associated transitions. The nature of the underlying symmetry being broken determines many of the qualitative properties of the phase; this is illustrated by the case of discrete versus continuous symmetry breaking. Indeed, in contrast to the discrete case, the breaking of a continuous symmetry leads to the emergence of gapless Goldstone modes controlling, for instance, the thermodynamic stability of the ordered phase. Here, we realize a two-dimensional dipolar XY model that shows a continuous spin-rotational symmetry using a programmable Rydberg quantum simulator. We demonstrate the adiabatic preparation of correlated low-temperature states of both the XY ferromagnet and the XY antiferromagnet. In the ferromagnetic case, we characterize the presence of a long-range XY order, a feature prohibited in the absence of long-range dipolar interaction. Our exploration of the many-body physics of XY interactions complements recent works using the Rydberg-blockade mechanism to realize Ising-type interactions showing discrete spin rotation symmetry.
<<<
23.
林海onrush
(2023-01-31 22:08):
#paper https://www.nature.com/articles/s42256-022-00569-2, Deep transfer operator learning for partial differential equations under conditional shift ("transfer learning to 'solve' PDEs: deep transfer operator learning for PDEs under conditional shift"). Researchers from Brown University and Johns Hopkins University (JHU) propose a new transfer learning framework for task-specific learning under conditional shift (functional regression in partial differential equations) based on the deep operator network (DeepONet). They demonstrate the advantages of the method across a variety of transfer learning scenarios involving nonlinear PDEs under different conditions arising from changes in geometric domain and model dynamics. Despite considerable differences between source and target domains, the proposed framework learns heterogeneous tasks quickly and efficiently. The study was published in Nature Machine Intelligence.
Deep learning has been successfully applied to emulate computationally expensive physical processes described by PDEs, achieving excellent performance and thereby accelerating tasks such as uncertainty quantification, risk modeling, and design optimization. However, the predictive performance of such models is usually limited by the availability of labeled training data, and in many cases collecting a large enough labeled dataset is computationally intractable. Moreover, isolated learning, i.e., training a separate predictive model for each distinct but related task, can be very expensive. To address this bottleneck, knowledge can be shared between related domains within a framework known as transfer learning (TL): information from a model trained on a domain with sufficient labeled data (the source) is transferred to a different but closely related domain (the target) for which only a small amount of training data is available. Given the lack of TL methods for task-specific operator learning and uncertainty quantification, the researchers propose a new framework for efficient TL under conditional shift using neural operators.
In this work the researchers adopt the more general deep operator network (DeepONet), which makes it possible to learn the full operator and thus perform real-time prediction for arbitrary new inputs and complex domains. Importantly, the proposed transfer learning framework can identify PDE operators in domains with very limited labeled data. The main contributions can be summarized as follows (a sketch of the hybrid loss follows the list):
A new framework is proposed for transfer learning problems under conditional shift with deep neural operators.
The proposed framework enables fast and efficient task-specific PDE learning and uncertainty quantification.
Principles from RKHS theory and conditional embedding operators are used to construct a new hybrid loss function for fine-tuning the target model.
The advantages and limitations of the proposed framework are demonstrated on a variety of transfer learning problems, including distribution shifts caused by changes in domain geometry, model dynamics, material properties, and nonlinearity.
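To make the third contribution concrete, here is a minimal Python sketch of such a hybrid loss (assumed interfaces, not the authors' code): a pointwise regression term on the few labelled target samples plus an RKHS distance (a Gaussian-kernel MMD, standing in for the conditional-embedding statistical distance) between labelled target outputs and the surrogate's predictions on unlabelled target inputs.
>>>
import torch

def gaussian_kernel(x, y, sigma=1.0):
    # x: (n, d), y: (m, d) sample matrices -> (n, m) RBF kernel matrix.
    d2 = torch.cdist(x, y) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    # Squared maximum mean discrepancy: distance between the two
    # empirical distributions after embedding into the kernel's RKHS.
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2 * gaussian_kernel(x, y, sigma).mean())

def hybrid_loss(model, u_lab, y_lab, s_lab, u_unlab, y_unlab, lam=0.1):
    # Pointwise matching on the few labelled target samples ...
    mse = torch.mean((model(u_lab, y_lab) - s_lab) ** 2)
    # ... plus a distributional term pulling the surrogate's predictions
    # on unlabelled target inputs toward the labelled target outputs.
    return mse + lam * mmd2(s_lab, model(u_unlab, y_unlab))
<<<
Fine-tuning only the task-specific layers of the target DeepONet with such a loss is what lets the framework match individual samples while preserving the global shape of the target conditional distribution.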
Abstract:
>>>
Transfer learning enables the transfer of knowledge gained while learning to perform one task (source) to a related but different task (target), hence addressing the expense of data acquisition and labelling, potential computational power limitations and dataset distribution mismatches. We propose a new transfer learning framework for task-specific learning (functional regression in partial differential equations) under conditional shift based on the deep operator network (DeepONet). Task-specific operator learning is accomplished by fine-tuning task-specific layers of the target DeepONet using a hybrid loss function that allows for the matching of individual target samples while also preserving the global properties of the conditional distribution of the target data. Inspired by conditional embedding operator theory, we minimize the statistical distance between labelled target data and the surrogate prediction on unlabelled target data by embedding conditional distributions onto a reproducing kernel Hilbert space. We demonstrate the advantages of our approach for various transfer learning scenarios involving nonlinear partial differential equations under diverse conditions due to shifts in the geometric domain and model dynamics. Our transfer learning framework enables fast and efficient learning of heterogeneous tasks despite considerable differences between the source and target domains.
<<<
24.
林海onrush
(2023-01-27 01:30):
#paper, Twist: Sound Reasoning for Purity and Entanglement in Quantum Programs, DOI: 10.48550/arXiv.2205.02287
The authors introduce the notion of expression purity as a basis for reasoning about entanglement in quantum programs. Qubits are handled much like pointers into classical memory and are manipulated by applying operations called gates. Entanglement, the peculiarly quantum phenomenon in which the measurement outcomes of qubits become correlated, can determine both the correctness of algorithms and the applicability of programming patterns. Formalizing purity gives a core tool for automatically reasoning about entanglement: a pure expression is one whose evaluation is unaffected by the measurement outcomes of qubits it does not own. The paper's main contribution is Twist, the first language with a type system for sound reasoning about purity, which lets developers identify pure expressions with type annotations. The authors show that Twist can express quantum algorithms, catch programming errors in them, and support programs that several other languages disallow, all with a runtime verification overhead below 3.5%. Overall, this is foundational and meaningful work; a toy illustration of the underlying entanglement check follows.
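Purity in Twist is a language-level property, but the entanglement question underneath has a small linear-algebra core. A hypothetical Python sketch (illustration only, not Twist): decide whether a two-qubit state factors into a product of single-qubit states by checking its Schmidt rank.
>>>
import numpy as np

def is_unentangled(state, tol=1e-9):
    # Reshape the 4-component state vector into a 2x2 matrix and look
    # at its singular values: exactly one nonzero singular value
    # (Schmidt rank 1) means |psi> = |a> (x) |b>, i.e. no entanglement.
    m = np.asarray(state, dtype=complex).reshape(2, 2)
    s = np.linalg.svd(m, compute_uv=False)
    return int(np.sum(s > tol)) == 1

print(is_unentangled([1, 0, 0, 0]))              # |00>: True
print(is_unentangled([2**-0.5, 0, 0, 2**-0.5]))  # Bell state: False
<<<
Twist's contribution is to establish such facts statically through types and purity assertions, falling back on runtime verification only where static analysis is insufficient.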
arXiv,
2022.
DOI: 10.48550/arXiv.2205.02287
Abstract:
>>>
Quantum programming languages enable developers to implement algorithms for quantum computers that promise computational breakthroughs in classically intractable tasks. Programming quantum computers requires awareness of entanglement, the phenomenon in which measurement outcomes of qubits are correlated. Entanglement can determine the correctness of algorithms and suitability of programming patterns. In this work, we formalize purity as a central tool for automating reasoning about entanglement in quantum programs. A pure expression is one whose evaluation is unaffected by the measurement outcomes of qubits that it does not own, implying freedom from entanglement with any other expression in the computation. We present Twist, the first language that features a type system for sound reasoning about purity. The type system enables the developer to identify pure expressions using type annotations. Twist also features purity assertion operators that state the absence of entanglement in the output of quantum gates. To soundly check these assertions, Twist uses a combination of static analysis and runtime verification. We evaluate Twist's type system and analyses on a benchmark suite of quantum programs in simulation, demonstrating that Twist can express quantum algorithms, catch programming errors in them, and support programs that several languages disallow, while incurring runtime verification overhead of less than 3.5%.
<<<
25.
林海onrush
(2022-12-31 23:26):
#paper, A Data-driven Sequential Localization Framework for Big Telco Data, IEEE Transactions on Knowledge and Data Engineering (2021), DOI: 10.1109/TKDE.2019.2961657
The rapid growth of telecommunication infrastructure has produced an enormous accumulation of measurement report (MR) data, generated by mobile objects and stored whenever they connect to data services. Geo-tagging, or localizing, such MR data is believed to have a profound effect on the optimization of telco and traffic networks. To handle the data-intensive workloads in the learning process, the Huawei Noah's Ark team uses materialized views for efficient online localization and lightweight indexing techniques for periodic parameter tuning, improving efficiency and scalability. Results on real data show that, compared with state-of-the-art solutions, this framework improves median localization error by 58.8%.
Highlights: the paper briefly introduces the hidden Markov model (HMM), which captures the link between two kinds of stochastic processes: an unobserved state-transition process and an observation process consisting of observable variables emitted from each hidden state. Several experiments validate the effectiveness of the machine-learned single-point localization model, the effectiveness of the emission- and transition-probability solutions, and the performance of the sequential localization system against the latest baselines; further experiments demonstrate the efficiency of the proposed indexing techniques and the effect of parameter tuning on system performance. The authors propose a data-driven framework for sequential localization of telco data, equipped with a comprehensive suite of machine learning and data management techniques. Compared with the latest sequential localization methods, the framework achieves a 58.8% improvement in median error, making it superior in accuracy and adoptability, and effective data access and indexing methods are proposed to support the data-intensive computations involved in learning. A Viterbi-style decoding sketch for the HMM formulation follows.
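A minimal Python sketch of Viterbi decoding for such an HMM formulation (names and arrays are hypothetical; in the paper the emission and transition probabilities are learned from MR and GPS data):
>>>
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    # log_init:  (S,)   log prior over S grid-cell states
    # log_trans: (S, S) log transition probabilities between cells
    # log_emit:  (T, S) log emission score of each MR record per cell
    T, S = log_emit.shape
    dp = log_init + log_emit[0]          # best log-prob ending in each state
    back = np.zeros((T, S), dtype=int)   # backpointers
    for t in range(1, T):
        cand = dp[:, None] + log_trans   # (previous state, current state)
        back[t] = cand.argmax(axis=0)
        dp = cand.max(axis=0) + log_emit[t]
    path = [int(dp.argmax())]            # best final state, then walk back
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]                    # most probable cell sequence
<<<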
IF: 8.900 (Q1)
IEEE Transactions on Knowledge and Data Engineering,
2019.
DOI: 10.1109/TKDE.2019.2961657
Abstract:
>>>
The proliferation of telco networks and mobile terminals brings the accumulation of tremendous amounts of measurement report (MR) data at a rapid pace. The MR data is generated by mobile objects while connecting to data services and is stored in backend data centers. To geo-tag or localize such MR data is believed to have a profound effect on the analytics and optimizations of telco and traffic networks. However, MR records are noisy and partial observations of mobile objects' geo-locations and hence pose challenges to accurate telco data localization. There have been quite a few attempts. Single-point localization methods map an MR record to a location, but come out with limited accuracies due to the ignorance of the spatiotemporal coherence of successive MR records. Recent efforts on sequential localization techniques alleviate this by mapping a sequence of MR records to a trajectory. However, existing solutions often rest on assumptions about specific models, e.g., mobility and signal strength distributions, or prior knowledge of the topology space, e.g., road networks, limiting deployment in practice. To this end, we propose a data-driven framework to tackle the challenges in sequential telco localization. We solely use raw MR records and a public third-party GPS dataset for the learning of the correlations between mobile objects' locations and MR records, requiring no model assumptions or prior knowledge. To handle the data-intensive workloads during the learning process, we use materialized views for efficient online localization and lightweight indexing techniques for periodic parameter tuning, in order to improve efficiency and scalability. Results on real data show that our solution achieves a 58.8 percent improvement in median localization errors compared with state-of-the-art sequential localization techniques that require hypothesis models and prior knowledge, making our solution superior in terms of effectiveness, efficiency, and employability.
<<<
26.
林海onrush
(2022-11-30 21:51):
#paper, https://doi.org/10.48550/arXiv.2211.16197, FJMP: Factorized Joint Multi-Agent Motion Prediction over Learned Directed Acyclic Interaction Graphs. Targeting trajectory prediction for autonomous driving, this work proposes FJMP, a factorized joint multi-agent motion prediction framework that learns directed acyclic interaction graphs. Future scene interaction dynamics are modeled as a sparse directed interaction graph whose edges denote explicit interactions between agents; the graph is pruned into a directed acyclic graph (DAG), and the joint prediction task is decomposed, according to the partial ordering of the DAG, into a sequence of marginal and conditional predictions whose joint future trajectories are decoded with a directed acyclic graph neural network (DAGNN). On the INTERACTION and Argoverse 2 datasets, FJMP is shown to produce more accurate and scene-consistent joint trajectory predictions than non-factorized approaches, and it achieves state of the art on the interactive multi-agent INTERACTION benchmark. A sketch of the DAG-ordered factorized decoding follows.
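A minimal Python sketch of the factorization idea (interfaces are hypothetical, not the FJMP code): predict agents in topological order over the pruned DAG, conditioning each agent's decoder on the already-predicted trajectories of its parents.
>>>
from graphlib import TopologicalSorter

def factorized_predict(dag_parents, decoders, scene_feats):
    # dag_parents: dict agent -> list of parent (influencer) agents;
    # graphlib expects exactly this node -> predecessors mapping.
    order = TopologicalSorter(dag_parents).static_order()
    trajs = {}
    for agent in order:
        parent_trajs = [trajs[p] for p in dag_parents.get(agent, [])]
        # Roots get a marginal prediction; the rest are conditional
        # on their parents, so the product forms a joint prediction.
        trajs[agent] = decoders[agent](scene_feats, parent_trajs)
    return trajs
<<<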
arXiv,
2022.
DOI: 10.48550/arXiv.2211.16197
Abstract:
>>>
Predicting the future motion of road agents is a critical task in an autonomous driving pipeline. In this work, we address the problem of generating a set of scene-level, or joint, future trajectory predictions in multi-agent driving scenarios. To this end, we propose FJMP, a Factorized Joint Motion Prediction framework for multi-agent interactive driving scenarios. FJMP models the future scene interaction dynamics as a sparse directed interaction graph, where edges denote explicit interactions between agents. We then prune the graph into a directed acyclic graph (DAG) and decompose the joint prediction task into a sequence of marginal and conditional predictions according to the partial ordering of the DAG, where joint future trajectories are decoded using a directed acyclic graph neural network (DAGNN). We conduct experiments on the INTERACTION and Argoverse 2 datasets and demonstrate that FJMP produces more accurate and scene-consistent joint trajectory predictions than non-factorized approaches, especially on the most interactive and kinematically interesting agents. FJMP ranks 1st on the multi-agent test leaderboard of the INTERACTION dataset.
<<<
27.
林海onrush
(2022-10-29 13:58):
#paper, Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning, url: https://arxiv.org/abs/1811.12808#
This paper reviews the techniques used for three tasks: model evaluation, model selection, and algorithm selection, and discusses the main strengths and weaknesses of each technique with reference to theoretical and empirical studies. It then gives recommendations to promote best practices in machine learning research and applications. A detailed analysis of the paper is in the PDF below; a nested cross-validation sketch follows.
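One of the paper's headline recommendations for small datasets is nested cross-validation for algorithm comparison. A minimal scikit-learn sketch (dataset and model chosen here purely for illustration):
>>>
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Inner loop: hyperparameter selection. Outer loop: near-unbiased
# estimate of the tuned algorithm's generalization performance.
inner = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_grid={"svc__C": [0.1, 1, 10]},
    cv=5,
)
outer_scores = cross_val_score(inner, X, y, cv=5)
print(outer_scores.mean(), outer_scores.std())
<<<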
arXiv,
2018.
DOI: 10.48550/arXiv.1811.12808
Abstract:
>>>
The correct use of model evaluation, model selection, and algorithm selection techniques is vital in academic machine learning research as well as in many industrial settings. This article reviews different techniques that can be used for each of these three subtasks and discusses the main advantages and disadvantages of each technique with references to theoretical and empirical studies. Further, recommendations are given to encourage best yet feasible practices in research and applications of machine learning. Common methods such as the holdout method for model evaluation and selection are covered, which are not recommended when working with small datasets. Different flavors of the bootstrap technique are introduced for estimating the uncertainty of performance estimates, as an alternative to confidence intervals via normal approximation if bootstrapping is computationally feasible. Common cross-validation techniques such as leave-one-out cross-validation and k-fold cross-validation are reviewed, the bias-variance trade-off for choosing k is discussed, and practical tips for the optimal choice of k are given based on empirical evidence. Different statistical tests for algorithm comparisons are presented, and strategies for dealing with multiple comparisons such as omnibus tests and multiple-comparison corrections are discussed. Finally, alternative methods for algorithm selection, such as the combined F-test 5x2 cross-validation and nested cross-validation, are recommended for comparing machine learning algorithms when datasets are small.
<<<
28.
林海onrush
(2022-10-29 13:51):
#paper, Formal Algorithms for Transformers, url: https://arxiv.org/pdf/2207.09238.pdf. Over the past five-plus years, transformers have shown astonishing results across many fields, yet descriptions of transformer algorithms have mostly relied on diagrams, prose, or explanations of particular optimizations; no paper had given reasonably complete algorithmic pseudocode. DeepMind here provides formal pseudocode for the transformer family. A detailed walkthrough is in the PDF below; a minimal attention sketch follows.
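As a taste of what the paper formalizes, a minimal Python sketch of single-head scaled dot-product attention (my notation, not the paper's exact pseudocode):
>>>
import numpy as np

def attention(Q, K, V):
    # Q: (n_q, d_k) queries, K: (n_k, d_k) keys, V: (n_k, d_v) values.
    # Each query returns a softmax-weighted mixture of the values.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # scaled similarities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # rows sum to one
    return weights @ V                             # (n_q, d_v)
<<<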
arXiv,
2022.
DOI: 10.48550/arXiv.2207.09238
Abstract:
>>>
This document aims to be a self-contained, mathematically precise overview of transformer architectures and algorithms (*not* results). It covers what transformers are, how they are trained, what they are used for, their key architectural components, and a preview of the most prominent models. The reader is assumed to be familiar with basic ML terminology and simpler neural network architectures such as MLPs.
<<<
29.
林海onrush
(2022-10-29 13:25):
#paper, Causal Discovery with Reinforcement Learning, paper: https://arxiv.org/pdf/1906.04477.pdf, official video: https://iclr.cc/virtual_2020/poster_S1g2skStPB.html
Causal research, a likely next hot topic, has attracted broad attention in the machine learning / deep learning community. A classic problem in the field is causal discovery: recovering the underlying causal graph structure from passively observed data.
This paper from Huawei's Noah's Ark Lab was accepted at ICLR 2020 with full marks from reviewers. The lab's causality team applies reinforcement learning to score-based causal discovery: an encoder-decoder network built on self-attention explores the relationships in the data, the acyclicity condition on causal structures is incorporated, and the network parameters are trained with a policy-gradient RL algorithm, ultimately yielding a causal graph. On data models commonly used in academia, the method outperforms alternatives on medium-sized graphs, including both traditional causal discovery algorithms and recent gradient-based ones. It is also very flexible and can be combined with an arbitrary score function; a sketch of the reward follows.
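A minimal Python sketch of the reward shape described above (score function and penalty weights are placeholders; the smooth acyclicity measure follows the NOTEARS characterization h(A) = tr(e^{A∘A}) − d, which is zero exactly on DAGs):
>>>
import numpy as np
from scipy.linalg import expm

def acyclicity(A):
    # h(A) = tr(exp(A * A)) - d is 0 iff the adjacency matrix A is acyclic.
    return np.trace(expm(A * A)) - A.shape[0]

def reward(A, score_fn, lam1=1.0, lam2=10.0):
    # Predefined graph score (e.g. BIC) minus two penalty terms that
    # push generated adjacency matrices toward acyclicity, mirroring
    # the paper's score-plus-acyclicity-penalty reward.
    h = acyclicity(A)
    return score_fn(A) - lam1 * float(h > 1e-8) - lam2 * h
<<<
The RL agent then searches over adjacency matrices for the best-rewarded graph rather than learning a reusable policy.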
arXiv,
2019.
DOI: 10.48550/arXiv.1906.04477
Abstract:
>>>
Discovering causal structure among a set of variables is a fundamental problem in many empirical sciences. Traditional score-based causal discovery methods rely on various local heuristics to search for a Directed Acyclic Graph (DAG) according to a predefined score function. While these methods, e.g., greedy equivalence search, may have attractive results with infinite samples and certain model assumptions, they are usually less satisfactory in practice due to finite data and possible violation of assumptions. Motivated by recent advances in neural combinatorial optimization, we propose to use Reinforcement Learning (RL) to search for the DAG with the best scoring. Our encoder-decoder model takes observable data as input and generates graph adjacency matrices that are used to compute rewards. The reward incorporates both the predefined score function and two penalty terms for enforcing acyclicity. In contrast with typical RL applications where the goal is to learn a policy, we use RL as a search strategy and our final output would be the graph, among all graphs generated during training, that achieves the best reward. We conduct experiments on both synthetic and real datasets, and show that the proposed approach not only has an improved search ability but also allows a flexible score function under the acyclicity constraint.
<<<
30.
林海onrush
(2022-09-30 22:25):
#paper arXiv, 2209.00796 (2022), Diffusion Models: A Comprehensive Survey of Methods and Applications. Diffusion models perform impressively in many domains, and because applications in different fields have produced different variants, this paper systematically surveys diffusion-model research across computer vision, NLP, waveform signal processing, multi-modal modeling, molecular graph modeling, time-series modeling, and adversarial purification. The main contributions: a new, systematic taxonomy that divides diffusion models into three classes (sampling-speed enhancement, maximum-likelihood enhancement, and data-generalization enhancement) and their applications into the seven areas above; and a comprehensive overview of modern diffusion models and their applications, presenting the main improvements of each variant, comparing against the original model where necessary, and summarizing the corresponding papers. The basic idea of a diffusion model is a forward diffusion process that systematically perturbs the data distribution and a learned reverse diffusion process that restores it, which yields a highly flexible and tractable generative model; a sketch of the forward process and training loss follows.
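A minimal Python sketch of the DDPM instantiation of that idea (schedule values and the model are placeholders): the closed-form forward noising q(x_t | x_0) and the noise-prediction training loss.
>>>
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # variance schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal level

def forward_noise(x0, t):
    # q(x_t | x0) = N(sqrt(abar_t) * x0, (1 - abar_t) I), sampled directly.
    noise = torch.randn_like(x0)
    ab = alphas_bar[t].view(-1, *([1] * (x0.dim() - 1)))
    return ab.sqrt() * x0 + (1 - ab).sqrt() * noise, noise

def simple_loss(model, x0):
    # Train the network to predict the injected noise at a random step.
    t = torch.randint(0, T, (x0.shape[0],))
    xt, noise = forward_noise(x0, t)
    return torch.mean((model(xt, t) - noise) ** 2)
<<<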
arXiv,
2022.
DOI: 10.48550/arXiv.2209.00796
Abstract:
>>>
Diffusion models are a class of deep generative models that have shown impressive results on various tasks with a solid theoretical foundation. Despite their demonstrated success over state-of-the-art approaches, diffusion models often entail costly sampling procedures and sub-optimal likelihood estimation. Significant efforts have been made to improve the performance of diffusion models in various aspects. In this article, we present a comprehensive review of existing variants of diffusion models. Specifically, we provide the taxonomy of diffusion models and categorize them into three types: sampling-acceleration enhancement, likelihood-maximization enhancement, and data-generalization enhancement. We also introduce the other generative models (i.e., variational autoencoders, generative adversarial networks, normalizing flow, autoregressive models, and energy-based models) and discuss the connections between diffusion models and these generative models. Then we review the applications of diffusion models, including computer vision, natural language processing, waveform signal processing, multi-modal modeling, molecular graph generation, time series modeling, and adversarial purification. Furthermore, we propose new perspectives pertaining to the development of generative models. Github: this https URL.
<<<
31.
林海onrush
(2022-08-07 22:47):
#paper arXiv:2207.03530v1 [cs.RO] 7 Jul 2022, VMAS: A Vectorized Multi-Agent Simulator for Collective Robot Learning, https://deepai.org/publication/vmas-a-vectorized-multi-agent-simulator-for-collective-robot-learning
The University of Cambridge proposes VMAS, a vectorized framework for multi-agent reinforcement learning.
While many multi-robot coordination problems can be solved optimally by exact algorithms, the solutions often do not scale with the number of robots. Multi-agent reinforcement learning (MARL) is gaining increasing attention in the robotics community as a promising approach to such problems, yet tools that can quickly and efficiently find solutions to large-scale collective learning tasks are still lacking. This work introduces VMAS, an open-source framework designed for efficient MARL benchmarking. It comprises a vectorized 2D physics engine written in PyTorch and a suite of twelve challenging multi-robot scenarios; additional scenarios can be implemented through a simple modular interface.
The paper shows how vectorization enables parallel simulation on accelerated hardware without added complexity, and a comparison with the current standard framework, OpenAI MPE, shows VMAS to be more than 100x faster. The authors also use VMAS to run a range of benchmarks that expose challenges for existing algorithms.
VMAS can execute 30,000 parallel simulations in under 10 seconds, a speedup of more than 100x. Using VMAS's RLlib interface, the authors benchmark the multi-robot scenarios with several MARL algorithms based on Proximal Policy Optimization (PPO); the scenarios prove challenging in orthogonal ways for state-of-the-art MARL algorithms. The VMAS framework is available for reproduction at https://github.com/proroklab/VectorizedMultiAgentSimulator. A sketch of the vectorization idea follows.
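A minimal PyTorch sketch of the vectorization idea (not the VMAS engine itself): give every state tensor a leading environment dimension, and one integration step advances thousands of independent 2D worlds in a single set of tensor operations.
>>>
import torch

num_envs, num_agents, dt = 30_000, 4, 0.1
device = "cuda" if torch.cuda.is_available() else "cpu"

# State of all environments at once: (envs, agents, 2) for x/y.
pos = torch.zeros(num_envs, num_agents, 2, device=device)
vel = torch.zeros(num_envs, num_agents, 2, device=device)

def step(actions, drag=0.05):
    # actions: (envs, agents, 2) force commands from the policies.
    # Semi-implicit Euler: update velocity, then position, batched.
    global pos, vel
    vel = (1 - drag) * vel + actions * dt
    pos = pos + vel * dt
    return pos

# Random policies acting in all 30,000 worlds in one call.
obs = step(torch.randn(num_envs, num_agents, 2, device=device))
<<<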
arXiv,
2022.
DOI: 10.48550/arXiv.2207.03530
Abstract:
>>>
While many multi-robot coordination problems can be solved optimally by exact algorithms, solutions are often not scalable in the number of robots. Multi-Agent Reinforcement Learning (MARL) is gaining increasing attention in the robotics community as a promising solution to tackle such problems. Nevertheless, we still lack the tools that allow us to quickly and efficiently find solutions to large-scale collective learning tasks. In this work, we introduce the Vectorized Multi-Agent Simulator (VMAS). VMAS is an open-source framework designed for efficient MARL benchmarking. It is comprised of a vectorized 2D physics engine written in PyTorch and a set of twelve challenging multi-robot scenarios. Additional scenarios can be implemented through a simple and modular interface. We demonstrate how vectorization enables parallel simulation on accelerated hardware without added complexity. When comparing VMAS to OpenAI MPE, we show how MPE's execution time increases linearly in the number of simulations while VMAS is able to execute 30,000 parallel simulations in under 10s, proving more than 100x faster. Using VMAS's RLlib interface, we benchmark our multi-robot scenarios using various Proximal Policy Optimization (PPO)-based MARL algorithms. VMAS's scenarios prove challenging in orthogonal ways for state-of-the-art MARL algorithms. The VMAS framework is available at this https URL. A video of VMAS scenarios and experiments is available at this https URL.
<<<