Papers shared by user 林海onrush.
32 paper shares found in total; this page shows items 1 - 20.
1.
林海onrush (2025-01-01 00:27):
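#paper, doi: https://doi.org/10.48550/arXiv.2305.19229 , FedDisco: Federated Learning with Discrepancy-Aware Collaboration. A federated learning paper from ICML, a top AI conference. It proposes a new federated learning (FL) method, FedDisco, to address data heterogeneity, specifically differences in category distributions across clients. Conventional federated learning usually assigns model aggregation weights according to client dataset size, but this fails to reflect how the clients' category distributions differ, which leaves the global model under-optimized. FedDisco introduces a "discrepancy-aware" way of computing aggregation weights: it combines each client's dataset size with the degree of discrepancy between its local category distribution and the global one, and adjusts the aggregation weights accordingly to improve the global model. The method preserves privacy, is efficient in communication and computation, and comes with a theoretical analysis showing that it tightens the upper bound on the optimization error, which improves global model performance. Experiments show that FedDisco clearly outperforms existing federated learning methods across several heterogeneity settings and datasets, and that its modular design lets it be plugged into existing methods for further gains. It also works well when only some clients participate and on text classification tasks. FedDisco's key advantage is its aggregation weight assignment strategy, which improves the robustness and generalization of federated learning algorithms at low computation and communication overhead.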
arXiv, 2023-05-30T17:20:51Z. DOI: 10.48550/arXiv.2305.19229
Abstract:
This work considers the category distribution heterogeneity in federated learning. This issue is due to biased labeling preferences at multiple clients and is a typical setting of data heterogeneity. To alleviate this issue, most previous works consider either regularizing local models or fine-tuning the global model, while they ignore the adjustment of aggregation weights and simply assign weights based on the dataset size. However, based on our empirical observations and theoretical analysis, we find that the dataset size is not optimal and the discrepancy between local and global category distributions could be a beneficial and complementary indicator for determining aggregation weights. We thus propose a novel aggregation method, Federated Learning with Discrepancy-aware Collaboration (FedDisco), whose aggregation weights not only involve both the dataset size and the discrepancy value, but also contribute to a tighter theoretical upper bound of the optimization error. FedDisco also promotes privacy-preservation, communication and computation efficiency, as well as modularity. Extensive experiments show that our FedDisco outperforms several state-of-the-art methods and can be easily incorporated with many existing methods to further enhance the performance. Our code will be available at https://github.com/MediaBrain-SJTU/FedDisco.
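To make the idea of discrepancy-aware aggregation weights concrete, here is a minimal Python sketch. Neither the post nor the abstract states the exact weighting rule, so the form used below (a rectified linear combination of relative dataset size and an L2 discrepancy) and the hyperparameters `a` and `b` are assumptions for illustration only, as are all function names.

```python
def l2_discrepancy(local_dist, global_dist):
    """L2 distance between a client's category distribution and the global one."""
    return sum((p - q) ** 2 for p, q in zip(local_dist, global_dist)) ** 0.5


def disco_weights(sizes, local_dists, global_dist, a=0.5, b=0.1):
    """Aggregation weights that grow with dataset size and shrink with discrepancy.

    Assumed rule (illustrative): w_k is proportional to
    max(0, n_k / N - a * d_k + b), where d_k is the client's discrepancy.
    """
    total = sum(sizes)
    raw = []
    for n_k, dist in zip(sizes, local_dists):
        relative_size = n_k / total
        d_k = l2_discrepancy(dist, global_dist)
        raw.append(max(0.0, relative_size - a * d_k + b))
    norm = sum(raw)
    if norm == 0.0:
        # Every client was clipped to zero: fall back to size-based weights.
        return [n_k / total for n_k in sizes]
    return [r / norm for r in raw]


if __name__ == "__main__":
    weights = disco_weights(
        sizes=[100, 100],
        local_dists=[[0.5, 0.5], [1.0, 0.0]],
        global_dist=[0.5, 0.5],
    )
    print(weights)  # the balanced client receives the larger weight
```

Setting `a = 0` and `b = 0` recovers plain size-proportional weighting, which is the baseline the post contrasts FedDisco with.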
2.
林海onrush (2024-11-30 22:18):
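#paper, "Three-manifolds with positive Ricci curvature", doi: 10.4310/jdg/1214436922. Published by Richard S. Hamilton in 1982, this paper studies how the geometry of three-manifolds evolves under a positive Ricci curvature condition. Its central tool is the Ricci flow, an evolution equation similar to the heat equation. The paper proves that if a compact three-manifold has strictly positive Ricci curvature, this property is preserved throughout the Ricci flow and the metric eventually converges to one of constant positive curvature. The result settles a conjecture posed by Bourguignon on the classification of manifolds with positive Ricci curvature and clarifies the geometric structure of positively curved three-manifolds. To get there, the author uses the Nash-Moser inverse function theorem to handle solutions of an evolution equation that is not strictly parabolic, together with the maximum principle and interpolation inequalities to secure long-time existence and convergence. The paper's innovation lies in how it simplifies the geometric analysis in dimension three: the full curvature tensor can be recovered directly from the Ricci curvature, which greatly reduces the computational complexity. It demonstrates the stability and long-time behavior of the Ricci flow on three-manifolds, providing an important tool for the geometry of manifolds and a new perspective on classical problems in topology such as the Poincaré conjecture. Although the paper focuses on three-manifolds, its methods and tools may also apply to higher-dimensional manifolds, which gives it broad academic significance. Note (quoting Wikipedia): the Ricci-Hamilton flow, usually called the Ricci flow, is an intrinsic geometric flow in differential geometry. The main idea is to let the manifold deform over time, that is, to let the metric tensor vary with time, and to observe how the Ricci curvature changes under the deformation in order to study global topological properties. At its core is the Ricci-Hamilton flow equation, a quasilinear parabolic system. The Ricci flow is named after the Italian mathematician Gregorio Ricci-Curbastro and was first introduced by the American mathematician Richard Hamilton in 1981. The tool was also used by the Russian mathematician Grigori Perelman to solve the Poincaré conjecture, one of the Millennium Prize Problems. Likewise, Simon Brendle and Richard Schoen used it to complete the proof of the differentiable sphere theorem.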
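For reference, the evolution equation described in this post can be written out as follows. This is the standard textbook form of the Ricci flow and of its volume-normalized variant, not a formula quoted from the post; the notation ($g_{ij}$ for the metric, $R_{ij}$ for the Ricci tensor, $R$ for the scalar curvature, $r$ for its average over the manifold $M$, and $n$ for the dimension, here $n = 3$) is assumed.

```latex
\begin{align}
\frac{\partial g_{ij}}{\partial t} &= -2 R_{ij}, \\
\frac{\partial g_{ij}}{\partial t} &= \frac{2}{n}\, r\, g_{ij} - 2 R_{ij},
\qquad r = \frac{\int_M R \, d\mu}{\int_M d\mu}.
\end{align}
```

The second, normalized equation keeps the total volume constant, which is the setting in which convergence to a constant-curvature metric is naturally stated.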
3.
林海onrush (2024-10-15 05:09):
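#paper, Anyons in an exactly solved model and beyond, https://doi.org/10.1016/j.aop.2005.10.005. A classic on anyons, arguably of Nobel caliber. Alexei Kitaev's paper examines the properties of anyons, particles with unusual statistics that can only occur in two dimensions. It studies a spin-1/2 system on a honeycomb lattice in which nearest-neighbor spins interact through XX, YY, or ZZ couplings. By reducing the system to free fermions in a static Z2 gauge field, the author solves the model exactly. Two main physical phases are described. Abelian anyons: in the gapped phase the system hosts Abelian anyons. Exchanging them only produces a phase shift, so their braiding rules are simple. These excitations are stable and show fractional statistics, a hallmark of topological order. Non-Abelian anyons: the other phase is gapless, but it acquires a gap when a magnetic field is applied. In this phase the excitations are non-Abelian anyons with more complex braiding rules, similar to those of conformal blocks in the Ising model. Non-Abelian anyons have potential for quantum computation because their quantum states can be manipulated by braiding. The key mathematical tools include the following. Majorana fermions: the paper solves the model by representing the spins with Majorana fermions, real fermionic operators that turn the spin system into a solvable quadratic fermion system. Chern number: the paper introduces a spectral Chern number ν to characterize the phases; the Abelian phase corresponds to ν = 0 and the non-Abelian phase to ν = ±1. The paper also discusses edge modes, thermal transport, and the algebraic theory of anyons. Kitaev describes the properties of these quasiparticles and their potential use in topological quantum computation in detail. Topologically ordered states are shown to serve as a robust platform for quantum memory and computation because they are well protected against local perturbations. In topological quantum computation, quantum information is encoded in the states of non-Abelian anyons and quantum gates are realized by braiding the particles. This "purely topological" scheme offers a robust approach to quantum computing.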
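As a compact companion to the description above, the model and its phase boundary can be sketched in formulas. These are written from the standard presentation of the honeycomb model rather than quoted from the post; the coupling names $J_x, J_y, J_z$, the site labels $j, k$, and the Majorana operators $b^\alpha_j, c_j$ are notation assumed here.

```latex
\begin{align}
H &= -J_x \sum_{x\text{-links}} \sigma^x_j \sigma^x_k
     -J_y \sum_{y\text{-links}} \sigma^y_j \sigma^y_k
     -J_z \sum_{z\text{-links}} \sigma^z_j \sigma^z_k, \\
\sigma^\alpha_j &\mapsto i\, b^\alpha_j c_j, \qquad \alpha \in \{x, y, z\}, \\
\text{gapless phase:}\quad & |J_x| \le |J_y| + |J_z|, \quad
|J_y| \le |J_x| + |J_z|, \quad |J_z| \le |J_x| + |J_y|.
\end{align}
```

When one coupling exceeds the sum of the other two in magnitude, the spectrum is gapped and the excitations are the Abelian anyons of the post; inside the triangle-inequality region the model is gapless until a magnetic field opens a gap.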
4.
林海onrush (2024-10-01 00:41):
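#paper, https://doi.org/10.1038/s41586-024-08032-5, Addendum: A graph placement methodology for fast chip design. The Google DeepMind team has extended its Alpha family of agents with AlphaChip, aimed at chip design. This deep reinforcement learning approach to chip design has shown that it can generate efficient chip layouts that surpass those of human experts. Through pre-training, AlphaChip becomes faster and better as it solves more chip placement problems. The method has been used in the design of several generations of Google's Tensor Processing Unit (TPU) chips, where it clearly outperformed human experts in reducing wirelength and improving performance. AlphaChip has had a broad and deep influence on AI-driven chip design. Nearly every entry in DeepMind's Alpha series lands in Nature, and the series has all but dominated the covers of the main journal, which shows how strong the team is.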
5.
林海onrush (2024-09-01 00:00):
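#paper, DOI: 10.1088/1367-2630/ab8ab1 , Randomized benchmarking for qudit Clifford gates. This paper studies how to extend randomized benchmarking (RB) from conventional qubits to higher-dimensional quantum systems (qudits), that is, quantum systems with more than two levels. By using unitary 2-designs, the authors develop a randomized benchmarking protocol for qudit Clifford gates and give a detailed pseudocode algorithm. The method allows the average fidelity of qudit gates to be estimated efficiently, providing a reliable way to characterize the performance of more complex quantum computing systems. The paper also discusses why multi-qubit RB results cannot simply be carried over to qudit systems, emphasizing what makes qudit Clifford gates distinct from their qubit counterparts. The work provides a theoretical basis for experimental tests in future qudit-based quantum computing, particularly photonic implementations, lays the groundwork for benchmarking high-dimensional quantum systems, and helps advance the scaling and fault tolerance of quantum computing.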
Abstract:
We introduce unitary-gate randomized benchmarking (URB) for qudit gates by extending single- and multi-qubit URB to single- and multi-qudit gates. Specifically, we develop a qudit URB procedure that exploits unitary 2-designs. Furthermore, we show that our URB procedure is not simply extracted from the multi-qubit case by equating qudit URB to URB of the symmetric multi-qubit subspace. Our qudit URB is elucidated by using pseudocode, which facilitates incorporating into benchmarking applications.
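The link between an RB decay curve and an average gate fidelity can be illustrated with a short Python sketch. It assumes the usual single-exponential survival model `A * p**m + B` and the depolarizing-channel relation `F = p + (1 - p) / d` for a `d`-level system; it is not the paper's pseudocode, and the function names are hypothetical.

```python
def survival_probability(m, p, a, b):
    """Assumed RB decay model: survival probability after m random gates."""
    return a * p ** m + b


def estimate_decay(y1, y2, y3, spacing):
    """Estimate p from survival values at lengths m, m + spacing, m + 2 * spacing.

    Differences of consecutive values cancel the offset B, and their ratio
    cancels the amplitude A, leaving p ** spacing.
    """
    ratio = (y2 - y3) / (y1 - y2)
    return ratio ** (1.0 / spacing)


def average_gate_fidelity(p, d):
    """Average fidelity of a depolarizing channel with parameter p in dimension d."""
    if d < 2:
        raise ValueError("dimension d must be at least 2")
    return p + (1.0 - p) / d


if __name__ == "__main__":
    true_p, a, b = 0.97, 0.6, 0.3
    ys = [survival_probability(m, true_p, a, b) for m in (1, 11, 21)]
    p_hat = estimate_decay(*ys, spacing=10)
    print(p_hat, average_gate_fidelity(p_hat, d=3))  # d = 3 is a qutrit
```

In practice the survival values are noisy averages over many random sequences, so a least-squares fit over many lengths would replace the three-point estimate used here.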
6.
林海onrush (2024-07-31 22:12):
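#paper, DOI: 10.1109/LCSYS.2022.3166446, Deep Reinforcement Learning for Market Making Under a Hawkes Process-Based Limit Order Book Model. This paper studies a market-making strategy based on deep reinforcement learning (DRL). It trains the controller on a multivariate Hawkes process simulator to solve the optimal market-making problem in a limit order book (LOB). Within a simplified LOB framework, the model accounts for how order arrival rates respond dynamically to the market maker's control policy, which keeps the model tractable. The paper shows that DRL-based market making under a Hawkes-process LOB model is feasible and reports strong experimental results: the DRL strategy outperforms conventional market-making benchmarks such as the Avellaneda-Stoikov model and other linear strategies in both return and risk management, with higher mean returns, a more favorable Sharpe ratio, and lower inventory risk. Future work could consider more complex market-making models, Hawkes processes with other kernel types, and adversarial reinforcement learning to improve generalization and robustness under uncertainty.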
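To give a feel for the kind of simulator the post refers to, the sketch below generates event times from a Hawkes process using Ogata's thinning method. It is a simplification under stated assumptions: the process is univariate with an exponential kernel, whereas the paper uses a multivariate model, and the parameter names `mu`, `alpha`, `beta` are chosen here for illustration.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times of a univariate Hawkes process on (0, horizon].

    Intensity: lambda(t) = mu + sum over past events t_i of
    alpha * exp(-beta * (t - t_i)). Requires alpha < beta (stationarity).
    """
    if not (mu > 0 and alpha >= 0 and beta > 0 and alpha < beta):
        raise ValueError("need mu > 0, 0 <= alpha < beta")
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at the current time
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for the thinning step.
        upper = mu + excitation
        wait = rng.expovariate(upper)
        t += wait
        if t > horizon:
            break
        excitation *= math.exp(-beta * wait)
        if rng.random() * upper <= mu + excitation:
            events.append(t)      # candidate accepted as a real event
            excitation += alpha   # each event raises the intensity by alpha
    return events


if __name__ == "__main__":
    arrivals = simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, horizon=200.0)
    print(len(arrivals), "order arrivals; long-run mean rate is mu / (1 - alpha / beta)")
```

In a market-making setting each event type (for example market buys, market sells, limit orders, cancellations) would get its own intensity with cross-excitation terms, which is what the multivariate version adds.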
7.
林海onrush (2024-07-01 00:00):
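#paper, doi/10.5555/3600270.3602181. The Policy-gradient Placement and Generative Routing Neural Networks for Chip Design. This paper proposes a new method that combines a policy-gradient reinforcement learning model with a conditional generative model for mixed-size macro placement and routing in VLSI chip design. Using a purely neural pipeline, the method handles placement and routing efficiently without relying on traditional heuristic solvers. Experiments show that it performs well at reducing placement overlap area and generating accurate routes, and that it is cost-effective. The work demonstrates the potential of AI-driven placement and routing in electronic design automation.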
Abstract: No abstract available.
8.
林海onrush (2024-06-01 00:00):
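#paper, QNLP in Practice: Running Compositional Models of Meaning on a Quantum Computer, https://doi.org/10.1613/jair.1.14329. A study on designing and implementing natural language processing (NLP) models on quantum hardware. Quantum natural language processing (QNLP) aims to develop NLP models built specifically to run on quantum hardware. The paper presents the first NLP experiments run on Noisy Intermediate-Scale Quantum (NISQ) computers with datasets of more than 100 sentences. The goal is not to demonstrate a quantum advantage on classical NLP tasks, but to explore the process of running NLP models on quantum hardware and to give the AI and NLP research communities a detailed account of it. The study finds that all models converge smoothly both in simulation and on real quantum hardware, and that the results are as expected. The experiments also show how a model's sensitivity to syntax relates to the task: in some tasks a simple check of individual words is enough to classify correctly, while in others syntactic structure matters more.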
Abstract:
Quantum Natural Language Processing (QNLP) deals with the design and implementation of NLP models intended to be run on quantum hardware. In this paper, we present results on the first NLP experiments conducted on Noisy Intermediate-Scale Quantum (NISQ) computers for datasets of size greater than 100 sentences. Exploiting the formal similarity of the compositional model of meaning by Coecke, Sadrzadeh, and Clark (2010) with quantum theory, we create representations for sentences that have a natural mapping to quantum circuits. We use these representations to implement and successfully train NLP models that solve simple sentence classification tasks on quantum hardware. We conduct quantum simulations that compare the syntax-sensitive model of Coecke et al. with two baselines that use less or no syntax; specifically, we implement the quantum analogues of a “bag-of-words” model, where syntax is not taken into account at all, and of a word-sequence model, where only word order is respected. We demonstrate that all models converge smoothly both in simulations and when run on quantum hardware, and that the results are the expected ones based on the nature of the tasks and the datasets used. Another important goal of this paper is to describe in a way accessible to AI and NLP researchers the main principles, process and challenges of experiments on quantum hardware. Our aim in doing this is to take the first small steps in this unexplored research territory and pave the way for practical Quantum Natural Language Processing.
9.
林海onrush (2024-04-02 00:39):
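#paper, Curriculum Learning and Imitation Learning for Model-free Control on Financial Time-series, doi: https://doi.org/10.48550/arXiv.2311.13326. This paper offers a new angle on model-free control for financial time series. Conventional reinforcement learning faces limited and noisy training data in this domain. The paper therefore explores bringing curriculum learning and imitation learning, two paradigms with a track record in robotics, to financial problems. Through extensive empirical experiments on two representative datasets, it finds that curriculum learning markedly improves the performance of reinforcement learning on complex financial time-series decisions and beats all baseline methods. Curriculum learning raises the difficulty of the training task step by step through data augmentation, following an "easy to hard" learning strategy. The experiments indicate that a moderate amount of data smoothing reduces noise effectively, so the reinforcement learning algorithm captures the real market signal better. By contrast, applying imitation learning directly does not work well. Further analysis suggests that this may be because imitation learning, while removing noise, also discards part of the key market signal. From a statistical point of view, imitation learning separates noise from signal, but excessive denoising ends up hurting policy learning. The paper's theoretical contribution is a statistical framework of signal-noise decomposition that explains why curriculum learning and imitation learning behave differently on financial time series. The framework also suggests new ways to improve the algorithms. The paper further discusses directions for future work, including the non-stationarity of the signal-noise decomposition, other forms of data smoothing, and extending curriculum learning to other kinds of high-noise time-series learning tasks.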
Abstract:
Curriculum learning and imitation learning have been leveraged extensively in the robotics domain. However, minimal research has been done on leveraging these ideas on control tasks over highly stochastic time-series data. Here, we theoretically and empirically explore these approaches in a representative control task over complex time-series data. We implement the fundamental ideas of curriculum learning via data augmentation, while imitation learning is implemented via policy distillation from an oracle. Our findings reveal that curriculum learning should be considered a novel direction in improving control-task performance over complex time-series. Our ample random-seed out-sample empirics and ablation studies are highly encouraging for curriculum learning for time-series control. These findings are especially encouraging as we tune all overlapping hyperparameters on the baseline -- giving an advantage to the baseline. On the other hand, we find that imitation learning should be used with caution.
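The "easy to hard" idea of a smoothing-based curriculum can be sketched in a few lines of Python. The source only says that the curriculum is built through data augmentation and that moderate smoothing helps; the trailing moving average, the window schedule `(8, 4, 2, 1)`, and the function names below are assumptions made for illustration, not the paper's procedure.

```python
def moving_average(series, window):
    """Trailing moving average; early points average over the values seen so far."""
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def curriculum_stages(series, windows=(8, 4, 2, 1)):
    """Return progressively less-smoothed copies of the series (easy to hard).

    The last stage uses window 1, which is the raw series itself.
    """
    return [moving_average(series, w) for w in windows]


def roughness(series):
    """Sum of absolute one-step changes, a crude proxy for noise level."""
    return sum(abs(b - a) for a, b in zip(series, series[1:]))


if __name__ == "__main__":
    # Toy price path: a slow trend plus a fast oscillation standing in for noise.
    prices = [(-1.0) ** i + 0.1 * i for i in range(40)]
    for stage in curriculum_stages(prices):
        print(round(roughness(stage), 3))
```

An agent would be trained on the stages in order, so that it first sees the trend with little noise and only later the full raw series.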
10.
林海onrush (2024-03-14 18:48):
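#paper, Deep attention fuzzy cognitive maps for interpretable multivariate time series prediction, doi: https://doi.org/10.1016/j.knosys.2023.110700. Although time series prediction is widely used to estimate the future state of complex systems across industries, methods that are accurate, interpretable, and generalizable remain limited when it comes to long-term non-stationary prediction. To address this, the paper proposes deep attention fuzzy cognitive maps (DAFCM), which consist of a spatiotemporal fuzzy cognitive map (STFCM), a long short-term memory (LSTM) neural network, a temporal fuzzy cognitive map (TFCM), and a residual structure. First, an improved attention mechanism is used to build the spatiotemporal fuzzy cognitive map, capturing the spatial correlation between node pairs and the temporal correlation of each node. Second, the node states updated by the STFCM are fed into the LSTM to capture the long-term trend of these sequences, and a TFCM with improved temporal attention is applied to the non-stationarity in the time series. Finally, the state values of the previous nodes are added to DAFCM and a residual structure is built through a linear transformation to prevent exploding and vanishing gradients in long-term backpropagation. By combining the interpretability of fuzzy cognitive maps (FCM) with the high predictive accuracy of deep learning, DAFCM can be used for tasks such as multivariate long-term non-stationary time series prediction in many domains, and its effectiveness is validated against 9 baselines on 6 public datasets.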
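For readers unfamiliar with fuzzy cognitive maps, the following Python sketch shows the classic FCM state update that models such as DAFCM build on. It covers only the basic map (weighted influences passed through a sigmoid) and none of the attention, LSTM, or residual components; the convention `weights[j][i]` for the influence of concept `j` on concept `i` and the function names are assumptions for illustration.

```python
import math


def sigmoid(x, steepness=1.0):
    """Squashing function that keeps concept activations in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def fcm_step(state, weights):
    """One FCM update: each concept aggregates weighted influences from all concepts."""
    n = len(state)
    return [
        sigmoid(sum(weights[j][i] * state[j] for j in range(n)))
        for i in range(n)
    ]


def fcm_run(state, weights, steps):
    """Iterate the map and return the whole trajectory, initial state included."""
    trajectory = [list(state)]
    for _ in range(steps):
        state = fcm_step(state, weights)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Three concepts: 0 excites 1, 1 excites 2, 2 inhibits 0.
    w = [
        [0.0, 0.8, 0.0],
        [0.0, 0.0, 0.6],
        [-0.7, 0.0, 0.0],
    ]
    for row in fcm_run([0.9, 0.1, 0.1], w, steps=5):
        print([round(v, 3) for v in row])
```

The interpretability the post mentions comes from the weight matrix: each entry is a signed, readable statement of how strongly one variable drives another.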
11.
林海onrush (2024-02-29 23:59):
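#paper, DOI: https://doi.org/10.21203/rs.3.rs-1819548/v1 , Chaotic Bi-LSTM and Attention HLCO Predictor Based Quantum Price Level Fuzzy Logic Trading System. This paper proposes a predictor of high, low, and closing prices (HLCO) based on a chaotic bidirectional long short-term memory network (Bi-LSTM) with an attention mechanism, together with a fuzzy logic trading system based on quantum price levels (QPL). By combining chaos theory, quantum finance theory, and advanced AI techniques, the system aims to solve the fixed trigger boundaries and lag of conventional financial indicators and to improve the accuracy and efficiency of trading decisions. Experiments show that the model performs well in backtests on historical data, which supports its potential to improve investment decisions. Personal comment: the paper applies chaos theory and quantum finance theory to market prediction and trading decisions in a clever way and shows an innovative use of AI in finance. By analyzing the complex dynamics of market data in depth, the work not only improves prediction accuracy but also offers a new perspective and method for designing trading strategies, with both theoretical and practical significance.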
Abstract:
There are various indicators i.e. Relative Strength Index (RSI), Moving Average Convergence Divergence (MACD), Stochastic Oscillator which have advantages in applications to determine not only market movements with buying and selling decisions in Computational Finance, but have significant drawbacks that discrepancies are easy to match against the best trading times due to fixed order-triggering boundaries and delay problems. For example, RSI’s 70 and 30 overbuy and oversell are fixed boundaries. Orders can only be triggered when RSI’s value exceeds one of the boundaries. Its computation only considers past market situation prompting indicators like RSI to trigger orders with delay. In this paper, we proposed a method to reduce these problems with advanced AI technologies to generate indicators’ buy and sell signals executed in the best trading time. Recurrent Neural Network (RNN) has outstanding performance to learn time-series data automatic with long-time sequences but ordinary RNN units such as Long-Short-Term-Memory (LSTM) are unable to decipher the relationships between time units, so-called context. Hence, researchers have proposed an algorithm based on RNNs’ Attention Mechanism allowing RNNs to learn information such as chaotic attributes and Quantum properties contained in time sequences. Chaos Theory and Quantum Finance Theory (QFT) are also proposed to simulate these two features. One of the well-performed QFT models is Quantum Price Level (QPL) to simulate all possible vibration levels to locate price. The system used in this paper consists of two components - neural network and fuzzy logic. Neural networks are used to predict future data and to solve indicators lagging problem whereas fuzzy logic is used to solve fixed order-triggering boundaries problem. By combining these two core components, the proposed model has obtained remarkable results in backtesting previous data that it is possible for these methods to make better investment decisions when market changes constantly.
12.
林海onrush (2024-01-31 23:47):
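#paper, doi.org/10.1038/s41586-023-06747-5, Solving olympiad geometry without human demonstrations. This paper presents a new approach to hard geometry problems from mathematical olympiads. The proposed system, AlphaGeometry, is a neuro-symbolic system that combines a neural language model with a symbolic deduction engine. It generates synthetic data consisting of theorems and proofs, which overcomes the scarcity of training data in this field. AlphaGeometry performs well on difficult olympiad-level problems, approaching the level of an International Mathematical Olympiad (IMO) gold medallist. It not only produces proofs in a human-readable format but also discovered a more general version of a known IMO theorem. AlphaGeometry marks important progress in automated theorem proving, shows what neuro-symbolic systems can do on hard mathematical problems, and points to a new direction for AI research that does not depend on human-generated data. The development is likely to have a far-reaching effect on both mathematics and AI.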
IF:50.500Q1 Nature, 2024-Jan. DOI: 10.1038/s41586-023-06747-5 PMID: 38233616
Abstract:
Proving mathematical theorems at the olympiad level represents a notable milestone in human-level automated reasoning, owing to their reputed difficulty among the world's best talents in pre-university mathematics. Current machine-learning approaches, however, are not applicable to most mathematical domains owing to the high cost of translating human proofs into machine-verifiable format. The problem is even worse for geometry because of its unique translation challenges, resulting in severe scarcity of training data. We propose AlphaGeometry, a theorem prover for Euclidean plane geometry that sidesteps the need for human demonstrations by synthesizing millions of theorems and proofs across different levels of complexity. AlphaGeometry is a neuro-symbolic system that uses a neural language model, trained from scratch on our large-scale synthetic data, to guide a symbolic deduction engine through infinite branching points in challenging problems. On a test set of 30 latest olympiad-level problems, AlphaGeometry solves 25, outperforming the previous best method that only solves ten problems and approaching the performance of an average International Mathematical Olympiad (IMO) gold medallist. Notably, AlphaGeometry produces human-readable proofs, solves all geometry problems in the IMO 2000 and 2015 under human expert evaluation and discovers a generalized version of a translated IMO theorem in 2004.
13.
林海onrush (2023-12-30 00:06):
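#paper, Using sequences of life-events to predict human lives. Nat Comput Sci (2023). https://doi.org/10.1038/s43588-023-00573-5. Can large language models now tell fortunes accurately? Yes! This paper in Nature Computational Science proposes a model that predicts the course of a life. It represents human lives in a way that is structurally similar to language, building a life sequence out of a series of life events. The paper introduces a deep learning model named life2vec (L2V) to predict various outcomes of a life trajectory, such as the risk of early mortality and personality traits. The model is based on the Transformer architecture and learns dense vector representations of life-event sequences. The study builds the sequences from detailed labor and health data covering roughly 6 million residents of Denmark over nearly 10 years. The L2V model reaches an accuracy of 78.8% (0.788 [0.782, 0.794]). The model has three components: an embedding layer, an encoder, and a task-specific decoder. It is first pre-trained with a masked language modeling task and a sequence ordering prediction task to learn event representations and sequence structure. It is then fine-tuned on downstream tasks such as early mortality prediction and personality trait prediction to learn a vector representation of the whole life trajectory. The results show that the model predicts outcomes in quite different domains accurately and clearly beats the current state of the art on early mortality prediction. The study also analyzes the event representation space and the person representation space learned by the model and finds that both are clearly structured and reflect semantic relations between events. It shows that Transformer models and large-scale datasets can be used to predict and understand individual life trajectories, opening new possibilities for research in the social sciences and health care. Note that the model is currently for research only; real-world use raises many ethical issues that need careful handling. Which leaves the question: is there anything large models cannot do?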
Abstract:
Here we represent human lives in a way that shares structural similarity to language, and we exploit this similarity to adapt natural language processing techniques to examine the evolution and predictability of human lives based on detailed event sequences. We do this by drawing on a comprehensive registry dataset, which is available for Denmark across several years, and that includes information about life-events related to health, education, occupation, income, address and working hours, recorded with day-to-day resolution. We create embeddings of life-events in a single vector space, showing that this embedding space is robust and highly structured. Our models allow us to predict diverse outcomes ranging from early mortality to personality nuances, outperforming state-of-the-art models by a wide margin. Using methods for interpreting deep learning models, we probe the algorithm to understand the factors that enable our predictions. Our framework allows researchers to discover potential mechanisms that impact life outcomes as well as the associated possibilities for personalized interventions.
14.
林海onrush (2023-12-01 00:00):
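#paper, https://doi.org/10.1038/s41562-019-0804-2, Quantum reinforcement learning during human decision-making. An interesting paper from a Nature sub-journal on applying quantum reinforcement learning (QRL) to human decision-making. A new application of QRL: QRL performs well in computer simulations, and this is the first empirical study to test it on human decision-making. The study uses behavioural data and functional magnetic resonance imaging (fMRI) data from participants performing the Iowa Gambling Task and compares 2 QRL models with 12 established classical reinforcement learning (CRL) models. The researchers develop two new QRL models: quantum superposition state learning (QSL) and quantum superposition state plus perseverance (QSPP). In some respects these models outperform the best CRL models. The finding holds across subject groups, including healthy individuals and smokers, which indicates that the models are robust and broadly applicable. Neural representation of quantum-like processes: a major contribution of the study is identifying the neural substrates that represent quantum-like processes. For example, the QSPP model shows how the brain represents quantum distance and transition amplitude, which are key QRL concepts. This bridges the gap between quantum models of cognition and neuroscience and gives quantum-like processes in decision-making a neurobiological basis. Understanding uncertainty in decisions: the paper also examines the role of uncertainty. Comparing the QSPP model with a CRL model (VPPDecayTIC) highlights how the brain handles uncertainty about internal states differently when it is driven by an unstable external environment. This part of the work underlines the potential of QRL models to give finer-grained insight into the cognitive processes behind decisions.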
IF:21.400Q1 Nature human behaviour, 2020-03. DOI: 10.1038/s41562-019-0804-2 PMID: 31959921
Abstract:
Classical reinforcement learning (CRL) has been widely applied in neuroscience and psychology; however, quantum reinforcement learning (QRL), which shows superior performance in computer simulations, has never been empirically tested on human decision-making. Moreover, all current successful quantum models for human cognition lack connections to neuroscience. Here we studied whether QRL can properly explain value-based decision-making. We compared 2 QRL and 12 CRL models by using behavioural and functional magnetic resonance imaging data from healthy and cigarette-smoking subjects performing the Iowa Gambling Task. In all groups, the QRL models performed well when compared with the best CRL models and further revealed the representation of quantum-like internal-state-related variables in the medial frontal gyrus in both healthy subjects and smokers, suggesting that value-based decision-making can be illustrated by QRL at both the behavioural and neural levels.
15.
林海onrush (2023-10-31 23:55):
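#paper, Quantum Brain Dynamics in 2 + 1 dimensions: Non-equilibrium analysis towards memory formations, https://doi.org/10.1016/j.physa.2022.127397. Quantum Brain Dynamics (QBD) is a hypothesis in neuroscience that aims to explain brain function within the framework of quantum field theory (QFT). Using numerical simulation, the authors describe the non-equilibrium processes that lead to symmetry breaking in 2 + 1 dimensional QBD. They use time-evolution equations for the coherent electric field, the dipole moment density, and the time derivative of the dipole moment density, together with the Kadanoff–Baym equations for incoherent dipoles and photons. They show that the Bose-Einstein distribution holds for the time evolution of incoherent dipoles and photons. Triggered by a non-zero initial electric field, the dipoles of the system align in the same direction. These results can be taken as representing memory formation in QBD. QBD has in fact been studied since 1967, when Hiroomi Umezawa and others proposed a quantum brain dynamics theory of long-range correlation within and between brain cells and showed a possible mechanism for storing and retrieving memory through Nambu-Goldstone bosons. Water molecules, which make up 70% of the brain, have two electric poles; their electric dipole moments form a quantum field called the "cortical field", whose quanta are called "corticons". The cortical field interacts with quantum coherent waves produced by biomolecules in nerve cells, and these waves can propagate through neurons and neural networks. While propagating, the waves can draw energy from ATP (adenosine triphosphate), control the ion channels of neurons, and so control the flow of signals to the synapses. Researchers in quantum brain dynamics do not hold that consciousness arises through one definite route. On the one hand, Mari Jibu and Kunio Yasue argue that consciousness arises in neural networks from the interaction between the cortical field and the energy quanta of the biomolecular waves. On the other hand, Giuseppe Vitiello argues that the quantum states in QBD produce two poles: one is the subjective representation of the external world, and the other is the self that is open to that representation. QBD is a novel quantum framework for studying higher brain functions such as consciousness and memory, and it describes why and how classical behaviour emerges at the level of neural activity in the brain.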
16.
林海onrush (2023-09-30 23:01):
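#paper, A quantum walk control plane for distributed quantum computing in quantum networks, doi: 10.1109/QCE52317.2021.00048. This paper introduces a quantum walk protocol for performing distributed quantum computing in a quantum network. The protocol uses a quantum walk as a quantum control signal to carry out distributed quantum operations. The authors consider a generalization of the discrete-time coined quantum walk model that accounts for the interaction between the walker system on the network graph and the quantum registers inside the network nodes. The protocol captures distributed quantum computing at the logical level, abstracting away the hardware implementation and the transmission of quantum information through channels. Control signal transmission is mapped to the propagation of the walker system across the network, while the interaction between the control layer and the quantum registers is embedded in the application of the coin operator. The paper also shows how to use the walker system to perform a distributed CNOT operation, which demonstrates the universality of the protocol for distributed quantum computing. In addition, it applies the protocol to entanglement distribution in a quantum network. Comments: the paper explores a new way to perform distributed quantum computing in a quantum network, using a quantum walk to transmit and apply quantum control signals. This abstraction may come to play an important role in quantum computing and offers a new line of thinking for the development of quantum networks. Research and applications in quantum computing have long been a goal for scientists, and this work helps push quantum computing technology forward and opens new possibilities for future quantum communication and computation.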
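As background for the model the paper generalizes, here is a minimal Python sketch of a standard discrete-time coined quantum walk on a cycle, using a Hadamard coin. It does not include the paper's interaction between the walker and node registers; the cycle topology, the symmetric initial coin state, and the function names are assumptions chosen to keep the example small.

```python
import math


def hadamard_walk_step(amps):
    """One step: apply a Hadamard coin at every node, then shift by coin value.

    amps[x] = [amplitude with coin 0, amplitude with coin 1] at node x.
    Coin 0 moves the walker to x - 1, coin 1 moves it to x + 1 (mod n).
    """
    n = len(amps)
    s = 1.0 / math.sqrt(2.0)
    new = [[0j, 0j] for _ in range(n)]
    for x, (a0, a1) in enumerate(amps):
        c0 = s * (a0 + a1)  # coin operator (Hadamard)
        c1 = s * (a0 - a1)
        new[(x - 1) % n][0] += c0  # conditional shift
        new[(x + 1) % n][1] += c1
    return new


def position_distribution(amps):
    """Probability of finding the walker at each node."""
    return [abs(a0) ** 2 + abs(a1) ** 2 for a0, a1 in amps]


def run_walk(n_nodes, start, steps):
    """Walk from a single node with the symmetric coin state (|0> + i|1>) / sqrt(2)."""
    amps = [[0j, 0j] for _ in range(n_nodes)]
    amps[start][0] = 1.0 / math.sqrt(2.0)
    amps[start][1] = 1j / math.sqrt(2.0)
    for _ in range(steps):
        amps = hadamard_walk_step(amps)
    return position_distribution(amps)


if __name__ == "__main__":
    probs = run_walk(n_nodes=16, start=0, steps=6)
    print([round(p, 3) for p in probs])
```

In the protocol described above, the coin operator is where the control signal would interact with a node's register, so a node-dependent coin would replace the uniform Hadamard used here.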
17.
林海onrush (2023-09-01 00:00):
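#paper, Supervised machine learning classification of psychosis biotypes based on brain structure: findings from the Bipolar-Schizophrenia network for intermediate phenotypes (B-SNIP), https://doi.org/10.1038/s41598-023-38101-0. The paper discusses the weak link between conventional diagnoses of psychotic disorders and the underlying neurobiology, and proposes using brain-based biomarkers to capture psychosis constructs. It uses grey matter density (GMD) from MRI images as the biomarker and classifies psychosis cases against healthy controls with logistic regression models. The study evaluates the classification accuracy of six models across biotype and diagnosis schemes; only the B1 biotype model shows evidence of specificity, separating B1 cases from healthy controls effectively. The evidence from the GMD-based B1 classifier is negatively associated with premorbid intellectual ability. The results suggest that the B-SNIP psychosis biotypes may be a promising way to capture the neurobiology of psychosis and could support clinical diagnosis. Lately I have been thinking about how to combine brain science and neuroscience with quantum computing research, so I plan to read more of the brain science literature.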
IF:3.800Q1 Scientific reports, 2023-08-10. DOI: 10.1038/s41598-023-38101-0 PMID: 37563219
Abstract:
Traditional diagnostic formulations of psychotic disorders have low correspondence with underlying disease neurobiology. This has led to a growing interest in using brain-based biomarkers to capture biologically-informed psychosis constructs. Building upon our prior work on the B-SNIP Psychosis Biotypes, we aimed to examine whether structural MRI (an independent biomarker not used in the Biotype development) can effectively classify the Biotypes. Whole brain voxel-wise grey matter density (GMD) maps from T1-weighted images were used to train and test (using repeated randomized train/test splits) binary L2-penalized logistic regression models to discriminate psychosis cases (n = 557) from healthy controls (CON, n = 251). A total of six models were evaluated across two psychosis categorization schemes: (i) three Biotypes (B1, B2, B3) and (ii) three DSM diagnoses (schizophrenia (SZ), schizoaffective (SAD) and bipolar (BD) disorders). Above-chance classification accuracies were observed in all Biotype (B1 = 0.70, B2 = 0.65, and B3 = 0.56) and diagnosis (SZ = 0.64, SAD = 0.64, and BD = 0.59) models. However, the only model that showed evidence of specificity was B1, i.e., the model was able to discriminate B1 vs. CON and did not misclassify other psychosis cases (B2 or B3) as B1 at rates above nominal chance. The GMD-based classifier evidence for B1 showed a negative association with an estimate of premorbid general intellectual ability, regardless of group membership, i.e. psychosis or CON. Our findings indicate that, complimentary to clinical diagnoses, the B-SNIP Psychosis Biotypes may offer a promising approach to capture specific aspects of psychosis neurobiology.
18.
林海onrush (2023-08-01 00:03):
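#paper, doi.org/10.1016/j.aim.2023.109194, Equivariant algebraic K-theory, G-theory and derived completions. The paper studies completion problems for algebraic K-theory under group actions. Its main content is as follows. It studies the equivariant algebraic K-theory and G-theory of linear algebraic groups acting on schemes, with the goal of proving a result analogous to the Atiyah-Segal completion theorem in topological K-theory. The technique of derived completion is essential here. Thomason anticipated in the 1980s that a homotopy-theoretic form of completion would be needed, and this paper uses the derived completion method introduced by the first author in 2008. While working on completion theorems for equivariant algebraic K-theory corresponding to Atiyah-Segal, Robert Thomason found the strong finiteness conditions they required too restrictive. He conjectured that for equivariant algebraic G-theory, for actions of linear algebraic groups on schemes, a completion theorem in the sense of Atiyah and Segal exists without the strong finiteness conditions needed in the theorems he had proved, conditions that also appear in the original Atiyah-Segal theorem. The main goal of the paper is to prove this conjecture in as broad a context as possible using derived completion, and to consider several applications. The solution is broad enough to allow actions by all linear algebraic groups, connected or not, on any quasi-projective scheme of finite type over a field, regular or projective or not. It therefore covers the equivariant algebraic G-theory of large classes of varieties, such as all toric varieties (for the action of a torus) and all spherical varieties (for the action of a reductive group). Restricting to actions by split tori, the authors can also consider actions on algebraic spaces. Moreover, the restriction that the base scheme be a field is often unnecessary and is imposed mainly to simplify the exposition. This yields a wide range of applications, some of which are sketched briefly and are planned for detailed study in the future. The extension of the results to equivariant homotopy K-theory, along with various Riemann-Roch theorems, is discussed in a sequel. Comparison with previously known results, none of which used derived completions, shows that without derived completions only very restrictive results can be obtained.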
Abstract:
In the mid 1980s, while working on establishing completion theorems for equivariant Algebraic K-Theory similar to the well-known Atiyah-Segal completion theorem for equivariant topological K-theory, the late Robert Thomason found the strong finiteness conditions that are required in such theorems to be too restrictive. Then he made a conjecture on the existence of a completion theorem in the sense of Atiyah and Segal for equivariant algebraic G-theory, for actions of linear algebraic groups on schemes that holds without any of the strong finiteness conditions that are required in such theorems proven by him, and also appearing in the original Atiyah-Segal theorem. The main goal of the present paper is to provide a proof of this conjecture in as broad a context as possible, making use of the technique of derived completion, and to consider several of the applications. Our solution is broad enough to allow actions by all linear algebraic groups, irrespective of whether they are connected or not, and acting on any quasi-projective scheme of finite type over a field, irrespective of whether they are regular or projective. This allows us therefore to consider the equivariant algebraic G-Theory of large classes of varieties like all toric varieties (for the action of a torus) and all spherical varieties (for the action of a reductive group). Restricting to actions by split tori, we are also able to consider actions on algebraic spaces. Moreover, the restriction that the base scheme be a field is also not required often, but is put in mainly to simplify some of our exposition. These enable us to obtain a wide range of applications, some of which are briefly sketched and which we plan to explore in detail in the future. In fact, we discuss an extension of our results to equivariant homotopy K-theory along with various Riemann-Roch theorems in a sequel. 
A comparison of our results with previously known results, none of which made use of derived completions, shows that without the use of derived completions one can only obtain results which are indeed very restrictive.
19.
林海onrush (2023-06-30 23:49):
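#paper, doi:10.1017/fms.2015.2, A THEORY OF COMPLEXITY, CONDITION, AND ROUNDOFF. Computational complexity theory is an important yardstick for judging algorithms, and studying the extent of the various complexity classes matters for both mathematics and engineering. The authors develop a theory of complexity for numerical computations that takes the condition of the input data into account and allows for roundoff in the computations. It follows the lines of the theory developed by Blum, Shub and Smale for computations over R (which in turn followed the classical, discrete complexity theory laid down by Cook, Karp, Levin, and others). The paper focuses on complexity classes of decision problems, in particular on appropriate versions of the classes P, NP, and EXP of polynomial, nondeterministic polynomial, and exponential time, respectively. The authors prove some basic relationships between these complexity classes and provide natural NP-complete problems.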
Abstract:
We develop a theory of complexity for numerical computations that takes into account the condition of the input data and allows for roundoff in the computations. We follow the lines of the theory developed by Blum, Shub and Smale for computations over $\mathbb{R}$ (which in turn followed those of the classical, discrete, complexity theory as laid down by Cook, Karp, and Levin, among others). In particular, we focus on complexity classes of decision problems and, paramount among them, on appropriate versions of the classes $\mathsf{P}$, $\mathsf{NP}$, and $\mathsf{EXP}$ of polynomial, nondeterministic polynomial, and exponential time, respectively. We prove some basic relationships between these complexity classes, and provide natural NP-complete problems.
20.
林海onrush (2023-05-31 22:41):
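#paper, Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning, DOI: 10.1126/science.add4679. How well does reinforcement learning do at military strategy simulation? The authors offer a workable approach: using model-free multiagent reinforcement learning to master the strategy game Stratego. The paper presents DeepNash, an agent that learns to play the imperfect-information game Stratego from scratch up to human expert level. Stratego is one of the few iconic board games that artificial intelligence (AI) has not yet mastered. This popular game has an enormous game tree on the order of 10^535 nodes, that is, 10^175 times larger than that of Go. It has the added complexity of requiring decisions under imperfect information. In Stratego, decisions are made over a large number of discrete actions with no obvious link between action and outcome. Episodes are long, often running to hundreds of moves before a player wins, and situations in Stratego cannot easily be broken down into manageably sized subproblems. For these reasons Stratego has been a grand challenge for AI for decades, and existing AI methods barely reach an amateur level of play. DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego through self-play. The Regularized Nash Dynamics (R-NaD) algorithm, a key component of DeepNash, converges to an approximate Nash equilibrium by directly modifying the underlying multiagent learning dynamics. DeepNash beat existing state-of-the-art AI methods in Stratego and achieved a year-to-date (2022) and all-time top-three ranking on the Gravon games platform, competing with human expert players. The work is interesting and leaves room for further exploration. Personally, I think the approach could extend well to MOBA games.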
Abstract:
We introduce DeepNash, an autonomous agent that plays the imperfect information game Stratego at a human expert level. Stratego is one of the few iconic board games that artificial intelligence (AI) has not yet mastered. It is a game characterized by a twin challenge: It requires long-term strategic thinking as in chess, but it also requires dealing with imperfect information as in poker. The technique underpinning DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego through self-play from scratch. DeepNash beat existing state-of-the-art AI methods in Stratego and achieved a year-to-date (2022) and all-time top-three ranking on the Gravon games platform, competing with human expert players.