Literature shared by user 张德祥.
43 shared items found in total; this page shows items 1-20.
1.
张德祥
(2023-05-16 08:14):
#paper https://doi.org/10.48550/arXiv.2203.11740
Imagine the brain as the Earth: the generation of molten rock at the core is like the formation of short-term memory in the hippocampus, a process that is quantum; earthquakes at the surface, releasing potential energy, select the strong short-term memories to become long-term memories, stored in memory engram cells in different cortices from which they can be retrieved.
A combination of AI, brain science, and quantum mechanics. We propose PNN, which is more than a simple time-series model.
In addition to the shared weights of synaptic connections, the proposed neural network includes synaptic effective-range weights in both the forward and backward computation, and it supports many simulations that an RNN cannot achieve.
The brain plasticity of positive and negative memory may be quantum and produces short-term memory in the hippocampus, with the wave function exhibiting exponential decay over a period of time. The exponential decay arises from barriers, and the barriers may correspond to astrocytes. The brain plasticity of working memory flows through the brain, from the hippocampus to different cortices, via directional derivatives. Strong working-memory plasticity turning into long-term memory corresponds to the maximal directional derivative, and the maximal directional derivative is the gradient; long-term memory is thus the gradient of working-memory brain plasticity. The process by which short-term memory becomes long-term memory is the process by which the non-classical becomes classical.
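The step from "maximal directional derivative" to "gradient" is standard multivariable calculus; stated as an equation (my addition for clarity, not from the paper):

$$ D_{\mathbf{u}} f(\mathbf{x}) \;=\; \nabla f(\mathbf{x}) \cdot \mathbf{u} \;=\; \|\nabla f(\mathbf{x})\| \cos\theta \;\le\; \|\nabla f(\mathbf{x})\|, \qquad \|\mathbf{u}\| = 1, $$

with equality exactly when \mathbf{u} points along \nabla f(\mathbf{x}), so the direction of strongest change is the gradient direction.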
The PNN simulations fit the experiments and hypotheses of 6 papers in CNS main journals, 6 in CNS-family journals, and 1 in a top physics journal.
See also: https://mp.weixin.qq.com/s/k-KD1KcQo9FiYcQvSypBjQ
arXiv,
2022.
DOI: 10.48550/arXiv.2203.11740
Abstract:
In addition to the shared weights of the synaptic connections, we proposed a new neural network that includes the synaptic effective range weights for both the forward and back propagation. And lots of simulations were used which RNN cannot be achieved. The simulations of PNN fit very well in experiments and hypotheses of 6 papers CNS Journals, 6 papers of CNS family Journals and 1 paper top Physics Journal [14-26]. The brain plasticity in positive or negative memory may be quantum and produce short-term memory, and exhibits an exponential decay in the wave function over a period of time, produced in the hippocampus. And exponential decay occurs due to barriers, and barriers can refer to astrocytes. Brain plasticity in working memory flows through the brain, from the hippocampus to the cortex, through directional derivatives. The strong working memory brain plasticity turns to long-term memory means maximum of directional derivatives, and maximum of directional derivatives is gradient. Thus, long-term memory signifies the gradient of brain plasticity in working memory. The process of short-term memory turns to long-term memory is the process of non-classically turns to classically. Astrocytic cortex memory persistence factor also inhibits local synaptic accumulation, and the model inspires experiments. This could be the process of astrocytes phagocytose synapses is driven by both positive and negative memories of plasticity in the brain. In simulation, it is possible that thicker cortices and more diverse individuals within the brain could have high IQ, but thickest cortices and most diverse individuals may have low IQ in simulation. PSO considers global solution or best previous solution, but also considers relatively good and relatively inferior solution. And PNN modified ResNet to consider memory gradient. The simple PNN only considers astrocytes phagocytosed synapses.
2.
张德祥
(2023-04-16 11:20):
#paper https://doi.org/10.48550/arXiv.2302.10051 An established normative approach for understanding the algorithmic basis of neural computation is to derive online algorithms from principled computational objectives and to evaluate their compatibility with anatomical and physiological observations.
Similarity matching objectives have served as successful starting points for deriving online algorithms that map onto neural networks (NNs) with point neurons and Hebbian/anti-Hebbian plasticity. These NN models account for many anatomical and physiological observations; however, the objectives have limited computational power, and the derived NNs do not explain the multi-compartmental neuronal structures and non-Hebbian forms of plasticity that are prevalent throughout the brain. The paper reviews and unifies recent extensions of the similarity matching approach to more complex objectives, including a broad range of unsupervised and self-supervised learning tasks that can be formulated as generalized eigenvalue problems or nonnegative matrix factorization problems. Interestingly, the online algorithms derived from these objectives naturally map onto NNs with multi-compartmental neurons and local, non-Hebbian learning rules.
This unified extension of the similarity matching approach therefore provides a normative framework for understanding the multi-compartmental neuronal structures and non-Hebbian plasticity found throughout the brain.
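For context, the canonical objective behind these derivations (from the similarity matching literature rather than this share): given T input samples stacked as columns of X, the output Y is found by matching output similarities to input similarities,

$$ \min_{\mathbf{Y}} \; \frac{1}{T^2} \left\| \mathbf{X}^{\top} \mathbf{X} - \mathbf{Y}^{\top} \mathbf{Y} \right\|_F^2 , $$

and deriving an online algorithm for this objective yields a single-layer network whose feedforward weights follow Hebbian updates and whose lateral weights follow anti-Hebbian updates; the reviewed extensions generalize the objective and inherit multi-compartment, non-Hebbian circuitry instead.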
arXiv,
2023.
DOI: 10.48550/arXiv.2302.10051
Abstract:
An established normative approach for understanding the algorithmic basis of neural computation is to derive online algorithms from principled computational objectives and evaluate their compatibility with anatomical and physiological observations. Similarity matching objectives have served as successful starting points for deriving online algorithms that map onto neural networks (NNs) with point neurons and Hebbian/anti-Hebbian plasticity. These NN models account for many anatomical and physiological observations; however, the objectives have limited computational power and the derived NNs do not explain multi-compartmental neuronal structures and non-Hebbian forms of plasticity that are prevalent throughout the brain. In this article, we review and unify recent extensions of the similarity matching approach to address more complex objectives, including a broad range of unsupervised and self-supervised learning tasks that can be formulated as generalized eigenvalue problems or nonnegative matrix factorization problems. Interestingly, the online algorithms derived from these objectives naturally map onto NNs with multi-compartmental neurons and local, non-Hebbian learning rules. Therefore, this unified extension of the similarity matching approach provides a normative framework that facilitates understanding the multi-compartmental neuronal structures and non-Hebbian plasticity found throughout the brain.
3.
张德祥
(2023-03-20 10:45):
#paper doi: https://doi.org/10.1101/2022.05.17.492325
Inferring Neural Activity Before Plasticity: A Foundation for Learning Beyond Backpropagation
Surpassing GPT will require improvements at a more fundamental technical level. Backpropagation (BP) is the core of deep learning, but biological algorithms can be more efficient than BP and offer one route beyond it. This paper gives a good account of why, and follow-up papers add experiments and algorithms whose efficiency already matches BP while retaining further advantages.
See also https://mp.weixin.qq.com/s/lPzGvY6oOnwzVgxDr9ePpA
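A minimal sketch of the prospective-configuration idea (my paraphrase in a small energy-based network, not the authors' code): first relax the activities toward the pattern they should have after learning, with the output clamped to the target, and only then update the weights to consolidate that inferred activity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 2-layer energy-based network x -> h -> y with linear units,
# energy E = 0.5*||h - W1 x||^2 + 0.5*||y - W2 h||^2.
W1 = rng.normal(scale=0.1, size=(4, 3))   # hidden <- input
W2 = rng.normal(scale=0.1, size=(2, 4))   # output <- hidden

def train_step(x, y_target, lr_act=0.1, lr_w=0.05, relax_steps=50):
    global W1, W2
    h = W1 @ x                 # feedforward initialisation
    y = y_target.copy()        # clamp the output to the target
    # 1) Inference first: relax hidden activity toward the pattern it
    #    "should" have after learning (gradient descent on E w.r.t. h).
    for _ in range(relax_steps):
        eps_h = h - W1 @ x     # prediction error at the hidden layer
        eps_y = y - W2 @ h     # prediction error at the output layer
        h -= lr_act * (eps_h - W2.T @ eps_y)
    # 2) Learning second: weights consolidate the inferred activities.
    W1 += lr_w * np.outer(h - W1 @ x, x)
    W2 += lr_w * np.outer(y - W2 @ h, h)

x, y_target = rng.normal(size=3), np.array([1.0, -1.0])
for _ in range(200):
    train_step(x, y_target)
print(W2 @ (W1 @ x))           # feedforward output approaches y_target
```

The contrast with backpropagation is the ordering: activities settle first and weights follow, instead of weights being updated directly from an error signal propagated through frozen activities.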
bioRxiv,
2022.
DOI: 10.1101/2022.05.17.492325
Abstract:
For both humans and machines, the essence of learning is to pinpoint which components in its information processing pipeline are responsible for an error in its output — a challenge that is known as credit assignment. How the brain solves credit assignment is a key question in neuroscience, and also of significant importance for artificial intelligence. It has long been assumed that credit assignment is best solved by backpropagation, which is also the foundation of modern machine learning. However, it has been questioned whether it is possible for the brain to implement backpropagation and learning in the brain may actually be more efficient and effective than backpropagation. Here, we set out a fundamentally different principle on credit assignment, called prospective configuration. In prospective configuration, the network first infers the pattern of neural activity that should result from learning, and then the synaptic weights are modified to consolidate the change in neural activity. We demonstrate that this distinct mechanism, in contrast to backpropagation, (1) underlies learning in a well-established family of models of cortical circuits, (2) enables learning that is more efficient and effective in many contexts faced by biological organisms, and (3) reproduces surprising patterns of neural activity and behaviour observed in diverse human and animal learning experiments. Our findings establish a new foundation for learning beyond backpropagation, for both understanding biological learning and building artificial intelligence.
4.
张德祥
(2023-03-12 09:48):
#paper https://doi.org/10.48550/arXiv.1806.08053 Semantic information, autonomous agency, and nonequilibrium statistical physics
The paper attempts to define semantic information via counterfactuals, grounded in the physical coupling between an individual and its environment and their thermodynamic exchange of information. Follow-up work has been limited. The framework is somewhat close to the free-energy framework.
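The core construction, roughly (my compression of the abstract, with notation of my own): compare the system's viability, its ability to keep itself in a low-entropy state out to a horizon \tau, under the actual dynamics versus under a counterfactual intervention that scrambles system-environment correlations:

$$ \text{semantic efficacy} \;=\; V_{\text{actual}}(\tau) \;-\; V_{\text{scrambled}}(\tau), \qquad V(\tau) \approx -H\big(p(x_\tau)\big). $$

Correlations whose scrambling leaves viability unchanged carry merely syntactic information; those whose scrambling degrades it are causally necessary for self-maintenance, i.e., semantic.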
arXiv,
2018.
DOI: 10.48550/arXiv.1806.08053
Abstract:
Shannon information theory provides various measures of so-called "syntactic information", which reflect the amount of statistical correlation between systems. In contrast, the concept of "semantic information" refers to those correlations which carry significance or "meaning" for a given system. Semantic information plays an important role in many fields, including biology, cognitive science, and philosophy, and there has been a long-standing interest in formulating a broadly applicable and formal theory of semantic information. In this paper we introduce such a theory. We define semantic information as the syntactic information that a physical system has about its environment which is causally necessary for the system to maintain its own existence. "Causal necessity" is defined in terms of counter-factual interventions which scramble correlations between the system and its environment, while "maintaining existence" is defined in terms of the system's ability to keep itself in a low entropy state. We also use recent results in nonequilibrium statistical physics to analyze semantic information from a thermodynamic point of view. Our framework is grounded in the intrinsic dynamics of a system coupled to an environment, and is applicable to any physical system, living or otherwise. It leads to formal definitions of several concepts that have been intuitively understood to be related to semantic information, including "value of information", "semantic content", and "agency".
5.
张德祥
(2023-02-10 20:03):
#paper https://doi.org/10.48550/arXiv.2210.15889 Towards Data- and Knowledge-Driven Artificial Intelligence: A Survey on Neuro-Symbolic Computing. Neural-symbolic computing (NeSy), which pursues the integration of the symbolic and statistical paradigms of cognition, has been an active research area of artificial intelligence (AI) for many years. Since NeSy promises to reconcile the reasoning and interpretability advantages of symbolic representations with the robust learning of neural networks, it may serve as a catalyst for the next generation of AI. This paper systematically surveys important and recent advances in NeSy AI research. It first introduces the history of the field, covering early work and foundations, then discusses background concepts and identifies the key driving factors behind NeSy's development. It categorizes recent landmark approaches along several major characteristics of the paradigm, including neural-symbolic integration, knowledge representation, knowledge embedding, and functionality; briefly discusses successful applications of modern NeSy methods in several domains; and finally identifies open problems and potential future research directions. The survey is meant to help new researchers enter this rapidly developing field and to accelerate progress toward data- and knowledge-driven AI.
arXiv,
2022.
DOI: 10.48550/arXiv.2210.15889
Abstract:
Neural-symbolic computing (NeSy), which pursues the integration of the symbolic and statistical paradigms of cognition, has been an active research area of Artificial Intelligence (AI) for many years. As NeSy shows promise of reconciling the advantages of reasoning and interpretability of symbolic representation and robust learning in neural networks, it may serve as a catalyst for the next generation of AI. In the present paper, we provide a systematic overview of the important and recent developments of research on NeSy AI. Firstly, we introduce study history of this area, covering early work and foundations. We further discuss background concepts and identify key driving factors behind the development of NeSy. Afterward, we categorize recent landmark approaches along several main characteristics that underline this research paradigm, including neural-symbolic integration, knowledge representation, knowledge embedding, and functionality. Then, we briefly discuss the successful application of modern NeSy approaches in several domains. Finally, we identify the open problems together with potential future research directions. This survey is expected to help new researchers enter this rapidly-developing field and accelerate progress towards data-and knowledge-driven AI.
6.
张德祥
(2023-01-06 18:42):
#paper
https://doi.org/10.48550/arXiv.2212.12393
A-NeSI: A Scalable Approximate Method for Probabilistic Neurosymbolic Inference
Inspired by GFlowNet, this paper is the first to scale training on the Multi-digit MNIST Add benchmark to sums of 15-digit numbers; an artificial arithmetic prodigy seems within reach. It combines neural networks with symbolic computation.
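To see the scalability problem A-NeSI tackles, here is a back-of-the-envelope sketch (mine, not the paper's code) of exact inference for MNIST-Add: the probability of a sum marginalizes over all digit labelings, and with N digits per operand the number of joint labelings grows as 10^(2N).

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for neural-net outputs: a distribution over 0..9 per digit image.
def fake_digit_posterior():
    p = rng.random(10)
    return p / p.sum()

def exact_sum_posterior(p_digits_a, p_digits_b):
    """Exact P(sum = s) for two multi-digit numbers, by brute-force
    enumeration. Cost is 10**(len(a)+len(b)) terms -- the exponential
    blow-up A-NeSI sidesteps by training a neural inference network."""
    n = len(p_digits_a)
    post = {}
    for da in itertools.product(range(10), repeat=n):
        for db in itertools.product(range(10), repeat=n):
            pa = np.prod([p[d] for p, d in zip(p_digits_a, da)])
            pb = np.prod([p[d] for p, d in zip(p_digits_b, db)])
            s = int("".join(map(str, da))) + int("".join(map(str, db)))
            post[s] = post.get(s, 0.0) + pa * pb
    return post

# Fine for 2-digit operands (10^4 terms); hopeless at 15 digits (10^30 terms).
a = [fake_digit_posterior() for _ in range(2)]
b = [fake_digit_posterior() for _ in range(2)]
post = exact_sum_posterior(a, b)
print(max(post, key=post.get), sum(post.values()))  # most likely sum, total ~1.0
```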
arXiv,
2022.
DOI: 10.48550/arXiv.2212.12393
Abstract:
We study the problem of combining neural networks with symbolic reasoning. Recently introduced frameworks for Probabilistic Neurosymbolic Learning (PNL), such as DeepProbLog, perform exponential-time exact inference, limiting the scalability of PNL solutions. We introduce Approximate Neurosymbolic Inference (A-NeSI): a new framework for PNL that uses neural networks for scalable approximate inference. A-NeSI 1) performs approximate inference in polynomial time without changing the semantics of probabilistic logics; 2) is trained using data generated by the background knowledge; 3) can generate symbolic explanations of predictions; and 4) can guarantee the satisfaction of logical constraints at test time, which is vital in safety-critical applications. Our experiments show that A-NeSI is the first end-to-end method to scale the Multi-digit MNISTAdd benchmark to sums of 15 MNIST digits, up from 4 in competing systems. Finally, our experiments show that A-NeSI achieves explainability and safety without a penalty in performance.
7.
张德祥
(2023-01-03 19:36):
#paper https://doi.org/10.24963/ijcai.2020/243 NeurASP: Embracing Neural Networks into Answer Set Programming. By treating neural network outputs as probability distributions over atomic facts in answer set programs, NeurASP provides a simple and effective way to integrate sub-symbolic and symbolic computation.
Reasoning can help identify perception errors that violate semantic constraints, which in turn makes perception more robust. For example, an object-detection network may return a bounding box classified as "car", while it remains unclear whether it is a real car or a toy car. The distinction can be drawn by reasoning about relations with surrounding objects and applying commonsense knowledge; or, when it is unclear whether a round object attached to a car is a wheel or a donut, a reasoner can conclude from common sense that it is more likely a wheel.
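A toy sketch of this semantics (mine, not the actual NeurASP system, which compiles to ASP solvers): each neural atom takes its probability from a network's softmax output, and a query's probability is the total probability of the (here, enumerable) answer sets in which it holds.

```python
from itertools import product

# Pretend softmax outputs of a digit classifier for two images (digits 0-2).
p_img = {
    "i1": {0: 0.1, 1: 0.7, 2: 0.2},
    "i2": {0: 0.3, 1: 0.1, 2: 0.6},
}

# Rule: addition(i1, i2, S) holds iff digit(i1)=A, digit(i2)=B and S = A + B.
def prob_of_sum(target):
    total = 0.0
    for (a, pa), (b, pb) in product(p_img["i1"].items(), p_img["i2"].items()):
        if a + b == target:
            total += pa * pb   # each joint digit choice yields one answer set
    return total

print(prob_of_sum(3))   # 0.7*0.6 + 0.2*0.1 = 0.44
```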
Abstract:
We present NeurASP, a simple extension of answer set programs by embracing neural networks. By treating the neural network output as the probability distribution over atomic facts in answer set programs, NeurASP provides a simple and effective way to integrate sub-symbolic and symbolic computation. We demonstrate how NeurASP can make use of a pre-trained neural network in symbolic computation and how it can improve the neural network's perception result by applying symbolic reasoning in answer set programming. Also, NeurASP can make use of ASP rules to train a neural network better so that a neural network not only learns from implicit correlations from the data but also from the explicit complex semantic constraints expressed by the rules.
8.
张德祥
(2022-12-14 19:10):
#paper https://doi.org/10.1371/journal.pone.0277199 Structure learning enhances concept formation in synthetic Active Inference agents. Structure learning, abstraction, and concept learning are high-level functions of human cognition; objects and scene relations influence each other during inference, and current AI cannot yet achieve this kind of intelligent cognition. This paper takes a first step toward structure learning, and links action and perception under it. For learning and inference with a world model, free energy operates across the timescales of learning: from the fastest (inference), to slower learning of network parameters, to the slowest offline structure learning of the model during sleep. The paper covers all three levels, with the highest, structure learning, at its core: how structure learning arises naturally within the free-energy framework. Structure learning can also be linked to insight, the moment when things suddenly make sense. It involves Bayesian model selection and reduction.
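The workhorse behind this kind of offline structure learning is Bayesian model reduction. Stated from memory (a standard result for models that differ only in their priors, not taken from this paper's notation): the evidence of a reduced model \tilde{m} can be scored directly from the full model's posterior q(\theta), with no refitting and hence no new data:

$$ \ln p(y \mid \tilde{m}) \;=\; \ln p(y \mid m) \;+\; \ln \int q(\theta)\, \frac{\tilde{p}(\theta)}{p(\theta)}\, d\theta . $$

This is what makes sleep-like, evidence-free structure learning possible: candidate simpler structures are compared post hoc on data already assimilated.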
Abstract:
Humans display astonishing skill in learning about the environment in which they operate. They assimilate a rich set of affordances and interrelations among different elements in particular contexts, and form flexible abstractions (i.e., concepts) that can be generalised and leveraged with ease. To capture these abilities, we present a deep hierarchical Active Inference model of goal-directed behaviour, and the accompanying belief update schemes implied by maximising model evidence. Using simulations, we elucidate the potential mechanisms that underlie and influence concept learning in a spatial foraging task. We show that the representations formed-as a result of foraging-reflect environmental structure in a way that is enhanced and nuanced by Bayesian model reduction, a special case of structure learning that typifies learning in the absence of new evidence. Synthetic agents learn associations and form concepts about environmental context and configuration as a result of inferential, parametric learning, and structure learning processes-three processes that can produce a diversity of beliefs and belief structures. Furthermore, the ensuing representations reflect symmetries for environments with identical configurations.
9.
张德祥
(2022-11-19 07:13):
#paper https://doi.org/10.1038/nrn2787 The free-energy principle: a unified brain theory?
The free-energy principle accounts for action, perception, and learning. This review examines key brain theories from the biological (e.g., neural Darwinism) and physical (e.g., information theory and optimal control theory) sciences from the free-energy perspective. Crucially, one theme runs through all of these theories: optimization. Moreover, if we look closely at what is optimized, the same quantity keeps emerging, namely value (expected reward, expected utility) or its complement, surprise (prediction error, expected cost). This is the quantity optimized under the free-energy principle, suggesting that several global brain theories might be unified within a free-energy framework.
The physiology of biological systems can be reduced almost entirely to their homeostasis. More precisely, the physiological and sensory states an organism can occupy are limited, and these states define the organism's phenotype. Mathematically, this means the probability distribution over these (interoceptive and exteroceptive) sensory states must have low entropy; in other words, the system is very likely to be in one of a few states and unlikely to be in the rest. Entropy is also average self-information or "surprise" (more formally, the negative log-probability of an outcome). Here, a "fish out of water" would be in a surprising state, both emotionally and mathematically; a fish that frequently left the water would have high entropy.
The Bayesian brain hypothesis uses Bayesian probability theory to formulate perception as a constructive process based on an internal or generative model. The underlying idea is that the brain has a model of the world that it tries to optimize using sensory input, an idea related to analysis-by-synthesis and epistemological automata. On this view, the brain is an inference machine that actively predicts and explains its sensations. At the core of this hypothesis is a probabilistic model from which predictions are generated and against which sensory samples are tested to update beliefs about their causes. The generative model factorizes into a likelihood (the probability of sensory data given their causes) and a prior (the prior probability of those causes). Perception then becomes the process of inverting the likelihood model (the mapping from causes to sensations) to access the posterior probability of the causes given the sensory data (the mapping from sensations to causes).
This inversion is the same as minimizing the difference between the recognition density and the posterior density to suppress free energy. Indeed, free-energy methods were developed to solve the hard problem of exact inference by converting it into an easier optimization problem, providing powerful approximation techniques for model identification and comparison (e.g., variational Bayes or ensemble learning). The Bayesian brain hypothesis raises many interesting questions that can be addressed with the free-energy principle.
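These pieces fit together in one standard variational identity (textbook material, not specific to this paper's notation): the free energy F upper-bounds surprise, and the gap is exactly the mismatch between the recognition density q and the true posterior:

$$ F = \mathbb{E}_{q(\vartheta)}\big[\ln q(\vartheta) - \ln p(s,\vartheta)\big] = -\ln p(s) + D_{\mathrm{KL}}\big[q(\vartheta)\,\big\|\,p(\vartheta \mid s)\big] \;\ge\; -\ln p(s), $$

so minimizing F with respect to q approximates exact Bayesian inversion (perception), and keeping F low over time bounds the entropy of sensory states (homeostasis).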
Abstract:
A free-energy principle has been proposed recently that accounts for action, perception and learning. This Review looks at some key brain theories in the biological (for example, neural Darwinism) and physical (for example, information theory and optimal control theory) sciences from the free-energy perspective. Crucially, one key theme runs through each of these theories - optimization. Furthermore, if we look closely at what is optimized, the same quantity keeps emerging, namely value (expected reward, expected utility) or its complement, surprise (prediction error, expected cost). This is the quantity that is optimized under the free-energy principle, which suggests that several global brain theories might be unified within a free-energy framework.
10.
张德祥
(2022-11-16 09:16):
#paper https://doi.org/10.48550/arXiv.2206.02063 Active Bayesian Causal Inference: "We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment."
In the present work, we consider the more general setting in which we are interested in causal reasoning but are not given a reference causal model as a prior.
In this setting, causal discovery can be viewed as a means to an end rather than as the primary goal. Focusing on actively learning the full causal model for subsequent causal reasoning can be disadvantageous for two reasons. First, if we are only interested in specific aspects of the causal model, spending samples on learning the full causal graph is suboptimal. Second, discovering causal relations from small amounts of data carries significant epistemic uncertainty.
We propose Active Bayesian Causal Inference (ABCI), a fully Bayesian framework for integrating causal discovery and reasoning with experimental design. The basic approach is to place a Bayesian prior over a chosen class of causal models and to pose the learning problem as Bayesian inference over the model posterior. Given the unobserved causal model, we formalize causal reasoning by introducing a target causal query.
We follow the Bayesian optimal experimental design approach [10, 42]: based on our current beliefs about the true causal model, we choose the admissible intervention that is most informative about our target query. Given the observed data, we then update our beliefs by computing the posterior over causal models and queries, and use them to design the next experiment.
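In symbols (standard Bayesian optimal experimental design, matching the description above; notation mine): with data \mathcal{D} collected so far, target query q, and candidate intervention a with outcome Y_a, choose

$$ a^{\star} = \arg\max_{a}\; I\big(q\,;\,Y_a \mid a, \mathcal{D}\big) = \arg\max_{a}\; \mathbb{E}_{p(y \mid a, \mathcal{D})} \Big[ D_{\mathrm{KL}}\big[\, p(q \mid y, a, \mathcal{D}) \,\big\|\, p(q \mid \mathcal{D}) \,\big] \Big], $$

the expected information gain about the query; the full causal model is never estimated as such but marginalized out inside p(q \mid \cdot).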
arXiv,
2022.
DOI: 10.48550/arXiv.2206.02063
Abstract:
Causal discovery and causal reasoning are classically treated as separate and consecutive tasks: one first infers the causal graph, and then uses it to estimate causal effects of interventions. However, such a two-stage approach is uneconomical, especially in terms of actively collected interventional data, since the causal query of interest may not require a fully-specified causal model. From a Bayesian perspective, it is also unnatural, since a causal query (e.g., the causal graph or some causal effect) can be viewed as a latent quantity subject to posterior inference -- other unobserved quantities that are not of direct interest (e.g., the full causal model) ought to be marginalized out in this process and contribute to our epistemic uncertainty. In this work, we propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning, which jointly infers a posterior over causal models and queries of interest. In our approach to ABCI, we focus on the class of causally-sufficient, nonlinear additive noise models, which we model using Gaussian processes. We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment. Through simulations, we demonstrate that our approach is more data-efficient than several baselines that only focus on learning the full causal graph. This allows us to accurately learn downstream causal queries from fewer samples while providing well-calibrated uncertainty estimates for the quantities of interest.
11.
张德祥
(2022-11-16 08:17):
#paper https://doi.org/10.48550/arXiv.2204.14170
Tractable Uncertainty for Structure Learning
Unfortunately, the super-exponential space of DAGs makes both representing and learning such a posterior extremely challenging. A major breakthrough was the introduction of order-based representations (Friedman & Koller, 2003), where the state space is reduced to the space of topological orders, yet even that remains hard to compute over.
Sample-based representations cover the posterior only very sparsely, limiting the information they can provide. For example, consider finding the most probable extension of a graph given an arbitrary set of required edges: given the super-exponential space, even a large sample may not contain a single order consistent with the given edge set, making such queries impossible to answer. Compact representations are therefore needed.
The method exploits exact hierarchical conditional independencies present in order-modular distributions. This allows OrderSPNs to express distributions over sets of orders that are potentially exponentially larger relative to their size, and provides linear-time computation of Bayesian causal effects.
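To make "super-exponential" concrete, the number of labeled DAGs on n nodes follows Robinson's recurrence (a standard combinatorial fact, included here for illustration; not code from the paper):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))

for n in range(1, 8):
    print(n, num_dags(n))   # 1, 3, 25, 543, 29281, 3781503, 1138779265
```

By contrast, there are only n! topological orders, which is why order-based representations help; OrderSPNs then compress the space of orders further still.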
arXiv,
2022.
DOI: 10.48550/arXiv.2204.14170
Abstract:
Bayesian structure learning allows one to capture uncertainty over the causal directed acyclic graph (DAG) responsible for generating given data. In this work, we present Tractable Uncertainty for STructure learning (TRUST), a framework for approximate posterior inference that relies on probabilistic circuits as the representation of our posterior belief. In contrast to sample-based posterior approximations, our representation can capture a much richer space of DAGs, while also being able to tractably reason about the uncertainty through a range of useful inference queries. We empirically show how probabilistic circuits can be used as an augmented representation for structure learning methods, leading to improvement in both the quality of inferred structures and posterior uncertainty. Experimental results on conditional query answering further demonstrate the practical utility of the representational capacity of TRUST.
12.
张德祥
(2022-11-14 14:39):
#paper https://doi.org/10.48550/arXiv.2210.12761 Path integrals, particular kinds, and strange things
The FEP is a first-principles account, or method, that can be applied to any "thing", in a way that dissolves the boundaries between physics, biology, and psychology.
Applying it endorses many normative accounts of perception, action, and self-organization, ranging from cybernetics to synergetics (Ao, 2004; Ashby, 1979; Haken, 1983; Kelso, 2021); from reinforcement learning to artificial curiosity (Barto et al., 2013; Schmidhuber, 1991; Sutton & Barto, 1981; Tsividis et al., 2021); from predictive processing to universal computation (Clark, 2013b; Hohwy, 2016; Hutter, 2006); and from model predictive control to empowerment (Hafner et al., 2020; Klyubin et al., 2005), among others.
The paper uses standard results from statistical physics and information theory to unpack the argument sketched above.
arXiv,
2022.
DOI: 10.48550/arXiv.2210.12761
Abstract:
This paper describes a path integral formulation of the free energy principle. The ensuing account expresses the paths or trajectories that a particle takes as it evolves over time. The main results are a method or principle of least action that can be used to emulate the behaviour of particles in open exchange with their external milieu. Particles are defined by a particular partition, in which internal states are individuated from external states by active and sensory blanket states. The variational principle at hand allows one to interpret internal dynamics - of certain kinds of particles - as inferring external states that are hidden behind blanket states. We consider different kinds of particles, and to what extent they can be imbued with an elementary form of inference or sentience. Specifically, we consider the distinction between dissipative and conservative particles, inert and active particles and, finally, ordinary and strange particles. Strange particles (look as if they) infer their own actions, endowing them with apparent autonomy or agency. In short - of the kinds of particles afforded by a particular partition - strange kinds may be apt for describing sentient behaviour.
13.
张德祥
(2022-10-18 10:58):
#paper https://doi.org/10.48550/arXiv.2208.10601 Deriving time-averaged active inference from control principles. A theoretical account of adjusting plans through feedback from incoming observations. Assuming a fixed action space and feedforward planning can lead to very high-dimensional recursive optimization problems, and these assumptions are problematic both empirically and computationally: organisms are not born knowing [9], they learn [40]; noise [13, 32], uncertainty [23], and variability [47] make motor control imperfect, so it must be stabilized by online feedback.
Stochastic optimal feedback control requires an optimality principle that allows observations to be integrated between action steps: rather than recursively optimizing individual actions, the planned sequence is adjusted through feedback as observations arrive.
Although it optimizes a "global" (uncertain) surprise rate (Eq. 15), it only needs to plan and adjust behaviour in context.
Tadepalli and Ok [55] published the first model-based RL algorithm for the average-reward criterion in 1998, while Baxter and Bartlett [5] gave a biased policy-gradient estimator. It took another decade for Alexander and Brown [2] to give a recursive decomposition of average-cost temporal-difference learning. Zhang and Ross [61] only recently published the first adaptation of "deep" reinforcement learning (based on function approximation) to the average-cost criterion, which remains model-free. Jafarnia-Jahromi et al. [26] recently gave the first algorithm for solving infinite-horizon average-cost partially observable problems with known observation density and unknown dynamics.
Conclusion: this completes the derivation of the infinite-horizon, average-surprise formulation of active inference. Because this formulation places behavioural episodes in context, although it optimizes a "global" (uncertain) surprise rate (Eq. 15), it only needs to plan and adjust behaviour in context (e.g., from time step 1 to T). We argue that this formulation of active inference can advance model-based probabilistic approaches to hierarchical feedback control [40, 33].
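Schematically, the objective being derived is (my notation, following the abstract's "average sensory surprise over time"):

$$ \bar{\mathcal{S}} \;=\; \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}\big[ -\ln p(o_t \mid m) \big], $$

an infinite-horizon, average-cost criterion over sensory surprise, in contrast to the finite-horizon or discounted-surprise objectives used in earlier applications of active inference.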
arXiv,
2022.
DOI: 10.48550/arXiv.2208.10601
Abstract:
Active inference offers a principled account of behavior as minimizing average sensory surprise over time. Applications of active inference to control problems have heretofore tended to focus on finite-horizon or discounted-surprise problems, despite deriving from the infinite-horizon, average-surprise imperative of the free-energy principle. Here we derive an infinite-horizon, average-surprise formulation of active inference from optimal control principles. Our formulation returns to the roots of active inference in neuroanatomy and neurophysiology, formally reconnecting active inference to optimal feedback control. Our formulation provides a unified objective functional for sensorimotor control and allows for reference states to vary over time.
14.
张德祥
(2022-10-17 20:49):
#paper https://doi.org/10.1016/j.biopsycho.2021.108242 Interoception as modeling, allostasis as control
The brain must first maintain the body's normal state, and it must also prepare in advance for upcoming needs. This requires the brain to model the body: it models the external world, but it also models its own body, holding a model of the bodily self and controlling and predicting the body's future needs, e.g., warming up or taking deep breaths before a competition, and managing the endocrine, immune, and digestive systems.
Psychologists use many terms for internal models, including interoception, memory, belief, perceptual inference, unconscious inference, embodied simulation, concepts and categories, controlled hallucination, and prediction.
The brain regulates the body predictively; this is a motor-control problem, not a problem of perceiving the world. It is a problem of regulating the body along desired trajectories to achieve efficiency.
Bodily regulation comes in pairs: increasing nutrient supply on one side requires increasing waste clearance on the other, and this paired control appears in regulatory patterns throughout the body.
Embodied decision-making involves all three forms of uncertainty, each subject to allostatic regulation: uncertainty about physiological efficacy, uncertainty about motor outcomes, and uncertainty about the external world.
The paper proposes the Allostatic Path-Integral Control (APIC) model. APIC has a simple core idea: just as perceptual concepts are internal models of the body's sensory surfaces [15, 92, 14], action concepts serve as internal models of potential behaviours and their predicted outcomes.
Abstract:
The brain regulates the body by anticipating its needs and attempting to meet them before they arise - a process called allostasis. Allostasis requires a model of the changing sensory conditions within the body, a process called interoception. In this paper, we examine how interoception may provide performance feedback for allostasis. We suggest studying allostasis in terms of control theory, reviewing control theory's applications to related issues in physiology, motor control, and decision making. We synthesize these by relating them to the important properties of allostatic regulation as a control problem. We then sketch a novel formalism for how the brain might perform allostatic control of the viscera by analogy to skeletomotor control, including a mathematical view on how interoception acts as performance feedback for allostasis. Finally, we suggest ways to test implications of our hypotheses.
15.
张德祥
(2022-10-11 09:40):
#paper DOI: https://doi.org/10.1145/3428208 Scaling Exact Inference for Discrete Probabilistic Programs
The computational challenge of probabilistic inference is the main obstacle to applying probabilistic programming. How can it be solved? How can program structure be exploited to factorize inference, and how can the structure of a distribution be decoupled from its parameters? How can the compilation be proven semantically correct?
The Dice language performs inference via weighted model counting (WMC), compiling programs to weighted Boolean formulas and using binary decision diagrams (BDDs) to represent these formulas.
Experiments in Section 5 show Dice performing exact inference on a real-world probabilistic program that is 1.9 MB large.
Because it avoids exponential blow-up, Dice's compiled representation grows linearly, the computation comes with guarantees, and the compilation method is backed by a mathematical correctness proof.
A major difference from earlier probabilistic programming systems is that Dice also supports the constructs of conventional programming languages, such as if-else and loops.
One key challenge is that Dice supports arbitrary observations: it compiles a program into two BDDs, one representing the program's executions while ignoring observations, the other representing all executions that satisfy the observations.
Dice is open source.
The key insight is to separate the logical representation of the state space of the program from the probabilities
Once a program has been compiled to BDDs, Dice performs inference on the original probabilistic program via WMC. Crucially, it does so without exhaustively enumerating all paths or models, which is where the efficiency comes from.
Abstraction via conditional independence reduces computational complexity (independence, conditional independence, local structure).
See also: https://mp.weixin.qq.com/s/Rks2VGLz8G9XS3IGR7xegw
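A minimal illustration of the WMC semantics (brute-force enumeration here, which is exactly what Dice avoids by compiling to BDDs): literals get weights, and the probability of a query under an observation is a ratio of weighted model counts over the two formulas described above.

```python
from itertools import product

# Tiny program:  x ~ flip(0.3); y ~ flip(0.8); observe(x or y); return x
weights = {("x", True): 0.3, ("x", False): 0.7,
           ("y", True): 0.8, ("y", False): 0.2}

def wmc(formula):
    """Weighted model count by enumerating assignments (exponential in the
    number of variables; Dice reads the same number off a BDD instead)."""
    total = 0.0
    for x, y in product([True, False], repeat=2):
        if formula(x, y):
            total += weights[("x", x)] * weights[("y", y)]
    return total

accept = lambda x, y: x or y           # executions satisfying the observation
query  = lambda x, y: (x or y) and x   # ... in which the returned x is true

print(wmc(query) / wmc(accept))        # P(x | x or y) = 0.3 / 0.86 = 0.3488...
```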
Abstract:
Probabilistic programming languages (PPLs) are an expressive means of representing and reasoning about probabilistic models. The computational challenge of probabilistic inference remains the primary roadblock for applying PPLs in practice. Inference is fundamentally hard, so there is no one-size-fits all solution. In this work, we target scalable inference for an important class of probabilistic programs: those whose probability distributions are discrete . Discrete distributions are common in many fields, including text analysis, network verification, artificial intelligence, and graph analysis, but they prove to be challenging for existing PPLs. We develop a domain-specific probabilistic programming language called Dice that features a new approach to exact discrete probabilistic program inference. Dice exploits program structure in order to factorize inference, enabling us to perform exact inference on probabilistic programs with hundreds of thousands of random variables. Our key technical contribution is a new reduction from discrete probabilistic programs to weighted model counting (WMC). This reduction separates the structure of the distribution from its parameters, enabling logical reasoning tools to exploit that structure for probabilistic inference. We (1) show how to compositionally reduce Dice inference to WMC, (2) prove this compilation correct with respect to a denotational semantics, (3) empirically demonstrate the performance benefits over prior approaches, and (4) analyze the types of structure that allow Dice to scale to large probabilistic programs.
16.
张德祥
(2022-09-19 19:40):
#paper https://doi.org/10.48550/arXiv.2206.00426 Semantic Probabilistic Layers for Neuro-Symbolic Learning. The paper designs a predictive layer for structured-output prediction that can be plugged into a neural network and guarantees predictions consistent with the label constraints, combining probabilistic and logical reasoning by modelling complex correlations and constraints. It is currently the only implementation satisfying all six desiderata: probabilistic; highly expressive; guaranteed consistency with the logical constraints; general (constraints expressed in a formal language); a modular layer embeddable in a neural network and trainable end-to-end; and efficient (linear time). At its core, it is realised via probabilistic circuits with constraints. Applications: path planning (with obstacles, waterways, and other restrictions), hierarchical multi-label classification, and more.
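Schematically (my rendering of the construction as described in the abstract, not the paper's exact notation): SPL multiplies an expressive distribution q_\Theta, computed by a probabilistic circuit conditioned on the network's embedding of x, with a constraint circuit c that evaluates to 1 exactly on label configurations satisfying the symbolic constraints, then renormalizes:

$$ p(\mathbf{y} \mid \mathbf{x}) \;=\; \frac{ q_{\Theta}(\mathbf{y} \mid \mathbf{x})\; c(\mathbf{x}, \mathbf{y}) }{ \sum_{\mathbf{y}'} q_{\Theta}(\mathbf{y}' \mid \mathbf{x})\; c(\mathbf{x}, \mathbf{y}') } . $$

Any y violating the constraints gets probability exactly zero, and with compatible circuit structures the normalizing sum is computable in time linear in circuit size, which is how the six desiderata are met simultaneously.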
arXiv,
2022.
DOI: 10.48550/arXiv.2206.00426
Abstract:
We design a predictive layer for structured-output prediction (SOP) that can be plugged into any neural network guaranteeing its predictions are consistent with a set of predefined symbolic constraints. Our Semantic Probabilistic Layer (SPL) can model intricate correlations, and hard constraints, over a structured output space all while being amenable to end-to-end learning via maximum likelihood. SPLs combine exact probabilistic inference with logical reasoning in a clean and modular way, learning complex distributions and restricting their support to solutions of the constraint. As such, they can faithfully, and efficiently, model complex SOP tasks beyond the reach of alternative neuro-symbolic approaches. We empirically demonstrate that SPLs outperform these competitors in terms of accuracy on challenging SOP tasks including hierarchical multi-label classification, pathfinding and preference learning, while retaining perfect constraint satisfaction.
17.
张德祥
(2022-09-16 09:57):
#paper DOI:https://doi.org/10.1016/j.ijar.2021.09.012 Strudel: A fast and accurate learner of structured-decomposable probabilistic circuits
Probabilistic circuits (PCs) represent probability distributions as computational graphs, and enforcing structural properties on these graphs guarantees efficient inference.
Structured decomposability is a particularly appealing such property:
it enables efficient and exact computation of the probability of complex logical formulas, and can be used to reason about the expected output of certain predictive models under missing data.
This paper proposes Strudel (STRUctured-DEcomposable Learner), a simple, fast, and accurate learning algorithm for structured-decomposable PCs that learns probabilistic circuits directly from data.
Abstract:
Probabilistic circuits (PCs) represent a probability distribution as a computational graph. Enforcing structural properties on these graphs guarantees that several inference scenarios become tractable. Among these properties, structured decomposability is a particularly appealing one: it enables the efficient and exact computations of the probability of complex logical formulas, and can be used to reason about the expected output of certain predictive models under missing data. This paper proposes Strudel, a simple, fast and accurate learning algorithm for structured-decomposable PCs. Compared to prior work for learning structured-decomposable PCs, Strudel delivers more accurate single PC models in fewer iterations, and dramatically scales learning when building ensembles of PCs. It achieves this scalability by exploiting another structural property of PCs, called determinism, and by sharing the same computational graph across mixture components. We show these advantages on standard density estimation benchmarks and challenging inference scenarios.
18.
张德祥
(2022-09-16 09:36):
#paper
URL: http://starai.cs.ucla.edu/papers/ProbCirc20.pdf
Probabilistic circuits: A unifying framework for tractable probabilistic models
Probabilistic models are at the core of modern machine learning (ML) and artificial intelligence (AI).
Indeed, probability theory provides a principled and almost universally adopted mechanism for decision-making under uncertainty. In machine learning, for example, we assume our data come from an unknown probability distribution;
many ML tasks reduce to simply performing probabilistic inference. Similarly, many forms of model-based AI seek to directly represent the mechanisms governing the world around us as probability distributions of some form.
It is no wonder that so much attention in ML goes to learning distributions from data: we have an ever-growing collection of expressive probabilistic models serving as density estimators, coming ever closer to the distribution that generated the data.
But earlier models, including those in deep learning, are neither efficient nor exact; we want models that are theoretically sound, with controllable inference time, while remaining expressive. These desiderata can now be handled by one unifying model class, probabilistic circuits.
A notable feature of probabilistic circuits: they are themselves neural networks, in fact hierarchical mixture models (a minimal example follows).
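A minimal, self-contained example of that claim (illustrative, not from the paper): a tiny PC over two binary variables, a sum node (mixture) over product nodes (factorizations). Marginalizing a variable just sets its leaves to 1, which is what makes inference tractable.

```python
# Tiny probabilistic circuit over binary X1, X2:
# a sum node (mixture) over two product nodes, each factorizing X1 and X2.
mix = [0.4, 0.6]          # sum-node weights
leaf_x1 = [0.9, 0.2]      # P(X1=1) under components 0 and 1
leaf_x2 = [0.1, 0.7]      # P(X2=1) under components 0 and 1

def circuit(x1, x2):
    """Evaluate bottom-up; pass None to marginalize a variable out
    (its leaf evaluates to 1)."""
    total = 0.0
    for w, p1, p2 in zip(mix, leaf_x1, leaf_x2):
        f1 = 1.0 if x1 is None else (p1 if x1 else 1.0 - p1)
        f2 = 1.0 if x2 is None else (p2 if x2 else 1.0 - p2)
        total += w * f1 * f2      # product node, then weighted sum
    return total

print(circuit(1, 1))      # joint P(X1=1, X2=1) = 0.4*0.9*0.1 + 0.6*0.2*0.7 = 0.12
print(circuit(1, None))   # marginal P(X1=1)    = 0.4*0.9 + 0.6*0.2 = 0.48
```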
2020.
Abstract:
No abstract available.
19.
张德祥
(2022-09-01 22:03):
#paper https://doi.org/10.48550/arXiv.2208.11970 Understanding Diffusion Models: A Unified Perspective. Diffusion models power the recently popular image-generation models such as DALL-E 2. This paper gives a careful account of where diffusion models come from, going from the ELBO to VAEs to hierarchical VAEs and on to diffusion models, and covers three perspectives on diffusion models as well as their limitations. The derivations throughout are clear and easy to follow, making it a good resource for understanding diffusion models.
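The backbone of the derivation, in the standard notation this line of work builds on: a Markovian Gaussian noising process whose marginal has closed form,

$$ q(x_t \mid x_{t-1}) = \mathcal{N}\big(\sqrt{\alpha_t}\, x_{t-1},\, (1 - \alpha_t) I\big) \;\Rightarrow\; q(x_t \mid x_0) = \mathcal{N}\big(\sqrt{\bar{\alpha}_t}\, x_0,\, (1 - \bar{\alpha}_t) I\big), \quad \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s, $$

so that x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon, and optimizing the ELBO reduces to training a network to predict any one of three equivalent targets: the clean input x_0, the noise \epsilon, or the score \nabla_{x_t} \log p(x_t), the three perspectives the share mentions.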
arXiv,
2022.
DOI: 10.48550/arXiv.2208.11970
Abstract:
Diffusion models have shown incredible capabilities as generative models; indeed, they power the current state-of-the-art models on text-conditioned image generation such as Imagen and DALL-E 2. In this work we review, demystify, and unify the understanding of diffusion models across both variational and score-based perspectives. We first derive Variational Diffusion Models (VDM) as a special case of a Markovian Hierarchical Variational Autoencoder, where three key assumptions enable tractable computation and scalable optimization of the ELBO. We then prove that optimizing a VDM boils down to learning a neural network to predict one of three potential objectives: the original source input from any arbitrary noisification of it, the original source noise from any arbitrarily noisified input, or the score function of a noisified input at any arbitrary noise level. We then dive deeper into what it means to learn the score function, and connect the variational perspective of a diffusion model explicitly with the Score-based Generative Modeling perspective through Tweedie's Formula. Lastly, we cover how to learn a conditional distribution using diffusion models via guidance.
20.
张德祥
(2022-08-16 10:19):
#paper DOI https://doi.org/10.1007/s10339-020-00981-9 Do Process-1 simulations generate the epistemic feelings that drive Process-2 decision making? How can category-theoretic mathematics describe the human Process-1/Process-2 modes of thinking (fast and slow thinking)? The same analysis can also be applied to the experience of flow (p. 31), and it combines the global neuronal workspace with metacognition, forming a high-level architecture of human cognition. Slow Process-2 thinking is a higher-order function over fast Process-1 thinking, an upper-level scheduling mechanism for Process-1 (one that involves distinctively human higher emotions: the doubt, anxiety, frustration, and excitement of problem solving). Because slow, deliberative problem solving interacts little with the outside world, it is dominated by interoception, hence the abundance of internal affect (doubt, anxiety, excitement, and so on). The underlying mechanism remains Bayesian: beliefs and confidence.
Abstract:
We apply previously developed Chu space and Channel Theory methods, focusing on the construction of Cone-Cocone Diagrams (CCCDs), to study the role of epistemic feelings, particularly feelings of confidence, in dual process models of problem solving. We specifically consider "Bayesian brain" models of probabilistic inference within a global neuronal workspace architecture. We develop a formal representation of Process-1 problem solving in which a solution is reached if and only if a CCCD is completed. We show that in this representation, Process-2 problem solving can be represented as multiply iterated Process-1 problem solving and has the same formal solution conditions. We then model the generation of explicit, reportable subjective probabilities from implicit, experienced confidence as a simulation-based, reverse engineering process and show that this process can also be modeled as a CCCD construction.