Papers from the journal arXiv.
125 shared papers found in total; this page shows items 81 to 100.
81.
前进
(2022-12-31 11:39):
#paper Liu Y, Chen J, Wei S, et al. On Finite Difference Jacobian Computation in Deformable Image Registration[J]. arXiv preprint arXiv:2212.06060, 2022.
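Producing diffeomorphic spatial transformations has long been a central problem in deformable image registration. A diffeomorphic transformation should have a positive Jacobian determinant |J| everywhere. The number of voxels with |J|<0 has been used both to test for diffeomorphism and to measure the irregularity of a transformation.
For digital transformations, |J| is usually approximated with central differences, but this strategy can yield positive |J| for transformations that are clearly not diffeomorphic, even at voxel resolution. To show this, the paper first studies the geometric meaning of different finite difference approximations of |J|. No single finite difference approximation of |J| is sufficient to decide whether a digital transformation is diffeomorphic. The paper shows that for a 2D transformation, four unique finite difference approximations of |J| must be positive to ensure that the whole domain is invertible and free of folding at the pixel level. In 3D, ten unique finite difference approximations of |J| must be positive. The proposed digital diffeomorphism criterion addresses several errors inherent in the central difference approximation of |J| and accurately detects non-diffeomorphic digital transformations.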
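The failure of central differences can be reproduced on a toy grid. The sketch below is not the paper's code; it assumes that the four 2D approximations are the forward/backward combinations along each axis (my reading of the summary, not something the post spells out), and the grid, the 1.5-pixel displacement, and the names `fd_det` and `central_det` are all hypothetical.

```python
def fd_det(phi_x, phi_y, i, j, sx, sy):
    """|J| at grid point (i, j) from one-sided differences.

    sx, sy = +1 selects a forward difference, -1 a backward difference,
    along x (first index) and y (second index) respectively.
    """
    dxx = sx * (phi_x[i + sx][j] - phi_x[i][j])  # d(phi_x)/dx
    dxy = sy * (phi_x[i][j + sy] - phi_x[i][j])  # d(phi_x)/dy
    dyx = sx * (phi_y[i + sx][j] - phi_y[i][j])  # d(phi_y)/dx
    dyy = sy * (phi_y[i][j + sy] - phi_y[i][j])  # d(phi_y)/dy
    return dxx * dyy - dxy * dyx


def central_det(phi_x, phi_y, i, j):
    """|J| at (i, j) from central differences (ignores the value at (i, j))."""
    dxx = (phi_x[i + 1][j] - phi_x[i - 1][j]) / 2.0
    dxy = (phi_x[i][j + 1] - phi_x[i][j - 1]) / 2.0
    dyx = (phi_y[i + 1][j] - phi_y[i - 1][j]) / 2.0
    dyy = (phi_y[i][j + 1] - phi_y[i][j - 1]) / 2.0
    return dxx * dyy - dxy * dyx


# Identity transformation on a 5x5 grid, then one pixel is pushed
# 1.5 units along x, past its right-hand neighbour (a fold).
n = 5
phi_x = [[float(i) for j in range(n)] for i in range(n)]
phi_y = [[float(j) for j in range(n)] for i in range(n)]
phi_x[2][2] += 1.5

interior = [(i, j) for i in range(1, n - 1) for j in range(1, n - 1)]
combos = [(1, 1), (1, -1), (-1, 1), (-1, -1)]  # assumed set of four approximations

central_ok = all(central_det(phi_x, phi_y, i, j) > 0 for i, j in interior)
digital_ok = all(fd_det(phi_x, phi_y, i, j, sx, sy) > 0
                 for i, j in interior for sx, sy in combos)
print(central_ok, digital_ok)  # True False
```

The central difference at the displaced pixel never reads that pixel's own value, so it reports a positive determinant everywhere, while the forward difference at the same pixel is negative.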
arXiv,
2022.
DOI: 10.48550/arXiv.2212.06060
Abstract:
>>>
Producing spatial transformations that are diffeomorphic has been a central problem in deformable image registration. As a diffeomorphic transformation should have positive Jacobian determinant |J| everywhere, the number of voxels with |J|<0 has been used to test for diffeomorphism and also to measure the irregularity of the transformation. For digital transformations, |J| is commonly approximated using central difference, but this strategy can yield positive |J|'s for transformations that are clearly not diffeomorphic -- even at the voxel resolution level. To show this, we first investigate the geometric meaning of different finite difference approximations of |J|. We show that to determine diffeomorphism for digital images, use of any individual finite difference approximations of |J| is insufficient. We show that for a 2D transformation, four unique finite difference approximations of |J|'s must be positive to ensure the entire domain is invertible and free of folding at the pixel level. We also show that in 3D, ten unique finite differences approximations of |J|'s are required to be positive. Our proposed digital diffeomorphism criteria solves several errors inherent in the central difference approximation of |J| and accurately detects non-diffeomorphic digital transformations.
<<<
82.
林海onrush
(2022-11-30 21:51):
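#paper, https://doi.org/10.48550/arXiv.2211.16197, FJMP: Factorized Joint Multi-Agent Motion Prediction over Learned Directed Acyclic Interaction Graphs. This work addresses trajectory prediction for autonomous driving and proposes FJMP, a factorized joint multi-agent motion prediction framework that learns directed acyclic interaction graphs. It models the future scene interaction dynamics as a sparse directed interaction graph whose edges denote explicit interactions between agents, prunes the graph into a directed acyclic graph (DAG), and decomposes the joint prediction task according to the partial ordering of the DAG, decoding the joint future trajectories with a directed acyclic graph neural network (DAGNN). On the INTERACTION and Argoverse 2 datasets, FJMP produces more accurate and scene-consistent joint trajectory predictions than non-factorized approaches. FJMP achieves SOTA on the multi-agent INTERACTION benchmark.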
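The factorization step can be illustrated with a toy graph. This is a minimal sketch, not FJMP itself: it assumes the interaction graph has already been pruned to a DAG, the agents and edges are hypothetical, and the ordering comes from Python's standard `graphlib` rather than from a learned model.

```python
from graphlib import TopologicalSorter


def decoding_order(agents, edges):
    """Order agents so every influencer is decoded before the agents it affects."""
    sorter = TopologicalSorter({agent: set() for agent in agents})
    for influencer, reactor in edges:
        sorter.add(reactor, influencer)  # reactor depends on influencer
    return list(sorter.static_order())


def factorization(agents, edges):
    """Write the joint prediction as marginal and conditional terms."""
    parents = {agent: [] for agent in agents}
    for influencer, reactor in edges:
        parents[reactor].append(influencer)
    terms = []
    for agent in decoding_order(agents, edges):
        cond = ",".join(sorted(parents[agent]))
        terms.append(f"p({agent}|{cond})" if cond else f"p({agent})")
    return " ".join(terms)


# Hypothetical scene: A influences B and C, B influences C.
agents = ["A", "B", "C"]
edges = [("A", "B"), ("A", "C"), ("B", "C")]
print(factorization(agents, edges))  # p(A) p(B|A) p(C|A,B)
```

Agents without parents get marginal predictions; every other agent is predicted conditionally on the agents that influence it.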
arXiv,
2022.
DOI: 10.48550/arXiv.2211.16197
Abstract:
>>>
Predicting the future motion of road agents is a critical task in an autonomous driving pipeline. In this work, we address the problem of generating a set of scene-level, or joint, future trajectory predictions in multi-agent driving scenarios. To this end, we propose FJMP, a Factorized Joint Motion Prediction framework for multi-agent interactive driving scenarios. FJMP models the future scene interaction dynamics as a sparse directed interaction graph, where edges denote explicit interactions between agents. We then prune the graph into a directed acyclic graph (DAG) and decompose the joint prediction task into a sequence of marginal and conditional predictions according to the partial ordering of the DAG, where joint future trajectories are decoded using a directed acyclic graph neural network (DAGNN). We conduct experiments on the INTERACTION and Argoverse 2 datasets and demonstrate that FJMP produces more accurate and scene-consistent joint trajectory predictions than non-factorized approaches, especially on the most interactive and kinematically interesting agents. FJMP ranks 1st on the multi-agent test leaderboard of the INTERACTION dataset.
<<<
83.
张德祥
(2022-11-16 09:16):
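#paper https://doi.org/10.48550/arXiv.2206.02063 Active Bayesian Causal Inference: We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment;
In the present work, we consider a more general setting in which we are interested in causal reasoning but have no prior access to a reference causal model.
In this setting, causal discovery can be viewed as a means to an end rather than the main goal. Focusing on actively learning the full causal model to enable subsequent causal reasoning can be disadvantageous for two reasons. First, if we are only interested in specific aspects of the causal model, wasting samples on learning the full causal graph is suboptimal. Second, causal discovery from small amounts of data entails significant epistemic uncertainty;
We propose Active Bayesian Causal Inference (ABCI), a fully Bayesian framework for integrating causal discovery and reasoning with experimental design. The basic approach is to place a Bayesian prior over the chosen class of causal models and to cast the learning problem as Bayesian inference over the model posterior. Given the unobserved causal model, we formalize causal reasoning by introducing a target causal query;
We follow the Bayesian optimal experimental design approach [10,42] and select the admissible intervention that is most informative about our target query, given our current beliefs about the true causal model. Given the observed data, we then update our beliefs by computing the posterior over causal models and queries, and use them to design the next experiment.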
arXiv,
2022.
DOI: 10.48550/arXiv.2206.02063
Abstract:
>>>
Causal discovery and causal reasoning are classically treated as separate and consecutive tasks: one first infers the causal graph, and then uses it to estimate causal effects of interventions. However, such a two-stage approach is uneconomical, especially in terms of actively collected interventional data, since the causal query of interest may not require a fully-specified causal model. From a Bayesian perspective, it is also unnatural, since a causal query (e.g., the causal graph or some causal effect) can be viewed as a latent quantity subject to posterior inference -- other unobserved quantities that are not of direct interest (e.g., the full causal model) ought to be marginalized out in this process and contribute to our epistemic uncertainty. In this work, we propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning, which jointly infers a posterior over causal models and queries of interest. In our approach to ABCI, we focus on the class of causally-sufficient, nonlinear additive noise models, which we model using Gaussian processes. We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment. Through simulations, we demonstrate that our approach is more data-efficient than several baselines that only focus on learning the full causal graph. This allows us to accurately learn downstream causal queries from fewer samples while providing well-calibrated uncertainty estimates for the quantities of interest.
<<<
84.
张德祥
(2022-11-16 08:17):
#paper https://doi.org/10.48550/arXiv.2204.14170
Tractable Uncertainty for Structure Learning
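Unfortunately, the super-exponential space of DAGs makes both representing and learning such posteriors extremely challenging. A major breakthrough was the introduction of order-based representations (Friedman & Koller, 2003), in which the state space is reduced to the space of topological orders; even so, computation remains difficult.
Sample-based representations cover the posterior only very sparsely, which limits the information they can provide. For example, consider the problem of finding the most probable graph extension given an arbitrary set of required edges. Given the super-exponential space, even a large sample may not contain a single order consistent with the given edge set, which makes such a query impossible to answer. A compact representation is therefore needed.
The method exploits the exact hierarchical conditional independences present in order-modular distributions. This allows OrderSPNs to express distributions over a set of orders that is potentially exponentially larger than their size, and it provides linear-time computation of Bayesian causal effects.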
arXiv,
2022.
DOI: 10.48550/arXiv.2204.14170
Abstract:
>>>
Bayesian structure learning allows one to capture uncertainty over the causal directed acyclic graph (DAG) responsible for generating given data. In this work, we present Tractable Uncertainty for STructure learning (TRUST), a framework for approximate posterior inference that relies on probabilistic circuits as the representation of our posterior belief. In contrast to sample-based posterior approximations, our representation can capture a much richer space of DAGs, while also being able to tractably reason about the uncertainty through a range of useful inference queries. We empirically show how probabilistic circuits can be used as an augmented representation for structure learning methods, leading to improvement in both the quality of inferred structures and posterior uncertainty. Experimental results on conditional query answering further demonstrate the practical utility of the representational capacity of TRUST.
<<<
85.
张德祥
(2022-11-14 14:39):
#paper https://doi.org/10.48550/arXiv.2210.12761 Path integrals, particular kinds, and strange things
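The free energy principle (FEP) is a first-principles account or method that can be applied to any "thing",
in a way that dissolves the boundaries between physics, biology, and psychology.
This application licenses many normative accounts of sentient behavior and self-organization,
ranging from cybernetics to synergetics (Ao, 2004; Ashby, 1979; Haken, 1983; Kelso, 2021);
from reinforcement learning to artificial curiosity (Barto et al., 2013; Schmidhuber, 1991; Sutton and Barto, 1981; Tsividis et al., 2021);
from predictive processing to universal computation (Clark, 2013b; Hohwy, 2016; Hutter, 2006);
from model predictive control to empowerment (Hafner et al., 2020; Klyubin et al., 2005), and so on.
The paper unpacks the arguments above using standard results from statistical physics and information theory.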
arXiv,
None.
DOI: 10.48550/arXiv.2210.12761
Abstract:
>>>
This paper describes a path integral formulation of the free energy principle. The ensuing account expresses the paths or trajectories that a particle takes as it evolves over time. The main results are a method or principle of least action that can be used to emulate the behaviour of particles in open exchange with their external milieu. Particles are defined by a particular partition, in which internal states are individuated from external states by active and sensory blanket states. The variational principle at hand allows one to interpret internal dynamics - of certain kinds of particles - as inferring external states that are hidden behind blanket states. We consider different kinds of particles, and to what extent they can be imbued with an elementary form of inference or sentience. Specifically, we consider the distinction between dissipative and conservative particles, inert and active particles and, finally, ordinary and strange particles. Strange particles (look as if they) infer their own actions, endowing them with apparent autonomy or agency. In short - of the kinds of particles afforded by a particular partition - strange kinds may be apt for describing sentient behaviour.
<<<
86.
林李泽强
(2022-10-31 23:29):
#paper doi:arxiv.org/abs/2210.09217 Statistical learning methods for neuroimaging data analysis with applications
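This is a preprint that has not yet been formally published; the authors are researchers with a statistics background.
In this article, the authors give a comprehensive review, from a statistical perspective, of the statistical problems that arise from neuroimaging techniques through large-scale neuroimaging studies to statistical learning methods.
The article has three main parts: (1) a review of image processing methods from a statistical perspective; (2) an introduction to several current state-of-the-art neuroimaging datasets; (3) an introduction, from a statistical perspective, to nine classes of statistical methods for imaging data.
The article discusses neuroimaging problems from a statistical point of view. It is suitable as an introduction to the field for readers with a mathematical background, and also for researchers from other backgrounds who want to look at neuroimaging data analysis from a statistical angle.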
arXiv,
2022.
Abstract:
>>>
The aim of this paper is to provide a comprehensive review of statistical challenges in neuroimaging data analysis from neuroimaging techniques to large-scale neuroimaging studies to statistical learning methods. We briefly review eight popular neuroimaging techniques and their potential applications in neuroscience research and clinical translation. We delineate the four common themes of neuroimaging data and review major image processing analysis methods for processing neuroimaging data at the individual level. We briefly review four large-scale neuroimaging-related studies and a consortium on imaging genomics and discuss four common themes of neuroimaging data analysis at the population level. We review nine major population-based statistical analysis methods and their associated statistical challenges and present recent progress in statistical methodology to address these challenges.
<<<
87.
song
(2022-10-31 12:02):
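#paper Conditional Diffusion Probabilistic Model for Speech Enhancement, https://arxiv.org/abs/2202.05256# Generic diffusion models do not perform well on speech tasks. The reason is that diffusion models assume all noise is Gaussian, whereas in speech tasks only a small part of the noise is Gaussian (white noise); most of it is stationary or non-stationary noise of various kinds. The paper addresses this by conditioning the reverse and diffusion processes not only on the output of the previous step but also on the noisy speech y: instead of combining each step with pure Gaussian noise, each step uses the product of the Gaussian noise and the difference between the noisy speech and the speech at the current step. In this process the model learns the characteristics of the noisy speech (non-Gaussian noise). The method thus addresses the problem of applying diffusion models to data that is not Gaussian-distributed. Speech enhancement is a special case, however: its datasets already contain both clean and noisy speech, which makes the task well suited to this method, whereas other speech tasks do not necessarily have clean speech available as input. Voice conversion, for example, lacks a large amount of target speech to serve as clean input, so further research could build on this work.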
arXiv,
2022.
Abstract:
>>>
Speech enhancement is a critical component of many user-oriented audio applications, yet current systems still suffer from distorted and unnatural outputs. While generative models have shown strong potential in speech synthesis, they are still lagging behind in speech enhancement. This work leverages recent advances in diffusion probabilistic models, and proposes a novel speech enhancement algorithm that incorporates characteristics of the observed noisy speech signal into the diffusion and reverse processes. More specifically, we propose a generalized formulation of the diffusion probabilistic model named conditional diffusion probabilistic model that, in its reverse process, can adapt to non-Gaussian real noises in the estimated speech signal. In our experiments, we demonstrate strong performance of the proposed approach compared to representative generative models, and investigate the generalization capability of our models to other datasets with noise characteristics unseen during training.
<<<
88.
林海onrush
(2022-10-29 13:58):
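#paper, Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning, url: https://arxiv.org/abs/1811.12808#,
This paper reviews different techniques for the three tasks of model evaluation, model selection, and algorithm selection, and discusses the main advantages and disadvantages of each technique with reference to theoretical and empirical studies. It then gives recommendations to promote best practices in machine learning research and applications.
A detailed walkthrough of the paper is in the PDF below.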
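One of the model evaluation techniques the review covers is k-fold cross-validation. The sketch below only shows how the index split works; the function name `k_fold_indices` is hypothetical, there is no shuffling or stratification, and it is not taken from the paper.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Samples are split in order into k folds whose sizes differ by at most one;
    each fold is used once as the test set while the rest form the training set.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


for train_idx, test_idx in k_fold_indices(10, 3):
    print(len(train_idx), test_idx)
```

In practice the performance estimate is the average of the per-fold test scores, and the data are usually shuffled before splitting.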
arXiv,
2018.
DOI: 10.48550/arXiv.1811.12808
Abstract:
>>>
The correct use of model evaluation, model selection, and algorithm selection techniques is vital in academic machine learning research as well as in many industrial settings. This article reviews different techniques that can be used for each of these three subtasks and discusses the main advantages and disadvantages of each technique with references to theoretical and empirical studies. Further, recommendations are given to encourage best yet feasible practices in research and applications of machine learning. Common methods such as the holdout method for model evaluation and selection are covered, which are not recommended when working with small datasets. Different flavors of the bootstrap technique are introduced for estimating the uncertainty of performance estimates, as an alternative to confidence intervals via normal approximation if bootstrapping is computationally feasible. Common cross-validation techniques such as leave-one-out cross-validation and k-fold cross-validation are reviewed, the bias-variance trade-off for choosing k is discussed, and practical tips for the optimal choice of k are given based on empirical evidence. Different statistical tests for algorithm comparisons are presented, and strategies for dealing with multiple comparisons such as omnibus tests and multiple-comparison corrections are discussed. Finally, alternative methods for algorithm selection, such as the combined F-test 5x2 cross-validation and nested cross-validation, are recommended for comparing machine learning algorithms when datasets are small.
<<<
89.
林海onrush
(2022-10-29 13:51):
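#paper, Formal Algorithms for Transformers, url: https://arxiv.org/pdf/2207.09238.pdf. Over the past five-plus years, Transformers have shown striking results in many domains. However, descriptions of Transformer algorithms have mostly relied on diagrams, prose, or explanations that focus on the optimization parts, and no paper had given reasonably complete algorithm pseudocode. DeepMind has now provided formal algorithm pseudocode. A detailed walkthrough of the paper is in the PDF below.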
arXiv,
2022.
DOI: 10.48550/arXiv.2207.09238
Abstract:
>>>
This document aims to be a self-contained, mathematically precise overview of transformer architectures and algorithms (*not* results). It covers what transformers are, how they are trained, what they are used for, their key architectural components, and a preview of the most prominent models. The reader is assumed to be familiar with basic ML terminology and simpler neural network architectures such as MLPs.
<<<
90.
林海onrush
(2022-10-29 13:25):
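#paper, CAUSAL DISCOVERY WITH REINFORCEMENT LEARNING, paper: https://arxiv.org/pdf/1906.04477.pdf, official video introduction: https://iclr.cc/virtual_2020/poster_S1g2skStPB.html,
Causal research, as a potential next hot topic, has attracted wide attention in the machine learning and deep learning community. A classic problem in causal research is causal discovery: recovering the underlying causal graph structure from passively observed data.
This paper from Huawei Noah's Ark Lab was accepted at ICLR 2020 with full review scores. In it, the lab's causal research team applies reinforcement learning to score-based causal discovery. A self-attention-based encoder-decoder neural network explores the relationships in the data, combined with the conditions on causal structure, and the network parameters are trained with a policy-gradient reinforcement learning algorithm to obtain the final causal graph. On several data models commonly used in academia, the method outperforms other methods on medium-sized graphs, including traditional causal discovery algorithms and recent gradient-based algorithms. The method is also very flexible and can be combined with any score function.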
arXiv,
2019.
DOI: 10.48550/arXiv.1906.04477
Abstract:
>>>
Discovering causal structure among a set of variables is a fundamental problem in many empirical sciences. Traditional score-based casual discovery methods rely on various local heuristics to search for a Directed Acyclic Graph (DAG) according to a predefined score function. While these methods, e.g., greedy equivalence search, may have attractive results with infinite samples and certain model assumptions, they are usually less satisfactory in practice due to finite data and possible violation of assumptions. Motivated by recent advances in neural combinatorial optimization, we propose to use Reinforcement Learning (RL) to search for the DAG with the best scoring. Our encoder-decoder model takes observable data as input and generates graph adjacency matrices that are used to compute rewards. The reward incorporates both the predefined score function and two penalty terms for enforcing acyclicity. In contrast with typical RL applications where the goal is to learn a policy, we use RL as a search strategy and our final output would be the graph, among all graphs generated during training, that achieves the best reward. We conduct experiments on both synthetic and real datasets, and show that the proposed approach not only has an improved search ability but also allows a flexible score function under the acyclicity constraint.
<<<
91.
尹志
(2022-10-27 20:44):
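#paper doi: https://doi.org/10.48550/arXiv.1708.02002, Focal Loss for Dense Object Detection. (ICCV 2017) This is a classic paper in object detection. Object detection has long had two families of models: one-stage and two-stage detectors. The former are mainly YOLO and SSD; the latter mostly derive from R-CNN. In general, one-stage detectors are faster than two-stage detectors but less accurate. In principle, a two-stage detector first generates a set of candidate boxes and then classifies those candidates accurately, whereas a one-stage detector directly generates a large, dense set of detection boxes. Can a one-stage detector be both fast and as accurate as a two-stage one? The paper argues that one-stage detectors currently trail in accuracy because of class imbalance. In a two-stage detector, the first stage already filters out most background samples, but a one-stage detector produces very dense candidates in a single pass, so the foreground-background imbalance is severe and holds accuracy back. The authors therefore add a scaling factor to the standard cross-entropy loss that automatically down-weights easy examples during training, so that the model handles hard examples better. This is the well-known focal loss. Based on focal loss, the authors design a one-stage detection network, RetinaNet. In their experimental comparison, RetinaNet achieves SOTA performance in both speed and accuracy, reaching 39.1 AP on the COCO dataset (an excellent result at the time).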
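The scaling factor described above can be written out for a single binary prediction. This is a minimal sketch using the commonly cited form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the function name and the default values gamma = 2 and alpha = 0.25 are assumptions for illustration, not something stated in the post.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example.

    p: predicted probability of the positive class (0 < p < 1)
    y: ground-truth label, 1 (foreground) or 0 (background)
    """
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(p, y):
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


# An easy example (p_t = 0.9) is down-weighted far more than a hard one (p_t = 0.1).
for p in (0.9, 0.1):
    print(p, focal_loss(p, 1) / cross_entropy(p, 1))
```

With gamma = 0 and alpha_t = 1 the expression reduces to ordinary cross-entropy, which is why gamma is described as the focusing parameter.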
arXiv,
2018.
DOI: 10.48550/arXiv.1708.02002
Abstract:
>>>
The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors. Code is at: this https URL.
<<<
92.
王昊
(2022-10-25 10:11):
#paper doi: 10.48550/arXiv.2110.07342
So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, and Ruslan Salakhutdinov. 2022. FILM: Following Instructions in Language with Modular Methods. Retrieved July 13, 2022 from http://arxiv.org/abs/2110.07342.
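An algorithm paper for the vision-language navigation task; it is currently the 4th-ranked method on the ALFRED dataset. The paper proposes a modular method with structured representations that (1) builds a semantic map of the scene and (2) explores with a semantic search policy to achieve the natural-language goal. FILM has four components: 1. converting the language instruction into a structured form (language processing); 2. converting egocentric visual input into a semantic metric map (semantic mapping); 3. guiding exploration toward the language goal (semantic search policy); 4. outputting the subsequent navigation/interaction actions (deterministic policy). FILM does not require any input that provides sequential guidance, i.e. neither expert trajectories nor low-level language instructions (which dictate the order of steps).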
arXiv,
2022.
DOI: 10.48550/arXiv.2110.07342
Abstract:
>>>
Recent methods for embodied instruction following are typically trained end-to-end using imitation learning. This often requires the use of expert trajectories and low-level language instructions. Such approaches assume that neural states will integrate multimodal semantics to perform state tracking, building spatial memory, exploration, and long-term planning. In contrast, we propose a modular method with structured representations that (1) builds a semantic map of the scene and (2) performs exploration with a semantic search policy, to achieve the natural language goal. Our modular method achieves SOTA performance (24.46 %) with a substantial (8.17 % absolute) gap from previous work while using less data by eschewing both expert trajectories and low-level instructions. Leveraging low-level language, however, can further increase our performance (26.49 %). Our findings suggest that an explicit spatial memory and a semantic search policy can provide a stronger and more general representation for state-tracking and guidance, even in the absence of expert trajectories or low-level instructions.
<<<
93.
张浩彬
(2022-10-20 16:20):
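#paper 1. Unsupervised Scalable Representation Learning for Multivariate Time Series, https://doi.org/10.48550/arXiv.1901.10738
Key elements of the paper: construction of positive and negative samples, a triplet loss, and causal dilated convolutions.
Applicability: this unsupervised model can be used on variable-length series, both short and long.
Code: https://github.com/White-Link/UnsupervisedScalableRepresentationLearningTimeSeries
Positive and negative sample construction:
Given N series, pick one series, choose a random length, and extract a subseries ref. Within ref, randomly sample a subseries as the positive sample pos; from the other series (if any), randomly sample K subseries as negative samples neg, where K is a hyperparameter.
The encoder has three requirements: (1) it must extract sequence features; (2) it must accept variable-length input; (3) it must be time- and memory-efficient. (Personally, I think these are just justifications for using convolutions.) The paper therefore uses exponentially dilated causal convolutions as the feature extractor instead of the traditional RNN or LSTM.
A modified triplet loss.
On time series classification, the results show that the method outperforms existing unsupervised methods and is no worse than supervised methods. On forecasting, few comparisons were made.
Univariate classification: all time series classification tasks in the UCR archive were used.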
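The sample construction described in this post can be sketched as follows. This follows the commenter's description rather than the authors' released code; the helper names, the toy dataset, and the fallback to the same series when no other series exists are assumptions for illustration.

```python
import random


def sample_subseries(series, rng):
    """Return a random contiguous subseries of random length (at least 1)."""
    length = rng.randint(1, len(series))         # randint is inclusive on both ends
    start = rng.randint(0, len(series) - length)
    return series[start:start + length]


def sample_triplet(dataset, index, k, rng):
    """Build (ref, pos, negs) for the series at `index`.

    ref  : random subseries of the chosen series
    pos  : random subseries of ref (positive sample)
    negs : k random subseries drawn from other series (negative samples)
    """
    ref = sample_subseries(dataset[index], rng)
    pos = sample_subseries(ref, rng)
    others = [s for i, s in enumerate(dataset) if i != index] or [dataset[index]]
    negs = [sample_subseries(rng.choice(others), rng) for _ in range(k)]
    return ref, pos, negs


rng = random.Random(0)
dataset = [list(range(10)), list(range(100, 120)), list(range(200, 215))]
ref, pos, negs = sample_triplet(dataset, index=0, k=3, rng=rng)
print(len(ref), len(pos), [len(n) for n in negs])
```

The triplet loss then pulls the embeddings of ref and pos together and pushes the embeddings of ref and each neg apart.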
arXiv,
2019.
DOI: 10.48550/arXiv.1901.10738
Abstract:
>>>
Time series constitute a challenging data type for machine learning algorithms, due to their highly variable lengths and sparse labeling in practice. In this paper, we tackle this challenge by proposing an unsupervised method to learn universal embeddings of time series. Unlike previous works, it is scalable with respect to their length and we demonstrate the quality, transferability and practicability of the learned representations with thorough experiments and comparisons. To this end, we combine an encoder based on causal dilated convolutions with a novel triplet loss employing time-based negative sampling, obtaining general-purpose representations for variable length and multivariate time series.
<<<
94.
张德祥
(2022-10-18 10:58):
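#paper https://doi.org/10.48550/arXiv.2208.10601 Deriving time-averaged active inference from control principles. A theoretical realization of adjusting plans through continual observational feedback. Existing approaches assume a fixed action space and feedforward planning, which can lead to very high-dimensional recursive optimization problems. These assumptions are problematic both empirically and computationally. Organisms are not born knowing [9]; they learn [40]. Noise [13,32], uncertainty [23], and variability [47] make motor control imperfect, so it must be stabilized through online feedback.
Stochastic optimal feedback control requires an optimality principle that allows observations to be integrated between action steps. Rather than recursively optimizing individual actions, the planned sequence is adjusted through continual observational feedback.
Although the formulation optimizes a "global" (indefinite) surprise rate, it only needs to plan and adjust behavior within a context.
Tadepalli and Ok [55] published the first model-based RL algorithm in 1998, while Baxter and Bartlett [5] gave a biased policy gradient estimator. It took another decade for Alexander and Brown [2] to give a recursive decomposition for average-cost temporal-difference learning. Zhang and Ross [61] only recently published the first adaptation of "deep" reinforcement learning algorithms (based on function approximation) to the average-cost criterion, and it remains model-free. Jafarnia-Jahromi et al. [26] recently gave the first algorithm for solving infinite-horizon average-cost partially observable problems with known observation densities and unknown dynamics.
Conclusion: this completes the derivation of the infinite-horizon, average-surprise formulation of active inference. Because our formulation places behavioral episodes in context, it only requires planning and adjusting behavior within a context (e.g., from time step 1 to T), even though it optimizes a "global" (indefinite) surprise rate (Eq. 15). We suggest that this active inference formulation can advance model-based probabilistic approaches to hierarchical feedback control [40,33].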
arXiv,
2022.
DOI: 10.48550/arXiv.2208.10601
Abstract:
>>>
Active inference offers a principled account of behavior as minimizing average sensory surprise over time. Applications of active inference to control problems have heretofore tended to focus on finite-horizon or discounted-surprise problems, despite deriving from the infinite-horizon, average-surprise imperative of the free-energy principle. Here we derive an infinite-horizon, average-surprise formulation of active inference from optimal control principles. Our formulation returns to the roots of active inference in neuroanatomy and neurophysiology, formally reconnecting active inference to optimal feedback control. Our formulation provides a unified objective functional for sensorimotor control and allows for reference states to vary over time.
<<<
95.
Arwen
(2022-09-30 23:41):
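#paper doi:https://doi.org/10.48550/arXiv.2202.02000, Cross-Modality Multi-Atlas Segmentation via Deep Registration and Label Fusion. Multi-atlas segmentation is a fairly effective approach to medical image segmentation. Typically, it nonlinearly registers multiple atlases to the individual image, warps the corresponding atlas label maps into the individual image space, and fuses them with a fusion algorithm to obtain the segmentation of the individual image. Traditional multi-atlas segmentation is limited in two ways: the registration step is computationally expensive, and the label fusion algorithm affects the accuracy of the final segmentation. The paper builds two neural networks: one generates the deformation field that maps an atlas into the individual space, and the other computes the fusion weight of each atlas's segmentation labels for the subsequent label fusion. That said, I find the paper mediocre. The registration network could have produced a reasonable deformation field with the scaling and squaring algorithm, yet the authors insist on an unnecessary novelty, probably just to pad the paper.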
arXiv,
2022.
DOI: 10.48550/arXiv.2202.02000
Abstract:
>>>
Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation. Generally, MAS methods register multiple atlases, i.e., medical images with corresponding labels, to a target image; and the transformed atlas labels can be combined to generate target segmentation via label fusion schemes. Many conventional MAS methods employed the atlases from the same modality as the target image. However, the number of atlases with the same modality may be limited or even missing in many clinical applications. Besides, conventional MAS methods suffer from the computational burden of registration or label fusion procedures. In this work, we design a novel cross-modality MAS framework, which uses available atlases from a certain modality to segment a target image from another modality. To boost the computational efficiency of the framework, both the image registration and label fusion are achieved by well-designed deep neural networks. For the atlas-to-target image registration, we propose a bi-directional registration network (BiRegNet), which can efficiently align images from different modalities. For the label fusion, we design a similarity estimation network (SimNet), which estimates the fusion weight of each atlas by measuring its similarity to the target image. SimNet can learn multi-scale information for similarity estimation to improve the performance of label fusion. The proposed framework was evaluated by the left ventricle and liver segmentation tasks on the MM-WHS and CHAOS datasets, respectively. Results have shown that the framework is effective for cross-modality MAS in both registration and label fusion.
<<<
96.
Ricardo
(2022-09-30 23:32):
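#paper doi:https://doi.org/10.48550/arXiv.2202.03563, Aladdin: Joint Atlas Building and Diffeomorphic Registration Learning with Pairwise Alignment. Atlas building and image registration are important tasks in medical image analysis, but atlas estimation and the computation of nonparametric deformations are computationally very expensive. In addition, previous atlas building methods usually drive model optimization with the similarity between a fuzzy atlas and each individual image, which can make registration between the estimated atlas and the individual images harder, because the estimated fuzzy atlas lacks the clear anatomical structures of the individual images. The paper uses a forward model to constrain the space of generated atlases from several angles and provides thorough theoretical analysis. However, because the model is fairly complex and involves simultaneous optimization over all images, it is not well suited to 3D image data; so far the experiments are only on 2D image data.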
arXiv,
2022.
DOI: 10.48550/arXiv.2202.03563
Abstract:
>>>
Atlas building and image registration are important tasks for medical image analysis. Once one or multiple atlases from an image population have been constructed, commonly (1) images are warped into an atlas space to study intra-subject or inter-subject variations or (2) a possibly probabilistic atlas is warped into image space to assign anatomical labels. Atlas estimation and nonparametric transformations are computationally expensive as they usually require numerical optimization. Additionally, previous approaches for atlas building often define similarity measures between a fuzzy atlas and each individual image, which may cause alignment difficulties because a fuzzy atlas does not exhibit clear anatomical structures in contrast to the individual images. This work explores using a convolutional neural network (CNN) to jointly predict the atlas and a stationary velocity field (SVF) parameterization for diffeomorphic image registration with respect to the atlas. Our approach does not require affine pre-registrations and utilizes pairwise image alignment losses to increase registration accuracy. We evaluate our model on 3D knee magnetic resonance images (MRI) from the OAI-ZIB dataset. Our results show that the proposed framework achieves better performance than other state-of-the-art image registration algorithms, allows for end-to-end training, and for fast inference at test time.
<<<
97.
林海onrush
(2022-09-30 22:25):
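#paper arXiv, 2209.00796 (2022), Diffusion Models: A Comprehensive Survey of Methods and Applications. Diffusion models perform very well in many fields, and different variants have emerged in different application areas. The paper systematically surveys applied research on diffusion models across the following areas: computer vision, NLP, waveform signal processing, multi-modal modeling, molecular graph modeling, time series modeling, and adversarial purification. The main contributions are summarized as follows. A new taxonomy: the authors propose a new, systematic taxonomy of diffusion models and their applications. Models are divided into three categories: sampling-acceleration enhancement, likelihood-maximization enhancement, and data-generalization enhancement. Applications are divided into the seven areas listed above. The paper gives a comprehensive overview of modern diffusion models and their applications, presents the main improvements of each model, makes the necessary comparisons with the original model, and summarizes the corresponding papers. The basic idea of a diffusion model is to use a forward diffusion process to systematically perturb the distribution of the data and then to recover that distribution by learning the reverse diffusion process, which yields a highly flexible and tractable generative model.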
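The forward (perturbing) half of this idea can be sketched in a few lines. The example assumes a DDPM-style Gaussian forward process with a linear noise schedule, which is one common instance rather than the only formulation the survey covers; the schedule endpoints, step count, and function names are illustrative assumptions.

```python
import math
import random


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t), one value per diffusion step."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_diffuse(x0, t, abar, rng):
    """Sample x_t from N(sqrt(abar_t) * x0, (1 - abar_t) * I) in one shot."""
    a = abar[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in x0]


# Assumed linear schedule over 1000 steps.
steps = 1000
betas = [1e-4 + (0.02 - 1e-4) * i / (steps - 1) for i in range(steps)]
abar = alpha_bars(betas)

rng = random.Random(0)
x0 = [1.0, -2.0, 0.5]
x_early = forward_diffuse(x0, 0, abar, rng)           # almost the original data
x_late = forward_diffuse(x0, steps - 1, abar, rng)    # almost pure Gaussian noise
print(abar[0], abar[-1])
```

A generative model is then trained to undo these steps one at a time, starting from pure noise.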
arXiv,
2022.
DOI: 10.48550/arXiv.2209.00796
Abstract:
>>>
Diffusion models are a class of deep generative models that have shown impressive results on various tasks with a solid theoretical foundation. Despite demonstrated success than state-of-the-art approaches, diffusion models often entail costly sampling procedures and sub-optimal likelihood estimation. Significant efforts have been made to improve the performance of diffusion models in various aspects. In this article, we present a comprehensive review of existing variants of diffusion models. Specifically, we provide the taxonomy of diffusion models and categorize them into three types: sampling-acceleration enhancement, likelihood-maximization enhancement, and data-generalization enhancement. We also introduce the other generative models (i.e., variational autoencoders, generative adversarial networks, normalizing flow, autoregressive models, and energy-based models) and discuss the connections between diffusion models and these generative models. Then we review the applications of diffusion models, including computer vision, natural language processing, waveform signal processing, multi-modal modeling, molecular graph generation, time series modeling, and adversarial purification. Furthermore, we propose new perspectives pertaining to the development of generative models. Github: this https URL.
98.
尹志
(2022-09-30 11:06):
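#paper doi:10.48550/arXiv.1907.10830 U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation, ICLR 2020. This is another image-to-image translation paper, again with an effective improvement to the network architecture. The authors achieve unsupervised image translation by proposing a new attention module and a new normalization function. The proposed attention module handles geometric deformation of images well, which also makes the architecture especially effective for many kinds of artistic style changes.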
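A rough NumPy sketch of the new normalization function mentioned above (called AdaLIN in the paper's abstract): a per-channel blend of instance-normalized and layer-normalized activations, followed by a scale and shift. Passing `gamma`, `beta` and `rho` as plain arrays is a simplifying assumption for illustration; in the paper these parameters are learned.

```python
import numpy as np

def adalin(x, gamma, beta, rho, eps=1e-5):
    """x: (N, C, H, W) activations; gamma, beta, rho: arrays of shape (C,)."""
    # Instance norm statistics: per sample, per channel.
    mu_in = x.mean(axis=(2, 3), keepdims=True)
    var_in = x.var(axis=(2, 3), keepdims=True)
    x_in = (x - mu_in) / np.sqrt(var_in + eps)
    # Layer norm statistics: per sample, across all channels and positions.
    mu_ln = x.mean(axis=(1, 2, 3), keepdims=True)
    var_ln = x.var(axis=(1, 2, 3), keepdims=True)
    x_ln = (x - mu_ln) / np.sqrt(var_ln + eps)
    # rho in [0, 1] chooses the mix between the two normalizations.
    rho = np.clip(rho, 0.0, 1.0).reshape(1, -1, 1, 1)
    mixed = rho * x_in + (1.0 - rho) * x_ln
    return gamma.reshape(1, -1, 1, 1) * mixed + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 4))
out = adalin(x, gamma=np.ones(3), beta=np.zeros(3), rho=np.full(3, 0.5))
```

With `rho` near 1 the layer behaves like instance normalization, and with `rho` near 0 like layer normalization, which is how the amount of shape and texture change can be tuned per dataset.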
arXiv,
2019.
DOI: 10.48550/arXiv.1907.10830
Abstract:
We propose a novel method for unsupervised image-to-image translation, which incorporates a new attention module and a new learnable normalization function in an end-to-end manner. The attention module guides our model to focus on more important regions distinguishing between source and target domains based on the attention map obtained by the auxiliary classifier. Unlike previous attention-based method which cannot handle the geometric changes between domains, our model can translate both images requiring holistic changes and images requiring large shape changes. Moreover, our new AdaLIN (Adaptive Layer-Instance Normalization) function helps our attention-guided model to flexibly control the amount of change in shape and texture by learned parameters depending on datasets. Experimental results show the superiority of the proposed method compared to the existing state-of-the-art models with a fixed network architecture and hyper-parameters. Our code and datasets are available at this https URL or this https URL.
99.
前进
(2022-09-29 12:12):
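#paper Affine Medical Image Registration with Coarse-to-Fine Vision Transformer. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 20835-20844
Affine registration is an indispensable part of a comprehensive medical image registration pipeline. However, there is little research on fast, robust affine registration algorithms. Most of these studies use CNN models for joint affine and deformable registration, while the standalone performance of the affine subnetwork has been studied less. In addition, existing CNN-based affine registration methods focus either on the local misalignment or on the global orientation and position of the input when predicting the affine transformation matrix; such methods are sensitive to spatial initialization and generalize poorly. This paper proposes C2FViT, a fast, robust, learning-based algorithm for 3D affine medical image registration. The method naturally leverages the global connectivity of the Transformer, the locality of CNNs, and a multi-resolution strategy to learn global affine registration, and it is evaluated on 3D brain atlas registration. The results show that the method performs well in registration accuracy, robustness, registration speed, and generalizability.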
arXiv,
2022.
DOI: 10.48550/arXiv.2203.15216
Abstract:
Affine registration is indispensable in a comprehensive medical image registration pipeline. However, only a few studies focus on fast and robust affine registration algorithms. Most of these studies utilize convolutional neural networks (CNNs) to learn joint affine and non-parametric registration, while the standalone performance of the affine subnetwork is less explored. Moreover, existing CNN-based affine registration approaches focus either on the local misalignment or the global orientation and position of the input to predict the affine transformation matrix, which are sensitive to spatial initialization and exhibit limited generalizability apart from the training dataset. In this paper, we present a fast and robust learning-based algorithm, Coarse-to-Fine Vision Transformer (C2FViT), for 3D affine medical image registration. Our method naturally leverages the global connectivity and locality of the convolutional vision transformer and the multi-resolution strategy to learn the global affine registration. We evaluate our method on 3D brain atlas registration and template-matching normalization. Comprehensive results demonstrate that our method is superior to the existing CNNs-based affine registration methods in terms of registration accuracy, robustness and generalizability while preserving the runtime advantage of the learning-based methods. The source code is available at this https URL.
100.
张浩彬
(2022-09-21 11:01):
#paper https://doi.org/10.48550/arXiv.2106.00750
Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding
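ICLR 2021 paper on contrastive learning for time series.
Code: https://github.com/sanatonek/TNC_representation_learning
Sample selection idea: signals inside a temporal neighborhood are treated as similar, and signals outside it are the ones to be distinguished.
Positive samples: signals in the neighborhood follow a Gaussian distribution with mean t* and a variance set by the window size and the neighborhood length. Samples inside the neighborhood are positive samples. The neighborhood itself is determined with the ADF test.
Negative samples: anything not in the neighborhood is a negative sample, although the authors refine this point further in the loss function.
Loss function: the authors argue that samples outside a neighborhood cannot all be treated as negatives, because time series can be periodic, so they should be treated as positive-unlabeled samples (a mix of the positive and negative classes). Following practice from PU learning, a weight is introduced for the negative samples above and enters the loss function.
Data: three datasets in total: one simulated dataset (4 classes, generated by an HMM), one clinical atrial fibrillation dataset (MIT-BIH; classes alternate, are highly imbalanced, and a small number of individuals have very long recordings), and one human activity dataset (UCI-HAR).
Downstream tasks: clustering and classification. Since the main goal is to compare the learned representations as fairly as possible, the different models use the same simple encoder architecture for a given task. Because the datasets differ in character, the encoder differs across tasks.
Clustering uses simple k-means and classification uses simple kNN; TNC achieves the best results on both.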
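A small sketch of the PU-weighted loss described in the notes above, under the assumption that a discriminator has already produced, for each reference window, a probability that a second window comes from the same neighborhood. The function name `tnc_style_loss` and the array inputs are hypothetical; this illustrates the weighting idea and is not the authors' implementation.

```python
import numpy as np

def tnc_style_loss(p_neighbor, p_non_neighbor, w, eps=1e-12):
    """p_neighbor: discriminator outputs for (reference, neighboring) pairs.
    p_non_neighbor: outputs for (reference, non-neighboring) pairs.
    w: assumed probability that a non-neighboring window is actually positive.
    """
    p_n = np.clip(np.asarray(p_neighbor, dtype=float), eps, 1.0 - eps)
    p_u = np.clip(np.asarray(p_non_neighbor, dtype=float), eps, 1.0 - eps)
    # Neighboring pairs are plain positives.
    pos_term = np.log(p_n).mean()
    # Non-neighboring pairs are "unlabeled": weight (1 - w) as negatives
    # and weight w as positives.
    unl_term = ((1.0 - w) * np.log(1.0 - p_u) + w * np.log(p_u)).mean()
    return -(pos_term + unl_term)

loss_plain = tnc_style_loss([0.9, 0.8], [0.1, 0.2], w=0.0)
loss_pu = tnc_style_loss([0.9, 0.8], [0.1, 0.2], w=0.2)
```

With `w = 0` this reduces to an ordinary binary cross-entropy in which every non-neighbor is a negative; a positive `w` softens that assumption for periodic signals.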
arXiv,
2021.
DOI: 10.48550/arXiv.2106.00750
Abstract:
Time series are often complex and rich in information but sparsely labeled and therefore challenging to model. In this paper, we propose a self-supervised framework for learning generalizable representations for non-stationary time series. Our approach, called Temporal Neighborhood Coding (TNC), takes advantage of the local smoothness of a signal's generative process to define neighborhoods in time with stationary properties. Using a debiased contrastive objective, our framework learns time series representations by ensuring that in the encoding space, the distribution of signals from within a neighborhood is distinguishable from the distribution of non-neighboring signals. Our motivation stems from the medical field, where the ability to model the dynamic nature of time series data is especially valuable for identifying, tracking, and predicting the underlying patients' latent states in settings where labeling data is practically impossible. We compare our method to recently developed unsupervised representation learning approaches and demonstrate superior performance on clustering and classification tasks for multiple datasets.