Papers from the journal arXiv.
142 shared papers found in total; this page shows entries 81 - 100.
81.
张浩彬
(2023-04-28 13:45):
#paper An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling DOI:arXiv:1803.01271 .
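I have been giving a lot of talks on time-series problems lately, so I read the original TCN paper carefully. Besides the RNN family, TCNs are used quite widely. To obtain a large receptive field without adding too many layers, TCN uses dilated convolutions, and it avoids data leakage through padding and cropping. A TCN block consists of two dilated causal convolutions, activation layers, norm layers, and a residual connection. The experiments show that TCN is relatively insensitive to hyperparameters, although the kernel size k is a key one; dropout and gradient clipping also help considerably.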
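The padding-and-cropping idea in the comment above can be sketched in a few lines of NumPy. This is only an illustration, not the paper's code: the function names are made up for this example, the sketch pads on the left only (equivalent to padding both sides and cropping the right), and the receptive-field helper assumes one convolution per dilation level rather than the two per block that a TCN block uses.

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation=1):
    """y[t] = sum_i w[i] * x[t - i*dilation], with zeros before the sequence start."""
    k = len(w)
    pad = (k - 1) * dilation                 # left padding keeps the output causal
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros(len(x))
    for t in range(len(x)):
        for i in range(k):
            y[t] += w[i] * xp[pad + t - i * dilation]
    return y

def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of causal convolutions, one per dilation (assumption)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

x = np.arange(8, dtype=float)
w = np.array([0.5, 0.25, 0.25])
y = causal_dilated_conv1d(x, w, dilation=2)

# Changing a future input must not change earlier outputs (no leakage).
x_future = x.copy()
x_future[5] = 100.0
y_future = causal_dilated_conv1d(x_future, w, dilation=2)
no_leak = np.allclose(y[:5], y_future[:5])

rf = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])
```

With exponentially growing dilations the receptive field grows quickly with depth, which is why the kernel size k and the number of levels matter so much in practice.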
arXiv,
2018.
DOI: 10.48550/arXiv.1803.01271
Abstract:
For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. To assist related work, we have made code available at this http URL .
82.
姗姗来迟
(2023-04-19 13:44):
#paper arXiv:2103.00020 Learning Transferable Visual Models From Natural Language Supervision
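The day before yesterday I read the CLIP paper and looked into the prompts it mentions.
Reading notes are in my blog post "CLIP论文拜读及理解" (reading and understanding the CLIP paper).
Link: https://blog.csdn.net/weixin_44845357/article/details/130206779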
arXiv,
2021.
DOI: 10.48550/arXiv.2103.00020
Abstract:
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet. After pre-training, natural language is used to reference learned visual concepts (or describe new ones) enabling zero-shot transfer of the model to downstream tasks. We study the performance of this approach by benchmarking on over 30 different existing computer vision datasets, spanning tasks such as OCR, action recognition in videos, geo-localization, and many types of fine-grained object classification. The model transfers non-trivially to most tasks and is often competitive with a fully supervised baseline without the need for any dataset specific training. For instance, we match the accuracy of the original ResNet-50 on ImageNet zero-shot without needing to use any of the 1.28 million training examples it was trained on. We release our code and pre-trained model weights at this https URL.
83.
张德祥
(2023-04-16 11:20):
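#paper https://doi.org/10.48550/arXiv.2302.10051 An established normative approach for understanding the algorithmic basis of neural computation is to derive online algorithms from principled computational objectives and evaluate their compatibility with anatomical and physiological observations.
Similarity matching objectives have become a successful starting point for deriving online algorithms that map onto neural networks (NNs) with point neurons and Hebbian/anti-Hebbian plasticity. These NN models explain many anatomical and physiological observations; however, the objectives have limited computational power, and the derived NNs cannot explain the multi-compartmental neuronal structures and non-Hebbian forms of plasticity that are prevalent throughout the brain. In this paper, we review and unify recent extensions of the similarity matching approach to address more complex objectives, including a broad range of unsupervised and self-supervised learning tasks that can be formulated as generalized eigenvalue problems or nonnegative matrix factorization problems. Interestingly, the online algorithms derived from these objectives map naturally onto neural networks with multi-compartmental neurons and local, non-Hebbian learning rules.
Therefore, this unified extension of the similarity matching approach provides a normative framework that helps in understanding the multi-compartmental neuronal structures and non-Hebbian plasticity found throughout the brain.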
arXiv,
2023.
DOI: 10.48550/arXiv.2302.10051
Abstract:
An established normative approach for understanding the algorithmic basis of neural computation is to derive online algorithms from principled computational objectives and evaluate their compatibility with anatomical and physiological observations. Similarity matching objectives have served as successful starting points for deriving online algorithms that map onto neural networks (NNs) with point neurons and Hebbian/anti-Hebbian plasticity. These NN models account for many anatomical and physiological observations; however, the objectives have limited computational power and the derived NNs do not explain multi-compartmental neuronal structures and non-Hebbian forms of plasticity that are prevalent throughout the brain. In this article, we review and unify recent extensions of the similarity matching approach to address more complex objectives, including a broad range of unsupervised and self-supervised learning tasks that can be formulated as generalized eigenvalue problems or nonnegative matrix factorization problems. Interestingly, the online algorithms derived from these objectives naturally map onto NNs with multi-compartmental neurons and local, non-Hebbian learning rules. Therefore, this unified extension of the similarity matching approach provides a normative framework that facilitates understanding the multi-compartmental neuronal structures and non-Hebbian plasticity found throughout the brain.
84.
林海onrush
(2023-03-31 23:17):
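#paper, BloombergGPT: A Large Language Model for Finance, doi:10.48550/arXiv.2303.17564. The AI boom ignited by ChatGPT has also reached finance: Bloomberg has released BloombergGPT, a large language model (LLM) built for the financial sector. According to the report Bloomberg published on March 30, it constructed what may be the largest domain-specific dataset to date and trained an LLM specialized for finance, a 50-billion-parameter language model called BloombergGPT. Drawing on Bloomberg's extensive financial data sources, the team built a 363-billion-token dataset that supports a wide range of tasks in the financial industry. The model far outperforms existing models on financial tasks and remains competitive with them on general-purpose benchmarks. In the reported tests, BloombergGPT performs best on four of five tasks (ConvFinQA, FiQA SA, FPB, and Headline) and ranks second on NER (Named Entity Recognition). BloombergGPT therefore has a clear advantage.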
arXiv,
2023.
DOI: 10.48550/arXiv.2303.17564
Abstract:
The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general purpose datasets. We validate BloombergGPT on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally, we explain our modeling choices, training process, and evaluation methodology. As a next step, we plan to release training logs (Chronicles) detailing our experience in training BloombergGPT.
85.
Vincent
(2023-03-31 15:34):
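#paper https://doi.org/10.48550/arXiv.1904.10098 ICML 2019 DAG-GNN: DAG Structure Learning with Graph Neural Networks. Structure learning for directed acyclic graphs (DAGs) is very challenging because the search space grows superexponentially with the number of nodes. A common approach is to recast structure learning as a score optimization problem. To keep the problem tractable, traditional methods usually assume a linear structural equation model (linear SEM). Building on the linear SEM framework, this paper develops a DAG learning method based on a variational autoencoder (VAE) and a graph neural network (GNN). Thanks to the nonlinear fitting capacity of neural networks, the method is at least as good as linear SEM while also handling some nonlinear problems. Using simulated and real data, the paper shows higher accuracy and a lower false discovery rate than linear SEM.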
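To make the "continuous optimization with an acyclicity constraint" idea concrete, here is a small NumPy sketch of one polynomial form of such a constraint, h(A) = tr[(I + alpha * A∘A)^d] - d, which is zero exactly when the weighted adjacency matrix A has no cycles. To my understanding this is the kind of variant DAG-GNN uses in place of the matrix-exponential form, but treat the exact form and the choice alpha = 1/d as assumptions of this example rather than as the paper's implementation.

```python
import numpy as np

def acyclicity(A, alpha=None):
    """h(A) = tr[(I + alpha * A*A)^d] - d; equals 0 iff the graph of A is acyclic."""
    d = A.shape[0]
    if alpha is None:
        alpha = 1.0 / d                      # illustrative choice, any alpha > 0 works
    M = np.eye(d) + alpha * (A * A)          # A*A is the elementwise square
    return np.trace(np.linalg.matrix_power(M, d)) - d

# A 3-node DAG (edges 0->1, 0->2, 1->2) and a 3-cycle (0->1->2->0).
dag = np.array([[0, 1, 1],
                [0, 0, 1],
                [0, 0, 0]], dtype=float)
cyc = np.array([[0, 1, 0],
                [0, 0, 1],
                [1, 0, 0]], dtype=float)

h_dag = acyclicity(dag)   # no closed walks, so the trace stays at d
h_cyc = acyclicity(cyc)   # the 3-cycle contributes a positive term
```

Because h is differentiable in A, it can be added to a score (for example a reconstruction loss) as a penalty or equality constraint, which is what turns the combinatorial search into a continuous problem.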
arXiv,
2019.
DOI: 10.48550/arXiv.1904.10098
Abstract:
Learning a faithful directed acyclic graph (DAG) from samples of a joint distribution is a challenging combinatorial problem, owing to the intractable search space superexponential in the number of graph nodes. A recent breakthrough formulates the problem as a continuous optimization with a structural constraint that ensures acyclicity (Zheng et al., 2018). The authors apply the approach to the linear structural equation model (SEM) and the least-squares loss function that are statistically well justified but nevertheless limited. Motivated by the widespread success of deep learning that is capable of capturing complex nonlinear mappings, in this work we propose a deep generative model and apply a variant of the structural constraint to learn the DAG. At the heart of the generative model is a variational autoencoder parameterized by a novel graph neural network architecture, which we coin DAG-GNN. In addition to the richer capacity, an advantage of the proposed model is that it naturally handles discrete variables as well as vector-valued ones. We demonstrate that on synthetic data sets, the proposed method learns more accurate graphs for nonlinearly generated samples; and on benchmark data sets with discrete variables, the learned graphs are reasonably close to the global optima. The code is available at \url{this https URL}.
86.
姗姗来迟
(2023-03-27 15:44):
#paper arXiv:2201.11903
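Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
My reading notes are in my blog post: https://blog.csdn.net/weixin_44845357/article/details/129566376
The main goal was to understand chain of thought, a technique that elicits complex multi-step reasoning by showing the model examples answered step by step.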
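As a rough illustration of what "examples answered step by step" means in practice, the sketch below assembles a few-shot prompt in which each exemplar includes its intermediate reasoning before the final answer. The helper name, the Q/A layout and the exemplar itself are hypothetical choices for this example, not the paper's exact prompts.

```python
def build_cot_prompt(exemplars, question):
    """Join (question, reasoning, answer) exemplars and a new question into one prompt."""
    parts = []
    for q, reasoning, answer in exemplars:
        # The reasoning text is what distinguishes chain-of-thought from plain few-shot.
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# Hypothetical exemplar written for this sketch.
exemplars = [
    (
        "A shelf holds 4 boxes with 6 pens each. 5 pens are removed. How many pens remain?",
        "4 boxes of 6 pens is 24 pens. Removing 5 leaves 24 - 5 = 19.",
        "19",
    ),
]

prompt = build_cot_prompt(
    exemplars,
    "A baker has 12 eggs and uses 2 eggs per cake. How many cakes can she bake?",
)
```

The resulting string would be sent to a language model as-is; the model is expected to continue after the final "A:" with its own reasoning steps.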
arXiv,
2022.
Abstract:
We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerge naturally in sufficiently large language models via a simple method called chain of thought prompting, where a few chain of thought demonstrations are provided as exemplars in prompting. Experiments on three large language models show that chain of thought prompting improves performance on a range of arithmetic, commonsense, and symbolic reasoning tasks. The empirical gains can be striking. For instance, prompting a 540B-parameter language model with just eight chain of thought exemplars achieves state of the art accuracy on the GSM8K benchmark of math word problems, surpassing even finetuned GPT-3 with a verifier.
87.
张德祥
(2023-03-12 09:48):
#paper https://doi.org/10.48550/arXiv.1806.08053 Semantic information, autonomous agency, and nonequilibrium statistical physics
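The paper tries to define semantic information through counterfactuals, grounding it in the physical relationship and thermodynamic information exchange between an individual and its environment. There has not been much follow-up work, and the approach is somewhat close to the free energy framework.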
arXiv,
2018.
DOI: 10.48550/arXiv.1806.08053
Abstract:
Shannon information theory provides various measures of so-called "syntactic information", which reflect the amount of statistical correlation between systems. In contrast, the concept of "semantic information" refers to those correlations which carry significance or "meaning" for a given system. Semantic information plays an important role in many fields, including biology, cognitive science, and philosophy, and there has been a long-standing interest in formulating a broadly applicable and formal theory of semantic information. In this paper we introduce such a theory. We define semantic information as the syntactic information that a physical system has about its environment which is causally necessary for the system to maintain its own existence. "Causal necessity" is defined in terms of counter-factual interventions which scramble correlations between the system and its environment, while "maintaining existence" is defined in terms of the system's ability to keep itself in a low entropy state. We also use recent results in nonequilibrium statistical physics to analyze semantic information from a thermodynamic point of view. Our framework is grounded in the intrinsic dynamics of a system coupled to an environment, and is applicable to any physical system, living or otherwise. It leads to formal definitions of several concepts that have been intuitively understood to be related to semantic information, including "value of information", "semantic content", and "agency".
88.
尹志
(2023-02-28 21:51):
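#paper https://doi.org/10.48550/arXiv.2203.17003 ICML, 2022, Equivariant Diffusion for Molecule Generation in 3D. Diffusion models are developing extremely fast in every field. Beyond graphics and images, they have reached into biopharmaceuticals and materials science. This paper uses a diffusion model for 3D molecule generation. The authors propose an equivariant diffusion model whose equivariant network handles continuous variables such as atom coordinates and discrete variables such as atom types at the same time. The work achieves SOTA performance on two typical datasets, QM9 and GEOM, and is one of the pioneering works that bring equivariance into diffusion models.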
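Equivariance here means that rotating or translating the input coordinates rotates or translates the output in the same way. The NumPy sketch below checks this property numerically for a toy EGNN-style coordinate update. It is not the network from the paper: the update rule, the gate function and the scalar features h (a stand-in for atom-type features) are assumptions chosen to keep the example short.

```python
import numpy as np

def equivariant_update(x, h):
    """Toy E(3)-equivariant update: move each atom along relative position vectors,
    scaled by a gate that depends only on invariant quantities."""
    n = x.shape[0]
    out = x.copy()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            diff = x[i] - x[j]
            d2 = diff @ diff                              # squared distance, invariant
            gate = np.tanh(h[i] * h[j]) / (1.0 + d2)      # invariant scalar weight
            out[i] += gate * diff
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                  # 4 atoms in 3D
h = rng.normal(size=4)                       # one scalar feature per atom
Q, _ = np.linalg.qr(rng.normal(size=(3, 3))) # random orthogonal matrix
t = rng.normal(size=3)                       # random translation

transformed_then_updated = equivariant_update(x @ Q.T + t, h)
updated_then_transformed = equivariant_update(x, h) @ Q.T + t
is_equivariant = np.allclose(transformed_then_updated, updated_then_transformed)
```

The check passes because the update only uses relative vectors and distances, which is the usual way such networks keep coordinates equivariant while features like atom types stay invariant.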
arXiv,
2022.
DOI: 10.48550/arXiv.2203.17003
Abstract:
This work introduces a diffusion model for molecule generation in 3D that is equivariant to Euclidean transformations. Our E(3) Equivariant Diffusion Model (EDM) learns to denoise a diffusion process with an equivariant network that jointly operates on both continuous (atom coordinates) and categorical features (atom types). In addition, we provide a probabilistic analysis which admits likelihood computation of molecules using our model. Experimentally, the proposed method significantly outperforms previous 3D molecular generative methods regarding the quality of generated samples and efficiency at training time.
89.
张德祥
(2023-02-10 20:03):
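#paper https://doi.org/10.48550/arXiv.2210.15889 Towards Data-and Knowledge-Driven Artificial Intelligence: A Survey on Neuro-Symbolic Computing. Neural-symbolic computing (NeSy) pursues the integration of the symbolic and statistical paradigms of cognition and has been an active research area of artificial intelligence (AI) for many years. Because NeSy promises to reconcile the reasoning and interpretability advantages of symbolic representation with robust learning in neural networks, it may become a catalyst for the next generation of AI. In this paper, we give a systematic overview of important and recent developments in NeSy AI research. First, we introduce the research history of the area, covering early work and foundations. We then discuss background concepts and identify the key drivers behind the development of NeSy. Next, we categorize recent landmark approaches along several main characteristics that underline this research paradigm, including neural-symbolic integration, knowledge representation, knowledge embedding, and functionality. We then briefly discuss successful applications of modern NeSy approaches in several domains. Finally, we identify open problems and potential future research directions. The survey is expected to help new researchers enter this rapidly developing field and to accelerate progress towards data- and knowledge-driven AI.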
arXiv,
2022.
DOI: 10.48550/arXiv.2210.15889
Abstract:
Neural-symbolic computing (NeSy), which pursues the integration of the symbolic and statistical paradigms of cognition, has been an active research area of Artificial Intelligence (AI) for many years. As NeSy shows promise of reconciling the advantages of reasoning and interpretability of symbolic representation and robust learning in neural networks, it may serve as a catalyst for the next generation of AI. In the present paper, we provide a systematic overview of the important and recent developments of research on NeSy AI. Firstly, we introduce study history of this area, covering early work and foundations. We further discuss background concepts and identify key driving factors behind the development of NeSy. Afterward, we categorize recent landmark approaches along several main characteristics that underline this research paradigm, including neural-symbolic integration, knowledge representation, knowledge embedding, and functionality. Then, we briefly discuss the successful application of modern NeSy approaches in several domains. Finally, we identify the open problems together with potential future research directions. This survey is expected to help new researchers enter this rapidly-developing field and accelerate progress towards data-and knowledge-driven AI.
90.
王昊
(2023-01-31 23:53):
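#paper Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules http://arxiv.org/abs/2001.01568 Zhengxue Cheng, Heming Sun, Masaru Takeuchi, and Jiro Katto. 2020. Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules. Retrieved January 31, 2023. This is the baseline image coding method for VCM (the cheng2020 network), used in the feature extraction stage of machine-vision coding; it belongs to the family of learned image compression algorithms. The authors propose parameterizing the distribution of the latent representation with discretized Gaussian mixture likelihoods, which gives a more accurate and flexible probability model. They also use attention modules to improve the network's focus on complex regions of the image. Specifically, a discretized Gaussian mixture model is used for entropy estimation of the latent representation y. It can provide several most-likely means for y while each mixture component can have a smaller variance, so the probability model is more accurate and fewer bits are needed to encode y. Second, the authors add simplified attention modules, which increase the network's attention to non-zero responses, i.e. complex regions, without adding much training complexity.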
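A discretized Gaussian mixture assigns each integer symbol y the probability mass that the mixture puts on the interval [y - 0.5, y + 0.5]. The sketch below computes that mass with the standard library only. The mixture parameters are made-up values for illustration; in a learned codec they would be predicted per latent element by the network.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def discretized_gmm_pmf(y, weights, means, scales):
    """P(y) for integer y: mixture mass falling in [y - 0.5, y + 0.5]."""
    return sum(
        w * (normal_cdf(y + 0.5, m, s) - normal_cdf(y - 0.5, m, s))
        for w, m, s in zip(weights, means, scales)
    )

# Hypothetical two-component mixture with peaks near -2 and 3.
weights, means, scales = [0.6, 0.4], [-2.0, 3.0], [0.5, 1.0]

pmf = {y: discretized_gmm_pmf(y, weights, means, scales) for y in range(-20, 21)}
total = sum(pmf.values())              # close to 1: almost all mass lies in this range
bits_for_3 = -math.log2(pmf[3])        # ideal code length for the symbol 3, in bits
```

Two narrow components can place high probability on both likely values, so the expected code length is shorter than with a single wide Gaussian covering the same range.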
arXiv,
2020.
Abstract:
Image compression is a fundamental research field and many well-known compression standards have been developed for many decades. Recently, learned compression methods exhibit a fast development trend with promising results. However, there is still a performance gap between learned compression algorithms and reigning compression standards, especially in terms of widely used PSNR metric. In this paper, we explore the remaining redundancy of recent learned compression algorithms. We have found accurate entropy models for rate estimation largely affect the optimization of network parameters and thus affect the rate-distortion performance. Therefore, in this paper, we propose to use discretized Gaussian Mixture Likelihoods to parameterize the distributions of latent codes, which can achieve a more accurate and flexible entropy model. Besides, we take advantage of recent attention modules and incorporate them into network architecture to enhance the performance. Experimental results demonstrate our proposed method achieves a state-of-the-art performance compared to existing learned compression methods on both Kodak and high-resolution datasets. To our knowledge our approach is the first work to achieve comparable performance with latest compression standard Versatile Video Coding (VVC) regarding PSNR. More importantly, our approach generates more visually pleasant results when optimized by MS-SSIM. This project page is at this https URL this https URL
91.
前进
(2023-01-31 23:30):
#paper Rethinking 1x1 Convolutions: Can we train CNNs with Frozen Random Filters? arXiv:2301.11360
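This paper introduces a new convolution block that computes learnable linear combinations (LC) of frozen random filters, and builds LCResNets from it; it also proposes a new weight sharing mechanism that greatly reduces the number of weights. The paper shows that even in the extreme case where spatial filters are only randomly initialized and never updated, certain CNN architectures can be trained to exceed the accuracy of standard training. By reinterpreting pointwise (1x1) convolutions as an operator that learns linear combinations (LC) of frozen (random) spatial filters, the approach not only reaches high test accuracy on CIFAR and ImageNet but also has favorable properties in terms of model robustness, generalization, sparsity, and the total number of weights required. In addition, the proposed weight sharing mechanism allows a single weight tensor to be shared across all spatial convolution layers, which greatly reduces the number of weights.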
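The reinterpretation of 1x1 convolutions rests on linearity: applying frozen spatial filters and then mixing the resulting channels with a 1x1 convolution gives the same output as convolving once with the linearly combined filters. The NumPy sketch below checks this on random data. It assumes no nonlinearity or normalization between the two steps, and the shapes and helper name are arbitrary choices for this example.

```python
import numpy as np

def correlate2d_valid(img, kern):
    """Plain 'valid' 2D cross-correlation of a single-channel image with one kernel."""
    kh, kw = kern.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kern)
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
frozen = rng.normal(size=(4, 3, 3))   # 4 random 3x3 spatial filters, never updated
mix = rng.normal(size=(2, 4))         # learnable 1x1 weights: 2 outputs from 4 channels

# Path A: frozen spatial convolution, then a 1x1 convolution across channels.
feats = np.stack([correlate2d_valid(image, f) for f in frozen])   # shape (4, 6, 6)
out_a = np.einsum("oc,chw->ohw", mix, feats)

# Path B: combine the frozen filters first, then convolve once per output channel.
combined = np.einsum("oc,ckl->okl", mix, frozen)                  # shape (2, 3, 3)
out_b = np.stack([correlate2d_valid(image, k) for k in combined])

same = np.allclose(out_a, out_b)
```

Only `mix` would receive gradient updates in such a setup, so the trainable parameter count depends on the number of channel combinations rather than on the spatial kernel size.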
arXiv,
2023.
Abstract:
Modern CNNs are learning the weights of vast numbers of convolutional operators. In this paper, we raise the fundamental question if this is actually necessary. We show that even in the extreme case of only randomly initializing and never updating spatial filters, certain CNN architectures can be trained to surpass the accuracy of standard training. By reinterpreting the notion of pointwise (1×1) convolutions as an operator to learn linear combinations (LC) of frozen (random) spatial filters, we are able to analyze these effects and propose a generic LC convolution block that allows tuning of the linear combination rate. Empirically, we show that this approach not only allows us to reach high test accuracies on CIFAR and ImageNet but also has favorable properties regarding model robustness, generalization, sparsity, and the total number of necessary weights. Additionally, we propose a novel weight sharing mechanism, which allows sharing of a single weight tensor between all spatial convolution layers to massively reduce the number of weights.
92.
尹志
(2023-01-31 20:59):
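#paper Diffusion Models: A Comprehensive Survey of Methods and Applications, https://doi.org/10.48550/arXiv.2209.00796. This survey gives a detailed introduction to and overview of diffusion models, currently a very hot topic. It groups current diffusion models into three main classes: DDPMs, SGMs, and score SDEs. Each class generalizes the previous one and can handle a broader range of problems. Besides explaining and comparing the three mainstream classes in detail and reviewing the related improvements, the survey discusses the connections and differences between diffusion models and other mainstream generative models. It closes by listing current applications of diffusion models in various fields. Since diffusion models are inspired by physical concepts, I am very optimistic about further extensions and applications that combine them with mathematical physics. For example, Prof. 顾险峰 (Gu Xianfeng) recently pointed out in an article possible improvements based on optimal transport, which is a really interesting idea and topic.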
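For readers who want the "each class generalizes the previous one" remark in symbols, the standard forward (noising) processes can be written as follows. The notation (noise schedule beta_t, drift f, diffusion coefficient g, Wiener process w) follows common usage and is an assumption of this note rather than a quotation from the survey.

```latex
% DDPM: discrete-time Gaussian noising with schedule \beta_t
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t) I\right),
\quad
\bar\alpha_t = \prod_{s=1}^{t} (1-\beta_s)

% Score SDE: continuous-time generalization of the same idea
\mathrm{d}x = f(x, t)\,\mathrm{d}t + g(t)\,\mathrm{d}w
```

DDPMs and SGMs can both be read as particular discretizations of such an SDE, which is the sense in which the score SDE view is the most general of the three.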
arXiv,
2022.
DOI: 10.48550/arXiv.2209.00796
Abstract:
Diffusion models have emerged as a powerful new family of deep generative models with record-breaking performance in many applications, including image synthesis, video generation, and molecule design. In this survey, we provide an overview of the rapidly expanding body of work on diffusion models, categorizing the research into three key areas: efficient sampling, improved likelihood estimation, and handling data with special structures. We also discuss the potential for combining diffusion models with other generative models for enhanced results. We further review the wide-ranging applications of diffusion models in fields spanning from computer vision, natural language processing, temporal data modeling, to interdisciplinary applications in other scientific disciplines. This survey aims to provide a contextualized, in-depth look at the state of diffusion models, identifying the key areas of focus and pointing to potential areas for further exploration. Github: this https URL.
93.
张浩彬
(2023-01-30 13:34):
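#paper https://doi.org/10.48550/arXiv.2202.01575 CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting
1. The paper treats a time series as the sum of three parts: trend + seasonal + error. What needs to be learned are the trend and seasonal terms.
2. In terms of overall structure, an encoder (TCN) maps the original series into a latent space, and two separate branches then extract the trend and seasonal terms, each trained with contrastive learning.
a. For the trend term, the latent representation is fed into a mixture of auto-regressive experts to extract the trend, which is learned with a contrastive loss in the time domain. The time-domain contrastive loss follows MoCo.
b. For the seasonal term, a discrete Fourier transform maps the latent representation into the frequency domain, where the loss is defined on amplitude and phase.
3. The final loss is the sum of the time-domain and frequency-domain losses.
4. The method is compared on 5 datasets against multiple baselines, including TS2Vec, TNC, MoCo, Informer, LogTrans, and TCN, and achieves SOTA results in most cases.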
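The trend + seasonal view and the amplitude/phase representation used for the seasonal branch can be illustrated on a synthetic series. This sketch is a simplification: it applies the DFT to the raw series rather than to learned latent representations, and it uses a least-squares line as a stand-in for the paper's learned trend extractor. All numbers are made up for the example.

```python
import numpy as np

# Synthetic series: linear trend plus a seasonal component with period 24.
t = np.arange(96)
trend = 0.05 * t
seasonal = 2.0 * np.sin(2 * np.pi * t / 24 + 0.3)
series = trend + seasonal

# Crude trend estimate: a fitted straight line (not the paper's method).
slope, intercept = np.polyfit(t, series, 1)
detrended = series - (slope * t + intercept)

# Frequency-domain view of what remains: amplitude and phase per frequency bin.
spectrum = np.fft.rfft(detrended)
amplitude = np.abs(spectrum)
phase = np.angle(spectrum)

dominant_bin = int(np.argmax(amplitude[1:]) + 1)   # skip the DC bin
dominant_period = len(t) / dominant_bin
```

A contrastive loss in the frequency domain would compare amplitude and phase vectors like these between samples, instead of comparing the time-domain values directly.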
arXiv,
2022.
DOI: 10.48550/arXiv.2202.01575
Abstract:
Deep learning has been actively studied for time series forecasting, and the mainstream paradigm is based on the end-to-end training of neural network architectures, ranging from classical LSTM/RNNs to more recent TCNs and Transformers. Motivated by the recent success of representation learning in computer vision and natural language processing, we argue that a more promising paradigm for time series forecasting, is to first learn disentangled feature representations, followed by a simple regression fine-tuning step -- we justify such a paradigm from a causal perspective. Following this principle, we propose a new time series representation learning framework for time series forecasting named CoST, which applies contrastive learning methods to learn disentangled seasonal-trend representations. CoST comprises both time domain and frequency domain contrastive losses to learn discriminative trend and seasonal representations, respectively. Extensive experiments on real-world datasets show that CoST consistently outperforms the state-of-the-art methods by a considerable margin, achieving a 21.3% improvement in MSE on multivariate benchmarks. It is also robust to various choices of backbone encoders, as well as downstream regressors. Code is available at this https URL.
94.
林海onrush
(2023-01-27 01:30):
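#paper, Twist: Sound Reasoning for Purity and Entanglement in Quantum Programs, DOI: 10.48550/arXiv.2205.02287. The authors introduce the notion of pure expressions in order to reason about entangled states in quantum programs. Qubits are handled much like pointers into classical memory and are manipulated by applying operations called gates. Entanglement is the phenomenon in which the measurement outcomes of qubits are correlated, and it can determine the correctness of algorithms and the suitability of programming patterns. The paper formalizes purity as a core tool for automated reasoning about entanglement in quantum programs; a pure expression is one whose evaluation is unaffected by the measurement outcomes of qubits it does not own. The main contribution is Twist, the first language with a type system for sound reasoning about purity, which lets developers identify pure expressions using type annotations. Finally, the paper shows that Twist can express quantum algorithms, catch programming errors in them, and support some programs that other languages disallow, while incurring runtime verification overhead of less than 3.5%. Overall, this is foundational and meaningful work.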
arXiv,
2022.
DOI: 10.48550/arXiv.2205.02287
Abstract:
Quantum programming languages enable developers to implement algorithms for quantum computers that promise computational breakthroughs in classically intractable tasks. Programming quantum computers requires awareness of entanglement, the phenomenon in which measurement outcomes of qubits are correlated. Entanglement can determine the correctness of algorithms and suitability of programming patterns. In this work, we formalize purity as a central tool for automating reasoning about entanglement in quantum programs. A pure expression is one whose evaluation is unaffected by the measurement outcomes of qubits that it does not own, implying freedom from entanglement with any other expression in the computation. We present Twist, the first language that features a type system for sound reasoning about purity. The type system enables the developer to identify pure expressions using type annotations. Twist also features purity assertion operators that state the absence of entanglement in the output of quantum gates. To soundly check these assertions, Twist uses a combination of static analysis and runtime verification. We evaluate Twist's type system and analyses on a benchmark suite of quantum programs in simulation, demonstrating that Twist can express quantum algorithms, catch programming errors in them, and support programs that several languages disallow, while incurring runtime verification overhead of less than 3.5%.
95.
张德祥
(2023-01-06 18:42):
#paper
https://doi.org/10.48550/arXiv.2212.12393
A-NeSI: A Scalable Approximate Method for Probabilistic Neurosymbolic Inference
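Inspired by GFlowNet, this paper is the first to scale training on MNISTAdd to sums of 15 digits; an artificial arithmetic genius is just around the corner. It combines neural networks with symbolic computation.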
arXiv,
2022.
DOI: 10.48550/arXiv.2212.12393
Abstract:
We study the problem of combining neural networks with symbolic reasoning. Recently introduced frameworks for Probabilistic Neurosymbolic Learning (PNL), such as DeepProbLog, perform exponential-time exact inference, limiting the scalability of PNL solutions. We introduce Approximate Neurosymbolic Inference (A-NeSI): a new framework for PNL that uses neural networks for scalable approximate inference. A-NeSI 1) performs approximate inference in polynomial time without changing the semantics of probabilistic logics; 2) is trained using data generated by the background knowledge; 3) can generate symbolic explanations of predictions; and 4) can guarantee the satisfaction of logical constraints at test time, which is vital in safety-critical applications. Our experiments show that A-NeSI is the first end-to-end method to scale the Multi-digit MNISTAdd benchmark to sums of 15 MNIST digits, up from 4 in competing systems. Finally, our experiments show that A-NeSI achieves explainability and safety without a penalty in performance.
96.
王昊
(2022-12-31 23:57):
#paper https://arxiv.org/abs/2111.08687v2 Jing Shao, Siyu Chen, Yangguang Li, et al. 2021. INTERN: A New Learning Paradigm Towards General Vision.
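A paper on vision foundation models. INTERN ("书生") aims to systematically address a series of bottlenecks in current AI vision, such as task generality, scene generalization, and data efficiency. INTERN consists of seven modules: three infrastructure modules (a general vision data system, a general vision network architecture, and a general vision benchmark) and four training-stage modules that separate upstream from downstream. Strong generalization is learned over the multiple stages. The model covers four categories of CV tasks on 26 datasets, and in most cases, fine-tuned with only 10% of the training data, it outperforms the counterpart models trained on the full data.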
arXiv,
2021.
DOI: 10.48550/arXiv.2111.08687
Abstract:
Enormous waves of technological innovations over the past several years, marked by the advances in AI technologies, are profoundly reshaping the industry and the society. However, down the road, a key challenge awaits us, that is, our capability of meeting rapidly-growing scenario-specific demands is severely limited by the cost of acquiring a commensurate amount of training data. This difficult situation is in essence due to limitations of the mainstream learning paradigm: we need to train a new model for each new scenario, based on a large quantity of well-annotated data and commonly from scratch. In tackling this fundamental problem, we move beyond and develop a new learning paradigm named INTERN. By learning with supervisory signals from multiple sources in multiple stages, the model being trained will develop strong generalizability. We evaluate our model on 26 well-known datasets that cover four categories of tasks in computer vision. In most cases, our models, adapted with only 10% of the training data in the target domain, outperform the counterparts trained with the full set of data, often by a significant margin. This is an important step towards a promising prospect where such a model with general vision capability can dramatically reduce our reliance on data, thus expediting the adoption of AI technologies. Furthermore, revolving around our new paradigm, we also introduce a new data system, a new architecture, and a new benchmark, which, together, form a general vision ecosystem to support its future development in an open and inclusive manner. See project website at this https URL .
97.
尹志
(2022-12-31 14:48):
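#paper doi: https://doi.org/10.48550/arXiv.2210.11250, Structure-based drug design with geometric deep learning.
This is a fairly recent, short review on drug design and deep learning. It discusses how geometric deep learning contributes to several important subtasks of structure-based drug design. Since structure-based drug design mainly uses the three-dimensional geometric information of macromolecules (such as proteins and nucleic acids) to identify suitable ligands, geometric deep learning, a technique that builds geometric symmetries into deep learning, is a very promising tool. The review covers four subtasks: 1) molecular property prediction (binding affinity, protein function, pose scoring); 2) binding site and binding interface prediction (small-molecule binding sites and protein-protein interfaces); 3) binding pose generation and molecular docking (ligand-protein and protein-protein docking); 4) structure-based de novo design of small-molecule ligands. It starts from common molecular representations, then discusses the symmetry issues that arise in structure-based drug design, and then devotes four sections to the state of geometric deep learning research on the four subtasks. It is a very good guideline for AI-based structure-based drug design.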
arXiv,
2022.
DOI: 10.48550/arXiv.2210.11250
Abstract:
Structure-based drug design uses three-dimensional geometric information of macromolecules, such as proteins or nucleic acids, to identify suitable ligands. Geometric deep learning, an emerging concept of neural-network-based machine learning, has been applied to macromolecular structures. This review provides an overview of the recent applications of geometric deep learning in bioorganic and medicinal chemistry, highlighting its potential for structure-based drug discovery and design. Emphasis is placed on molecular property prediction, ligand binding site and pose prediction, and structure-based de novo molecular design. The current challenges and opportunities are highlighted, and a forecast of the future of geometric deep learning for drug discovery is presented.
<<<
98.
前进
(2022-12-31 11:39):
#paper Liu Y, Chen J, Wei S, et al. On Finite Difference Jacobian Computation in Deformable Image Registration[J]. arXiv preprint arXiv:2212.06060, 2022.
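Producing diffeomorphic spatial transformations has long been a central problem in deformable image registration. A diffeomorphic transformation should have a positive Jacobian determinant |J| everywhere. The number of voxels with |J|<0 has been used to test for diffeomorphism and also to measure the irregularity of a transformation.
For digital transformations, |J| is usually approximated with central differences, but this strategy can yield a positive |J| for transformations that are clearly not diffeomorphic, even at the voxel resolution level. To show this, the paper first examines the geometric meaning of the different finite difference approximations of |J|. No single finite difference approximation of |J| is sufficient to determine diffeomorphism for digital images. The paper shows that for a 2D transformation, four unique finite difference approximations of |J| must all be positive to ensure that the whole domain is invertible and free of folding at the pixel level. In 3D, ten unique finite difference approximations of |J| must be positive. The proposed digital diffeomorphism criteria resolve several errors inherent in the central difference approximation of |J| and accurately detect non-diffeomorphic digital transformations.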
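To make the central-difference failure concrete, here is a minimal NumPy sketch on a toy 2D grid. The displacement field, the grid size, and the helper names (`jacobian_det`, `all_one_sided_dets_positive`) are illustrative assumptions and are not taken from the paper. The sketch also assumes that the four 2D approximations are the forward/backward combinations along the two axes, which is one reading of the summary above rather than something it states.

```python
import numpy as np

def _diff_axis0(f, scheme):
    """Finite difference along axis 0, evaluated at interior grid points only."""
    center, nxt, prv = f[1:-1, 1:-1], f[2:, 1:-1], f[:-2, 1:-1]
    if scheme == "forward":
        return nxt - center
    if scheme == "backward":
        return center - prv
    if scheme == "central":
        return (nxt - prv) / 2.0
    raise ValueError(f"unknown scheme: {scheme}")

def diff(f, axis, scheme):
    """Same difference along axis 0 (x) or axis 1 (y), via a transpose."""
    return _diff_axis0(f, scheme) if axis == 0 else _diff_axis0(f.T, scheme).T

def jacobian_det(phi_x, phi_y, scheme_x, scheme_y):
    """|J| of the map (phi_x, phi_y) on a unit-spaced grid, interior points."""
    dxx, dxy = diff(phi_x, 0, scheme_x), diff(phi_x, 1, scheme_y)
    dyx, dyy = diff(phi_y, 0, scheme_x), diff(phi_y, 1, scheme_y)
    return dxx * dyy - dxy * dyx

def all_one_sided_dets_positive(phi_x, phi_y):
    """Assumed 2D check: every forward/backward combination must be positive."""
    schemes = ("forward", "backward")
    return all(
        bool(np.all(jacobian_det(phi_x, phi_y, sx, sy) > 0))
        for sx in schemes
        for sy in schemes
    )

# Toy transformation: x-displacement alternates sign from one column to the next,
# so neighbouring points swap order along x (a fold at the pixel level).
n = 6
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
phi_x = i + 0.75 * (-1.0) ** i
phi_y = j.astype(float)

central = jacobian_det(phi_x, phi_y, "central", "central")
forward = jacobian_det(phi_x, phi_y, "forward", "forward")
```

In this toy case the central difference skips the point itself and only sees neighbours two pixels apart, so `central` is positive everywhere, while `forward` goes negative at every other column and `all_one_sided_dets_positive(phi_x, phi_y)` returns `False`.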
arXiv,
2022.
DOI: 10.48550/arXiv.2212.06060
Abstract:
>>>
Producing spatial transformations that are diffeomorphic has been a central problem in deformable image registration. As a diffeomorphic transformation should have positive Jacobian determinant |J| everywhere, the number of voxels with |J|<0 has been used to test for diffeomorphism and also to measure the irregularity of the transformation. For digital transformations, |J| is commonly approximated using central difference, but this strategy can yield positive |J|'s for transformations that are clearly not diffeomorphic -- even at the voxel resolution level. To show this, we first investigate the geometric meaning of different finite difference approximations of |J|. We show that to determine diffeomorphism for digital images, use of any individual finite difference approximations of |J| is insufficient. We show that for a 2D transformation, four unique finite difference approximations of |J|'s must be positive to ensure the entire domain is invertible and free of folding at the pixel level. We also show that in 3D, ten unique finite differences approximations of |J|'s are required to be positive. Our proposed digital diffeomorphism criteria solves several errors inherent in the central difference approximation of |J| and accurately detects non-diffeomorphic digital transformations.
<<<
99.
林海onrush
(2022-11-30 21:51):
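#paper, https://doi.org/10.48550/arXiv.2211.16197, FJMP: Factorized Joint Multi-Agent Motion Prediction over Learned Directed Acyclic Interaction Graphs. This work addresses trajectory prediction for autonomous driving and proposes FJMP, a factorized joint motion prediction framework for multi-agent scenarios that learns directed acyclic interaction graphs. It models the future scene interaction dynamics as a sparse directed interaction graph whose edges denote explicit interactions between agents, prunes the graph into a directed acyclic graph (DAG), and decomposes the joint prediction task according to the partial ordering of the DAG, with the joint future trajectories decoded by a directed acyclic graph neural network (DAGNN). Experiments on the INTERACTION and Argoverse 2 datasets show that FJMP yields more accurate and scene-consistent joint trajectory predictions than non-factorized approaches. FJMP achieves SOTA on the multi-agent INTERACTION benchmark.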
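The decomposition along the DAG's partial ordering can be sketched in a few lines of Python. This is only an illustration of the ordering idea: the agents, the edges, and the two decoder callbacks are hypothetical placeholders, the graph is assumed to be already pruned to a DAG, and FJMP itself decodes trajectories with a learned DAGNN rather than anything like the string-building stubs used here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def factorized_order(parents):
    """Return agents in an order where every influencer precedes the influenced.

    `parents` maps each agent to the set of agents that influence it.
    Raises graphlib.CycleError if the graph has not been pruned to a DAG.
    """
    return list(TopologicalSorter(parents).static_order())

def decode_joint(parents, decode_marginal, decode_conditional):
    """Decode agents one by one: roots marginally, the rest given their parents."""
    futures = {}
    for agent in factorized_order(parents):
        inflow = {p: futures[p] for p in parents.get(agent, ())}
        if inflow:
            futures[agent] = decode_conditional(agent, inflow)
        else:
            futures[agent] = decode_marginal(agent)
    return futures

# Hypothetical scene: A influences B and C, and B influences C.
parents = {"A": set(), "B": {"A"}, "C": {"A", "B"}}

# Stub decoders that only record which factor each agent would be decoded from.
terms = decode_joint(
    parents,
    decode_marginal=lambda agent: f"p({agent})",
    decode_conditional=lambda agent, inflow: f"p({agent} | {', '.join(sorted(inflow))})",
)
order = factorized_order(parents)
```

With these toy edges the joint prediction factorizes as p(A) · p(B | A) · p(C | A, B), which is what `terms` records per agent.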
arXiv,
2022.
DOI: 10.48550/arXiv.2211.16197
Abstract:
>>>
Predicting the future motion of road agents is a critical task in an autonomous driving pipeline. In this work, we address the problem of generating a set of scene-level, or joint, future trajectory predictions in multi-agent driving scenarios. To this end, we propose FJMP, a Factorized Joint Motion Prediction framework for multi-agent interactive driving scenarios. FJMP models the future scene interaction dynamics as a sparse directed interaction graph, where edges denote explicit interactions between agents. We then prune the graph into a directed acyclic graph (DAG) and decompose the joint prediction task into a sequence of marginal and conditional predictions according to the partial ordering of the DAG, where joint future trajectories are decoded using a directed acyclic graph neural network (DAGNN). We conduct experiments on the INTERACTION and Argoverse 2 datasets and demonstrate that FJMP produces more accurate and scene-consistent joint trajectory predictions than non-factorized approaches, especially on the most interactive and kinematically interesting agents. FJMP ranks 1st on the multi-agent test leaderboard of the INTERACTION dataset.
<<<
100.
张德祥
(2022-11-16 09:16):
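#paper https://doi.org/10.48550/arXiv.2206.02063 Active Bayesian Causal Inference: We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment.
In the present work, we consider a more general setting in which we are interested in causal reasoning but do not have access to a reference causal model a priori.
In this setting, causal discovery can be seen as a means to an end rather than the main objective. Focusing on actively learning the full causal model to enable subsequent causal reasoning can be disadvantageous for two reasons. First, if we are only interested in specific aspects of the causal model, wasting samples on learning the full causal graph is suboptimal. Second, causal discovery from small amounts of data entails significant epistemic uncertainty.
We propose Active Bayesian Causal Inference (ABCI), a fully Bayesian framework for integrating causal discovery and reasoning with experimental design. The basic approach is to place a Bayesian prior over the chosen class of causal models and to cast the learning problem as Bayesian inference over the model posterior. Given the unobserved causal model, we formalize causal reasoning by introducing a target causal query.
Following the Bayesian optimal experimental design approach [10,42], we then select, based on our current beliefs, the admissible intervention that is most informative about our target query with respect to the true causal model. Given the observed data, we then update our beliefs by computing the posterior over causal models and queries, and use them to design the next experiment.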
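The design-then-update loop described above can be illustrated with a deliberately tiny discrete example. Everything in it is an assumption made for illustration: two candidate graphs, binary outcomes, and made-up likelihood numbers, with the causal graph itself as the target query. ABCI as described in the abstract works with Gaussian-process models of nonlinear additive noise models, which this sketch does not attempt to reproduce.

```python
import math

# Hypothetical likelihoods P(outcome = 1 | causal model, intervention).
# Toy numbers chosen for illustration only, not taken from the paper.
LIK = {
    "X->Y": {"do(X=1)": 0.9, "do(Y=1)": 0.5},
    "Y->X": {"do(X=1)": 0.5, "do(Y=1)": 0.6},
}

def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution given as a dict."""
    return -sum(q * math.log(q) for q in dist.values() if q > 0)

def outcome_prob(model, intervention, outcome):
    p1 = LIK[model][intervention]
    return p1 if outcome == 1 else 1.0 - p1

def posterior(prior, intervention, outcome):
    """Bayes update of the belief over causal models after one experiment."""
    unnorm = {m: pm * outcome_prob(m, intervention, outcome) for m, pm in prior.items()}
    z = sum(unnorm.values())
    return {m: v / z for m, v in unnorm.items()}

def expected_information_gain(prior, intervention):
    """Prior entropy minus the expected posterior entropy over possible outcomes."""
    gain = entropy(prior)
    for outcome in (0, 1):
        p_out = sum(pm * outcome_prob(m, intervention, outcome) for m, pm in prior.items())
        if p_out > 0:
            gain -= p_out * entropy(posterior(prior, intervention, outcome))
    return gain

def choose_intervention(prior, candidates):
    """Pick the admissible intervention with the largest expected information gain."""
    return max(candidates, key=lambda a: expected_information_gain(prior, a))

prior = {"X->Y": 0.5, "Y->X": 0.5}
candidates = ["do(X=1)", "do(Y=1)"]
best = choose_intervention(prior, candidates)
updated = posterior(prior, best, outcome=1)  # pretend the experiment returned 1
```

With these toy numbers `do(X=1)` is chosen because the two graphs disagree most about its outcome, and observing outcome 1 shifts belief toward `X->Y`. In a sequential setting `updated` would serve as the prior for the next round.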
arXiv,
2022.
DOI: 10.48550/arXiv.2206.02063
Abstract:
>>>
Causal discovery and causal reasoning are classically treated as separate and consecutive tasks: one first infers the causal graph, and then uses it to estimate causal effects of interventions. However, such a two-stage approach is uneconomical, especially in terms of actively collected interventional data, since the causal query of interest may not require a fully-specified causal model. From a Bayesian perspective, it is also unnatural, since a causal query (e.g., the causal graph or some causal effect) can be viewed as a latent quantity subject to posterior inference -- other unobserved quantities that are not of direct interest (e.g., the full causal model) ought to be marginalized out in this process and contribute to our epistemic uncertainty. In this work, we propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning, which jointly infers a posterior over causal models and queries of interest. In our approach to ABCI, we focus on the class of causally-sufficient, nonlinear additive noise models, which we model using Gaussian processes. We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment. Through simulations, we demonstrate that our approach is more data-efficient than several baselines that only focus on learning the full causal graph. This allows us to accurately learn downstream causal queries from fewer samples while providing well-calibrated uncertainty estimates for the quantities of interest.
<<<