Literature Collection and Sharing Platform

尹志 (2022-05-30 13:31):

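#paper https://doi.org/10.48550/arXiv.1907.05600 Generative Modeling by Estimating Gradients of the Data Distribution, NeurIPS 2019 (Oral). Continuing with generative models: in this paper the authors propose a score-based generative model. Today's mainstream generative models fall roughly into two groups: likelihood-based models, and implicit models such as GANs that learn through an adversarial game without computing an explicit probability density function. Representatives of the former include VAEs and normalizing flows, and the model in this paper broadly belongs to that family as well, although it models the gradient of the log density rather than the density itself. In such models the normalizing constant Z involves an intractable integral and is very hard to compute, which has led to a variety of workarounds. The core idea here is to model and estimate the gradient of the probability density (more precisely, of the log probability density function) with respect to the data. This gradient of the log density is called the score function, and the authors estimate it with score matching. Because Z does not depend on the data, it vanishes when the gradient is taken, so it never has to be computed. Once the model is trained, samples are generated with Langevin dynamics. I am still working through some of the derivations and reproducing the code. It looks like an effective family of generative models: the generated images are of high quality, and improved versions can already rival GANs. The biggest problem at the moment is that it is GPU-hungry, extremely so. I hope to do some work later on improving its training and sampling efficiency.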
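As a minimal sketch of the sampling step mentioned above, the snippet below runs plain Langevin dynamics with the update x <- x + (step/2) * score(x) + sqrt(step) * z. It is not the paper's implementation: the score here is the known analytic score of a 1-D Gaussian rather than a trained network, and the names `gaussian_score` and `langevin_sample`, as well as the step size and step count, are illustrative assumptions.

```python
import numpy as np


def gaussian_score(x, mu=2.0, sigma=0.5):
    """Score of N(mu, sigma^2): d/dx log p(x) = -(x - mu) / sigma^2.

    Stand-in for a learned score network (an assumption for this toy example).
    """
    return -(x - mu) / sigma**2


def langevin_sample(score_fn, x0, step_size=0.01, n_steps=1000, rng=None):
    """Run unadjusted Langevin dynamics in parallel for every entry of x0."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Drift along the score, plus Gaussian noise scaled by sqrt(step_size).
        x = x + 0.5 * step_size * score_fn(x) + np.sqrt(step_size) * noise
    return x


rng = np.random.default_rng(0)
# Start far from the target, from uniform noise.
samples = langevin_sample(gaussian_score, x0=rng.uniform(-5, 5, size=5000), rng=rng)
print("sample mean:", samples.mean(), "sample std:", samples.std())
```

With a small step size the chains should settle close to the target mean and standard deviation. Only the score is needed, never the normalizing constant, which is the property the comment highlights.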

arXiv, 2019. DOI: 10.48550/arXiv.1907.05600

Generative Modeling by Estimating Gradients of the Data Distribution


Yang Song, Stefano Ermon

Abstract:

We introduce a new generative model where samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching. Because gradients can be ill-defined and hard to estimate when the data resides on low-dimensional manifolds, we perturb the data with different levels of Gaussian noise, and jointly estimate the corresponding scores, i.e., the vector fields of gradients of the perturbed data distribution for all noise levels. For sampling, we propose an annealed Langevin dynamics where we use gradients corresponding to gradually decreasing noise levels as the sampling process gets closer to the data manifold. Our framework allows flexible model architectures, requires no sampling during training or the use of adversarial methods, and provides a learning objective that can be used for principled model comparisons. Our models produce samples comparable to GANs on MNIST, CelebA and CIFAR-10 datasets, achieving a new state-of-the-art inception score of 8.87 on CIFAR-10. Additionally, we demonstrate that our models learn effective representations via image inpainting experiments.
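The annealed sampling procedure in the abstract can be sketched on a toy problem. The example assumes hypothetical 1-D "data" consisting of two equally likely points at -2 and +2, so that the score of the noise-perturbed distribution is available in closed form and no network has to be trained. The step-size rule alpha_i = eps * sigma_i^2 / sigma_L^2 follows the paper's annealed Langevin algorithm. The noise schedule, `eps`, the number of steps, and the function names are illustrative assumptions.

```python
import numpy as np

MEANS = np.array([-2.0, 2.0])  # hypothetical toy data: two equally likely points


def perturbed_score(x, sigma):
    """Score of the data perturbed by N(0, sigma^2) noise.

    The perturbed density is an equal mixture of N(-2, sigma^2) and
    N(+2, sigma^2), so its score is a responsibility-weighted average of the
    per-component Gaussian scores.
    """
    diff = x[:, None] - MEANS[None, :]            # shape (n, 2)
    logits = -0.5 * (diff / sigma) ** 2
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)
    return -(resp * diff).sum(axis=1) / sigma**2


def annealed_langevin(score_fn, x0, sigmas, eps=0.002, n_steps=100, rng=None):
    """Langevin dynamics run over a decreasing sequence of noise levels."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for sigma in sigmas:
        # Step size shrinks together with the noise level.
        alpha = eps * (sigma / sigmas[-1]) ** 2
        for _ in range(n_steps):
            z = rng.standard_normal(x.shape)
            x = x + 0.5 * alpha * score_fn(x, sigma) + np.sqrt(alpha) * z
    return x


rng = np.random.default_rng(0)
sigmas = np.geomspace(3.0, 0.1, num=10)   # large noise first, small noise last
samples = annealed_langevin(perturbed_score, rng.uniform(-5, 5, size=4000), sigmas, rng=rng)
print("fraction near +2:", np.mean(samples > 0))
```

At the largest noise level the two modes are merged into a single broad bump, so chains can move freely between them. As the noise shrinks, the chains settle around one of the two points, which is the intuition behind using gradually decreasing noise levels.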
