尹志 (2023-12-31 14:32):
#paper Consistency Models https://doi.org/10.48550/arXiv.2303.01469 Diffusion models are now the core technique behind generative AI, but their iterative sampling procedure makes generation slow, which becomes a real obstacle in practical applications. Consistency models (CM), an efficient improvement over standard diffusion models, build on the probability flow (PF) ODE trajectory and define a mapping from any point on that trajectory (which can be thought of as a step of the iterative evolution, i.e. any sampling timestep) back to its starting point, the original image. CMs make one-step sampling from diffusion models much higher quality, which has driven a wave of practical applications, including image editing and image inpainting; a large share of diffusion-based applications already use CMs. This is work from early this year by Yang Song together with Ilya Sutskever; all four authors are diffusion-model experts at OpenAI.
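As a minimal sketch of that trajectory-to-origin mapping (hypothetical code, not the authors' implementation: `network` stands in for any trained denoising network, and SIGMA_DATA / EPS follow the EDM-style parameterization the paper builds on):

```python
# Hypothetical sketch of a consistency function f_theta(x, t) that maps any
# point x_t on the PF ODE trajectory back to the trajectory's origin.
# SIGMA_DATA and EPS follow the EDM-style setup the paper adopts; `network`
# is a placeholder for a trained denoising network (e.g. a U-Net).
SIGMA_DATA = 0.5
EPS = 0.002  # smallest time; f(x, EPS) should return x itself

def c_skip(t):
    # tends to 1 as t -> EPS, so the skip connection dominates at the boundary
    return SIGMA_DATA**2 / ((t - EPS)**2 + SIGMA_DATA**2)

def c_out(t):
    # tends to 0 as t -> EPS, switching the network contribution off
    return SIGMA_DATA * (t - EPS) / (SIGMA_DATA**2 + t**2) ** 0.5

def consistency_fn(network, x, t):
    # Skip-connection parameterization: the boundary condition
    # f(x, EPS) = x holds by construction, not by training.
    return c_skip(t) * x + c_out(t) * network(x, t)
```

This parameterization is what lets a single network evaluation jump from any timestep straight to the data, instead of integrating the ODE step by step.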
Consistency Models
Abstract:
Diffusion models have significantly advanced the fields of image, audio, and video generation, but they depend on an iterative sampling process that causes slow generation. To overcome this limitation, we propose consistency models, a new family of models that generate high quality samples by directly mapping noise to data. They support fast one-step generation by design, while still allowing multistep sampling to trade compute for sample quality. They also support zero-shot data editing, such as image inpainting, colorization, and super-resolution, without requiring explicit training on these tasks. Consistency models can be trained either by distilling pre-trained diffusion models, or as standalone generative models altogether. Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step sampling, achieving the new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation. When trained in isolation, consistency models become a new family of generative models that can outperform existing one-step, non-adversarial generative models on standard benchmarks such as CIFAR-10, ImageNet 64x64 and LSUN 256x256.
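The "multistep sampling to trade compute for sample quality" mentioned in the abstract corresponds to alternating denoise and re-noise steps; a hedged sketch building on the `consistency_fn` above (the decreasing timestep list `taus` and the default `t_max` are illustrative, not the authors' exact code):

```python
import torch

def multistep_consistency_sampling(network, shape, taus, t_max=80.0, eps=0.002):
    # One-step generation: map pure noise at the largest time straight to data.
    x = torch.randn(shape) * t_max
    x = consistency_fn(network, x, t_max)
    # Optional refinement: re-noise to a smaller time tau, then map back.
    # Each extra step costs one network call but tends to improve quality.
    for tau in taus:  # decreasing times in (eps, t_max)
        z = torch.randn(shape)
        x = x + (tau**2 - eps**2) ** 0.5 * z
        x = consistency_fn(network, x, tau)
    return x
```

With an empty `taus` list this reduces to the one-step sampler the abstract describes; longer schedules spend more compute for better samples.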