🐼太真实
(2023-12-28 20:39):
#paper https://doi.org/10.48550/arXiv.2312.03701 , Self-conditioned Image Generation via Generating Representations
This paper introduces a new image generation framework called Representation-Conditioned Generation (RCG). Instead of relying on human annotations, RCG conditions generation on a self-supervised representation distribution: a pre-trained encoder maps the image distribution to a representation distribution, a representation diffusion model (RDM) samples from that distribution, and a pixel generator then produces an image conditioned on the sampled representation. On ImageNet 256×256, RCG achieves an FID of 3.31 and an IS of 253.4. This not only substantially improves the state of the art in class-unconditional image generation but is also competitive with leading class-conditional methods, bridging the long-standing performance gap between the two tasks.
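The two-stage sampling pipeline described above can be sketched roughly as follows. This is a toy illustration, not the authors' code (which is at https://github.com/LTH14/rcg): `sample_representation` stands in for the RDM and `pixel_generator` for the conditioned pixel generator, with random placeholder computations and made-up dimensions.

```python
import numpy as np

REP_DIM = 256             # assumed dimensionality of the self-supervised representation
IMG_SHAPE = (256, 256, 3)  # ImageNet 256x256 RGB

def sample_representation(rng, steps=10):
    """Toy stand-in for the RDM: start from Gaussian noise and
    iteratively 'denoise' toward a representation sample."""
    z = rng.standard_normal(REP_DIM)
    for _ in range(steps):
        z = z - 0.1 * z  # placeholder denoising update, not real diffusion
    return z

def pixel_generator(rep, rng):
    """Toy stand-in for the pixel generator: produce an image
    conditioned on the sampled representation."""
    assert rep.shape == (REP_DIM,)
    img = rng.standard_normal(IMG_SHAPE) * 0.01 + rep.mean()
    return np.clip(img, -1.0, 1.0)

rng = np.random.default_rng(0)
rep = sample_representation(rng)   # stage 1: sample a representation (RDM)
img = pixel_generator(rep, rng)    # stage 2: craft pixels conditioned on it
print(img.shape)
```

The key design point is that the pixel generator never sees a class label; the sampled representation alone supplies the semantic guidance, which is why RCG works without human annotations.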
arXiv, 2023. DOI: 10.48550/arXiv.2312.03701
Self-conditioned Image Generation via Generating Representations
Tianhong Li, Dina Katabi, Kaiming He
Abstract:
This paper presents Representation-Conditioned Generation (RCG), a simple yet effective image generation framework which sets a new benchmark in class-unconditional image generation. RCG does not condition on any human annotations. Instead, it conditions on a self-supervised representation distribution which is mapped from the image distribution using a pre-trained encoder. During generation, RCG samples from such representation distribution using a representation diffusion model (RDM), and employs a pixel generator to craft image pixels conditioned on the sampled representation. Such a design provides substantial guidance during the generative process, resulting in high-quality image generation. Tested on ImageNet 256×256, RCG achieves a Frechet Inception Distance (FID) of 3.31 and an Inception Score (IS) of 253.4. These results not only significantly improve the state-of-the-art of class-unconditional image generation but also rival the current leading methods in class-conditional image generation, bridging the long-standing performance gap between these two tasks. Code is available at https://github.com/LTH14/rcg.