响马读paper

An academic exchange community that requires each member to read at least one paper a month and check in

2019, arXiv. DOI: 10.48550/arXiv.1902.04601 arXiv ID: 1902.04601
Contrastive Variational Autoencoder Enhances Salient Features
Abubakar Abid, James Zou
Abstract:
Variational autoencoders are powerful algorithms for identifying dominant
latent structure in a single dataset. In many applications, however, we are
interested in modeling latent structure and variation that are enriched in a
target dataset compared to some background, e.g., enriched in patients compared
to the general population. Contrastive learning is a principled framework to
capture such enriched variation between the target and background, but
state-of-the-art contrastive methods are limited to linear models. In this
paper, we introduce the contrastive variational autoencoder (cVAE), which
combines the benefits of contrastive learning with the power of deep generative
models. The cVAE is designed to identify and enhance salient latent features.
The cVAE is trained on two related but unpaired datasets, one of which has
minimal contribution from the salient latent features. The cVAE explicitly
models latent features that are shared between the datasets, as well as those
that are enriched in one dataset relative to the other, which allows the
algorithm to isolate and enhance the salient latent features. The algorithm is
straightforward to implement, has a similar run-time to the standard VAE, and
is robust to noise and dataset purity. We conduct experiments across diverse
types of data, including gene expression and facial images, showing that the
cVAE effectively uncovers latent structure that is salient in a particular
analysis.
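
The structure the abstract describes can be sketched roughly as follows. This is a minimal NumPy sketch with untrained random weights, not the authors' implementation. It assumes, as I understand the paper's setup, that one set of encoders and one decoder is shared across both datasets, and that background samples are decoded with the salient variables fixed to zero. The layer sizes, the helper names (`init_mlp`, `mlp`, `sample`, `cvae_loss`), and the squared-error reconstruction term are illustrative assumptions, and any additional regularizers are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, Z_DIM, S_DIM = 8, 16, 3, 2  # hypothetical sizes: input, hidden, shared, salient

def init_mlp(n_in, n_hidden, n_out):
    """Random weights for a one-hidden-layer network (untrained, illustration only)."""
    return {
        "w1": rng.normal(scale=0.1, size=(n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "w2": rng.normal(scale=0.1, size=(n_hidden, n_out)), "b2": np.zeros(n_out),
    }

def mlp(p, x):
    return np.tanh(x @ p["w1"] + p["b1"]) @ p["w2"] + p["b2"]

enc_z = init_mlp(D, H, 2 * Z_DIM)    # encoder for shared features z (mean, log-variance)
enc_s = init_mlp(D, H, 2 * S_DIM)    # encoder for salient features s (mean, log-variance)
dec = init_mlp(Z_DIM + S_DIM, H, D)  # single decoder used for both datasets

def sample(params, x):
    """Reparameterized sample plus KL divergence to a standard normal prior."""
    mu, logvar = np.split(mlp(params, x), 2, axis=1)
    latent = mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)
    return latent, kl

def cvae_loss(target, background):
    # Target samples use both shared (z) and salient (s) latent variables.
    z_t, kl_z_t = sample(enc_z, target)
    s_t, kl_s_t = sample(enc_s, target)
    recon_t = mlp(dec, np.concatenate([z_t, s_t], axis=1))
    # Background samples use z only; the salient slots are filled with zeros.
    z_b, kl_z_b = sample(enc_z, background)
    zeros = np.zeros((len(z_b), S_DIM))
    recon_b = mlp(dec, np.concatenate([z_b, zeros], axis=1))
    # Squared error stands in for the reconstruction term (an assumption).
    err_t = np.sum((target - recon_t) ** 2, axis=1)
    err_b = np.sum((background - recon_b) ** 2, axis=1)
    return np.mean(err_t + kl_z_t + kl_s_t) + np.mean(err_b + kl_z_b)

target = rng.normal(size=(5, D))      # toy stand-ins for the two unpaired datasets
background = rng.normal(size=(4, D))
loss = cvae_loss(target, background)
print(f"untrained cVAE-style loss: {loss:.3f}")
```

The two datasets need not be paired or the same size: each contributes its own term to the loss, and only the target term involves the salient variables.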
2023-11-30 16:34:00
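#paper Contrastive Variational Autoencoder Enhances Salient Features, arXiv, 2019 https://arxiv.org/abs/1902.04601 The recently proposed contrastive PCA borrows the idea of contrastive learning: it captures the differences between a target dataset and a background dataset, giving an unsupervised dimensionality reduction that preserves the contrastive signal. Like PCA, however, contrastive PCA reduces dimensionality only through linear combinations of variables, so it cannot capture nonlinear relationships among them. This paper extends contrastive PCA by using a variational autoencoder (VAE) to capture nonlinear relationships; the resulting method is called the contrastive VAE (cVAE). By explicitly modeling both the features shared between the datasets and the features enriched in the target data, the cVAE isolates and enhances the salient latent features of the target data. Its run time is similar to that of a standard VAE, and it is fairly robust to noise and to dataset purity. The paper validates the method's ability to capture salient latent features on several datasets (for example, MNIST handwritten digits), where it also improves consistently on a conventional VAE. As a generative model, once trained it can also use these salient latent features to generate new data.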
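
For comparison, the linear baseline mentioned in this note, contrastive PCA, can be sketched in a few lines: it takes the top eigenvectors of the target covariance minus a weighted background covariance. The snippet below is a minimal sketch on synthetic data, not code from the paper. The function names, the contrast weight `alpha=1.0`, and the toy data (a high-variance axis shared by both datasets, plus a two-cluster signal present only in the target) are all assumptions made for illustration.

```python
import numpy as np

def top_directions(matrix, n_components):
    """Eigenvectors of a symmetric matrix, sorted by descending eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(matrix)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]

def contrastive_pca(target, background, alpha=1.0, n_components=1):
    """Directions with high target variance and low background variance."""
    c_target = np.cov(target, rowvar=False)
    c_background = np.cov(background, rowvar=False)
    return top_directions(c_target - alpha * c_background, n_components)

rng = np.random.default_rng(0)
n = 2000
scales = np.array([5.0, 1.0, 1.0])  # axis 0: large variance shared by both datasets
background = rng.normal(size=(n, 3)) * scales
target = rng.normal(size=(n, 3)) * scales
target[:, 1] += rng.choice([-3.0, 3.0], size=n)  # axis 1: signal enriched in target only

pca_dir = top_directions(np.cov(target, rowvar=False), 1)[:, 0]
cpca_dir = contrastive_pca(target, background, alpha=1.0)[:, 0]
print("PCA  top direction:", np.round(pca_dir, 2))
print("cPCA top direction:", np.round(cpca_dir, 2))
```

On this toy data, plain PCA should follow the shared high-variance axis, while the contrastive version should pick the axis that carries the target-only signal. Both remain linear projections, which is the limitation the cVAE is meant to lift.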