Literature shared by user 尹志.
A total of 46 shared papers were found; this page shows items 41 - 46.
41.
尹志
(2022-05-30 13:31):
#paper https://doi.org/10.48550/arXiv.1907.05600 Generative Modeling by Estimating Gradients of the Data Distribution. NeurIPS 2019 (Oral). More generative models: in this paper the authors propose a score-based generative model. Mainstream generative models today roughly split into likelihood-based models and implicit models such as GANs, which are trained adversarially without computing an explicit probability density. Representatives of the former include VAEs and normalizing flows, and the model in this paper also belongs to that category. For this family of models, the normalizing constant Z is hard to compute because it requires integrating the unnormalized density, which has spawned a whole range of workarounds. The core idea here is instead to model and estimate the gradient of the probability density (more precisely, the gradient of the log-density). This gradient of the log-density is defined as the score function, and the authors estimate it via score matching. Once the model is trained, sampling, i.e. generating new data, is done with Langevin dynamics. I am still working through some of the details and reproducing the code, but this feels like a rather effective class of generative models: the generated images are of high quality, and improved versions can already rival GANs. The biggest problem for now is that it is a GPU hog, a serious GPU hog; I hope to do some work later on improving its training and sampling efficiency.
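To make the sampling step concrete, here is a minimal sketch of the annealed Langevin dynamics described in the paper's abstract. The score network `score_fn`, the noise schedule `sigmas`, and the step-size constant `eps` are placeholders I have assumed for illustration, not the authors' released implementation.

```python
import torch

def annealed_langevin_sampling(score_fn, sigmas, shape, n_steps=100, eps=2e-5):
    """Annealed Langevin dynamics (sketch).

    score_fn(x, sigma) is assumed to approximate the score
    grad_x log p_sigma(x) of the sigma-perturbed data distribution.
    sigmas should be ordered from largest to smallest noise level.
    """
    x = torch.rand(shape)  # start from an uninformative initialization
    for sigma in sigmas:
        step = eps * (sigma / sigmas[-1]) ** 2  # smaller steps at lower noise levels
        for _ in range(n_steps):
            noise = torch.randn_like(x)
            # Langevin update: drift along the score plus injected Gaussian noise
            x = x + 0.5 * step * score_fn(x, sigma) + (step ** 0.5) * noise
    return x
```

With a trained noise-conditional score network, something like `annealed_langevin_sampling(score_fn, sigmas=[1.0, 0.6, 0.3, 0.1, 0.01], shape=(16, 3, 32, 32))` would produce a batch of image-shaped samples.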
arXiv, 2019. DOI: 10.48550/arXiv.1907.05600
Abstract:
We introduce a new generative model where samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching. Because gradients can be ill-defined and hard to estimate when the data resides on low-dimensional manifolds, we perturb the data with different levels of Gaussian noise, and jointly estimate the corresponding scores, i.e., the vector fields of gradients of the perturbed data distribution for all noise levels. For sampling, we propose an annealed Langevin dynamics where we use gradients corresponding to gradually decreasing noise levels as the sampling process gets closer to the data manifold. Our framework allows flexible model architectures, requires no sampling during training or the use of adversarial methods, and provides a learning objective that can be used for principled model comparisons. Our models produce samples comparable to GANs on MNIST, CelebA and CIFAR-10 datasets, achieving a new state-of-the-art inception score of 8.87 on CIFAR-10. Additionally, we demonstrate that our models learn effective representations via image inpainting experiments.
42.
尹志
(2022-04-28 22:10):
#paper https://doi.org/10.48550/arXiv.1503.03585 Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML 2015. A paper I have not yet fully digested, but a very interesting one. The diffusion model introduced here may not be familiar to everyone, but OpenAI's recently very popular DALL-E 2 probably is, and the earliest inspiration for DALL-E 2 traces back to this work. Inspired by non-equilibrium thermodynamics, the paper designs a generative model it calls a diffusion model. Estimating the distribution of a pile of data is a highly challenging problem in machine learning, especially if one wants the model to be both flexible and tractable. If the task is framed as modeling the mapping from a latent variable z to the observation x, the idea of the diffusion model is to assume that this mapping is a Markov chain (MC): starting from the data, Gaussian noise is added step by step until some terminal state is reached, and, run in reverse, the denoising process can then be viewed as the generative process. We train on this Markov chain, and its reverse then serves as a generative model producing data that follows the target distribution. Yes, it is very much like a VAE. Given that successive refinements of this family of generative models have reached the level of DALL-E 2, it is worth understanding the underlying mechanism in depth, and asking whether it can yield better results for data synthesis.
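As a rough illustration of the forward half of that Markov chain, here is a minimal sketch of the step-by-step Gaussian noising; the variance schedule `betas` is an assumed placeholder rather than the paper's setting, and the learned reverse (denoising) process is omitted.

```python
import torch

def forward_diffusion(x0, betas):
    """Forward (noising) Markov chain of a diffusion model (sketch).

    Each step applies q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I).
    betas is an assumed variance schedule, e.g. torch.linspace(1e-4, 0.02, 1000).
    """
    x = x0
    trajectory = [x0]
    for beta in betas:
        noise = torch.randn_like(x)
        # shrink the signal slightly and mix in a little Gaussian noise
        x = torch.sqrt(1.0 - beta) * x + torch.sqrt(beta) * noise
        trajectory.append(x)
    return trajectory  # the final state is close to pure Gaussian noise
```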
arXiv, 2015. DOI: 10.48550/arXiv.1503.03585
Abstract:
A central problem in machine learning involves modeling complex data-sets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation are still analytically or computationally tractable. Here, we develop an approach that simultaneously achieves both flexibility and tractability. The essential idea, inspired by non-equilibrium statistical physics, is to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process. We then learn a reverse diffusion process that restores structure in data, yielding a highly flexible and tractable generative model of the data. This approach allows us to rapidly learn, sample from, and evaluate probabilities in deep generative models with thousands of layers or time steps, as well as to compute conditional and posterior probabilities under the learned model. We additionally release an open source reference implementation of the algorithm.
43.
尹志
(2022-03-25 14:10):
#paper doi:10.1109/CVPR.2015.7298682, 2015, FaceNet: A unified embedding for face recognition and clustering. A classic paper in face recognition, written by Google and published at CVPR 2015. It reached 99.63% accuracy on the LFW dataset and 95.12% on YouTube Faces DB, the state of the art at the time. Although it is about faces, the idea carries over to a great many settings, including all sorts of image recognition problems and even natural language processing. The paper introduces an end-to-end training scheme that models the embedding space directly: each face image is mapped to a point in the embedding space such that faces of the same identity end up close together and faces of different identities end up far apart. The embedding can therefore be treated as a feature extractor, providing an efficient precomputation for downstream face verification, recognition, and clustering. The network architecture itself is straightforward, mainly the then-novel Inception network; the interesting part is the loss. The paper introduces the triplet loss, which computes distances over anchor-positive and anchor-negative pairs, where the anchor is an image of a given identity, the positive is another image of the same identity, and the negative is an image of a different identity. The idea is simple: train so that the anchor-positive distance becomes small and the anchor-negative distance becomes large. Mathematically, the loss is the anchor-positive distance minus the anchor-negative distance plus a margin alpha, which can be understood as a constraint that keeps faces of the same identity on a common manifold while preserving the metric. In practice, how the triplets are selected also matters a great deal; see the paper if interested. The paper and its architecture are old, but its simple idea and strong results have inspired a huge amount of later recognition work, in both research and industrial practice; anyone who has worked with word2vec will feel the resonance.
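For concreteness, here is a minimal PyTorch-style sketch of the triplet loss described above, written as the hinge form over squared distances with margin alpha; the embeddings are assumed to be L2-normalized, the margin value is illustrative, and the triplet-mining strategy from the paper is not shown.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, alpha=0.2):
    """max(0, d(anchor, positive) - d(anchor, negative) + alpha), averaged over the batch.

    anchor, positive, negative: embedding tensors of shape (batch, dim),
    assumed L2-normalized; alpha is the margin (0.2 here is illustrative).
    """
    d_ap = (anchor - positive).pow(2).sum(dim=1)  # squared anchor-positive distance
    d_an = (anchor - negative).pow(2).sum(dim=1)  # squared anchor-negative distance
    return F.relu(d_ap - d_an + alpha).mean()     # only violating triplets contribute
```

PyTorch also ships a built-in torch.nn.TripletMarginLoss, which implements essentially the same idea using unsquared distances.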
Abstract:
No abstract available.
44.
尹志
(2022-02-08 23:23):
#paper doi: 10.7554/eLife.58906 Anna A Ivanova, et al. Comprehension of computer code relies primarily on domain-general executive brain regions. eLife 2020;9:e58906 (2020). I came across this neuroscience study cited in a small programming book I was reading. It asks what cognitive and neural mechanisms actually support programming as a cognitive activity. Using fMRI, the researchers examined two brain systems: 1. the multiple demand (MD) system, typically recruited for math, logic, and problem solving; 2. the language system, typically recruited for language processing. Participants worked with two kinds of programming, Python (text-based) and ScratchJr (graphical), on code problems and on content-matched sentence problems. The authors found that the MD system responded strongly to code in both kinds of programming, whereas the language system responded strongly only to the content-matched sentence problems and only weakly to code. To some extent this shows that programming is a cognitive activity more akin to problem solving or doing math: even though code is usually written as text and we habitually speak of programming "languages", the brain mechanisms that process it do not, experimentally, appear to map onto ordinary language processing.
Abstract:
Computer programming is a novel cognitive tool that has transformed modern society. What cognitive and neural mechanisms support this skill? Here, we used functional magnetic resonance imaging to investigate two candidate brain systems: the multiple demand (MD) system, typically recruited during math, logic, problem solving, and executive tasks, and the language system, typically recruited during linguistic processing. We examined MD and language system responses to code written in Python, a text-based programming language (Experiment 1) and in ScratchJr, a graphical programming language (Experiment 2); for both, we contrasted responses to code problems with responses to content-matched sentence problems. We found that the MD system exhibited strong bilateral responses to code in both experiments, whereas the language system responded strongly to sentence problems, but weakly or not at all to code problems. Thus, the MD system supports the use of novel cognitive tools even when the input is structurally similar to natural language.
45.
尹志
(2022-01-31 12:53):
#paper doi:10.1038/nature14539 LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436-444 (2015). A Nature review written in 2015 by the three giants of deep learning, one of a series of review papers Nature published to mark 60 years of AI. As founders of the field, they explain the concepts, principles, and applications of deep learning with remarkable clarity. The article goes from supervised learning through to backpropagation and mainly reviews the principles and applications of CNNs and RNNs; it is well suited for beginners who want an overall picture of deep learning as it stood then. In the closing section on the future of deep learning, the authors place high hopes on unsupervised learning, and looking at the past few years, with self-supervised learning, vigorously championed by Yann LeCun, having become a mainstream topic, that long-held hope has indeed found its echo.
Abstract:
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
46.
尹志
(2022-01-18 23:37):
#paper doi:10.1038/s41416-020-01122-x Deep learning in cancer pathology: a new generation of clinical biomarkers. British Journal of Cancer, 2020 Nov 18. This is a review of how deep learning can extract biomarkers directly from pathology images, and of the basic and advanced image-analysis tasks deep learning is used for in pathology.
Clinical cancer care relies on a range of molecular biomarkers, but obtaining them is time-consuming and labor-intensive, and they generally require tumour tissue. Yet tumour tissue carries plenty of information that we currently do not exploit well. With deep learning, we can extract more information directly from routine pathology images and thereby provide potentially clinically useful information.
The basic tasks covered include detection, grading, and subtyping of tumour tissue. They aim to automate the pathology workflow, but their outputs do not translate directly into clinical decisions (in other words, assisted diagnosis).
The advanced tasks can directly affect clinical decisions: inference of molecular features, survival prediction, and end-to-end prediction of therapy response. All of these can influence clinical decision-making, but they still need better clinical validation, for example more prospective trials (in other words, not ready for clinical use yet).
Abstract:
Clinical workflows in oncology rely on predictive and prognostic molecular biomarkers. However, the growing number of these complex biomarkers tends to increase the cost and time for decision-making in routine daily oncology practice; furthermore, biomarkers often require tumour tissue on top of routine diagnostic material. Nevertheless, routinely available tumour tissue contains an abundance of clinically relevant information that is currently not fully exploited. Advances in deep learning (DL), an artificial intelligence (AI) technology, have enabled the extraction of previously hidden information directly from routine histology images of cancer, providing potentially clinically useful information. Here, we outline emerging concepts of how DL can extract biomarkers directly from histology images and summarise studies of basic and advanced image analysis for cancer histology. Basic image analysis tasks include detection, grading and subtyping of tumour tissue in histology images; they are aimed at automating pathology workflows and consequently do not immediately translate into clinical decisions. Exceeding such basic approaches, DL has also been used for advanced image analysis tasks, which have the potential of directly affecting clinical decision-making processes. These advanced approaches include inference of molecular features, prediction of survival and end-to-end prediction of therapy response. Predictions made by such DL systems could simplify and enrich clinical decision-making, but require rigorous external validation in clinical settings.