A total of 1194 paper shares were found; this page shows items 1021 - 1040.
1021.
尹志
(2022-06-27 08:22):
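#paper doi:10.1016/j.tics.2021.11.008 Trends in Cognitive Sciences, Vol 26, Issue 2, 2022, Next-generation deep learning based on simulators and synthetic data. Mainstream deep learning applications today rely mostly on supervised learning, which needs large amounts of labeled data. Because such data are costly and slow to obtain, this has become a bottleneck for deep learning. One possible solution is to make full use of synthetic data, and this paper reviews that topic. It groups the sources of synthetic data into three types: data produced by rendering, that is, by modeling and rendering pipelines; data produced by generative models; and data produced by fusion models. In more detail, the first type comes from simulating a modeling process and has a solid physical grounding and workflow; the second comes from statistically motivated generative models that estimate the data distribution; the third fuses data from different domains, for example combining a foreground domain with a background domain in various ways. Since many gaps remain between synthetic and real data, techniques such as domain adaptation keep developing so that synthetic data can be used more effectively. In addition, these data generation schemes borrow heavily from how humans learn naturally, which has led to a two-way trend: data synthesis keeps borrowing features of natural learning, while research on data synthesis in turn advances our understanding of biological systems. Finally, the paper summarizes the characteristics and challenges of using synthetic data for scientific exploration, physics research, multimodal learning and other areas. This part is very concise; readers interested in these topics can follow the references for more, and they are valuable research leads.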
Abstract:
Deep learning (DL) is being successfully applied across multiple domains, yet these models learn in a most artificial way: they require large quantities of labeled data to grasp even simple concepts. Thus, the main bottleneck is often access to supervised data. Here, we highlight a trend in a potential solution to this challenge: synthetic data. Synthetic data are becoming accessible due to progress in rendering pipelines, generative adversarial models, and fusion models. Moreover, advancements in domain adaptation techniques help close the statistical gap between synthetic and real data. Paradoxically, this artificial solution is also likely to enable more natural learning, as seen in biological systems, including continual, multimodal, and embodied learning. Complementary to this, simulators and deep neural networks (DNNs) will also have a critical role in providing insight into the cognitive and neural functioning of biological systems. We also review the strengths of, and opportunities and novel challenges associated with, synthetic data.
1022.
颜林林
(2022-06-27 00:24):
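#paper doi:10.3390/diagnostics12061493 Diagnostics, 2022, MixPatch: A New Method for Training Histopathology Image Classifiers. In pathology image analysis, the raw whole-slide data are too large (typically 50,000 x 50,000 pixels) to feed directly into a DNN, so the slide is usually cut into many patches that are analyzed one by one (for training or prediction). Each patch is typically annotated by a pathologist to determine its clinical feature (for example, whether it is a malignant tumor region), and that feature is usually a binary yes or no. In practice, however, many patches contain both benign and malignant regions, and such "uncertain" patches cause misclassification and loss of model performance. This study uses minimal patches (128 x 128 pixels, regarded by pathologists as the smallest recognizable region) to obtain a "clean" gold-standard dataset. On that basis it merges adjacent minimal patches (usually 9 or 16, that is, 3 x 3 or 4 x 4) into a "mixed patch" and, from the original information of the component patches, assigns a confidence estimate to the result for the mixed patch. This is essentially a fuzzy-set idea. Doing so improved the performance of pathology analysis and also gave better results in slide-level prediction.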
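The mixing step described in this note can be sketched roughly as follows. This is a minimal illustration, assuming toy 2 x 2 patches with binary labels (0 = benign, 1 = malignant) and a soft label equal to the fraction of malignant component patches; the names `tile_patches` and `soft_label` are hypothetical, and the paper's exact labeling rule may differ.

```python
from typing import List, Sequence

Patch = List[List[int]]  # a tiny grayscale patch as rows of pixel values


def tile_patches(patches: Sequence[Patch], grid: int) -> Patch:
    """Stitch grid*grid equally sized patches (row-major order) into one mixed patch."""
    if len(patches) != grid * grid:
        raise ValueError("need exactly grid*grid patches")
    mixed: Patch = []
    for block_row in range(grid):
        row_patches = patches[block_row * grid:(block_row + 1) * grid]
        height = len(row_patches[0])
        for y in range(height):
            line: List[int] = []
            for patch in row_patches:
                line.extend(patch[y])
            mixed.append(line)
    return mixed


def soft_label(labels: Sequence[int]) -> float:
    """Assumed proportion-based soft label: fraction of malignant (1) patches."""
    return sum(labels) / len(labels)


# Nine toy 2x2 patches; the pixel value simply encodes the patch label.
labels = [0, 0, 1, 0, 1, 1, 0, 0, 0]
patches = [[[lab, lab], [lab, lab]] for lab in labels]
mixed = tile_patches(patches, grid=3)  # a 6x6 mixed patch
target = soft_label(labels)            # 3 of 9 patches are malignant
```

Training against `target` instead of a hard 0/1 label is what lets the classifier express intermediate confidence on mixed regions.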
Abstract:
CNN-based image processing has been actively applied to histopathological analysis to detect and classify cancerous tumors automatically. However, CNN-based classifiers generally predict a label with overconfidence, which becomes a serious problem in the medical domain. The objective of this study is to propose a new training method, called MixPatch, designed to improve a CNN-based classifier by specifically addressing the prediction uncertainty problem and examine its effectiveness in improving diagnosis performance in the context of histopathological image analysis. MixPatch generates and uses a new sub-training dataset, which consists of mixed-patches and their predefined ground-truth labels, for every single mini-batch. Mixed-patches are generated using a small size of clean patches confirmed by pathologists while their ground-truth labels are defined using a proportion-based soft labeling method. Our results obtained using a large histopathological image dataset shows that the proposed method performs better and alleviates overconfidence more effectively than any other method examined in the study. More specifically, our model showed 97.06% accuracy, an increase of 1.6% to 12.18%, while achieving 0.76% of expected calibration error, a decrease of 0.6% to 6.3%, over the other models. By specifically considering the mixed-region variation characteristics of histopathology images, MixPatch augments the extant mixed image methods for medical image analysis in which prediction uncertainty is a crucial issue. The proposed method provides a new way to systematically alleviate the overconfidence problem of CNN-based classifiers and improve their prediction accuracy, contributing toward more calibrated and reliable histopathology image analysis.
1023.
颜林林
(2022-06-26 22:13):
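#paper doi:10.1371/journal.pcbi.1009730 PLOS Computational Biology, 2022, Improved transcriptome assembly using a hybrid of long and short reads with StringTie. This paper from Johns Hopkins presents a tool that assembles transcriptomes from a mix of long-read and short-read sequencing data. In high-throughput sequencing, short-read platforms are highly accurate, but the reads are too short to cover full transcripts. Long-read platforms can span multiple exons and help determine how a transcript is spliced, but their lower base accuracy easily causes alignment errors that affect transcript identification. The paper shows the "noisy" alignments caused by sequencing errors and the large increase in search space that results. By solving a maximum-flow problem from graph theory and using the more accurate short-read data locally around noisy alignments to pin down the correct splice sites, the tool combines the strengths of both platforms, and it runs no slower than earlier tools that use a single data type. To evaluate the tool, the authors used simulated data as well as several real datasets from Arabidopsis thaliana, mouse and human; in assembly precision and in the number of output transcripts that can be correctly annotated, it performed better, as expected.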
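To make the maximum-flow idea concrete, here is a small generic sketch (Edmonds-Karp) on a toy splice graph. It is not StringTie's implementation: the node names, the capacities (standing in for read coverage) and the function `max_flow` are illustrative assumptions only.

```python
from collections import deque
from typing import Dict, Optional

Graph = Dict[str, Dict[str, int]]


def max_flow(capacity: Graph, source: str, sink: str) -> int:
    """Edmonds-Karp: push flow along shortest augmenting paths until none remain."""
    # Residual capacities, with reverse edges initialised to 0.
    residual: Graph = {u: dict(vs) for u, vs in capacity.items()}
    for u, vs in capacity.items():
        for v in vs:
            residual.setdefault(v, {}).setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent: Dict[str, Optional[str]] = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path.
        bottleneck = None
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap = residual[u][v]
            bottleneck = cap if bottleneck is None else min(bottleneck, cap)
            v = u
        # Update residual capacities along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Toy splice graph: edge capacities stand in for read coverage (made-up numbers).
splice_graph: Graph = {
    "source": {"exon1": 10},
    "exon1": {"exon2": 6, "exon3": 4},
    "exon2": {"exon3": 6},
    "exon3": {"sink": 8},
}
flow = max_flow(splice_graph, "source", "sink")
```

In this toy graph the edge into the sink limits the total, so the two paths (with and without `exon2`) share a flow of 8.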
Abstract:
Short-read RNA sequencing and long-read RNA sequencing each have their strengths and weaknesses for transcriptome assembly. While short reads are highly accurate, they are rarely able to span multiple exons. Long-read technology can capture full-length transcripts, but its relatively high error rate often leads to mis-identified splice sites. Here we present a new release of StringTie that performs hybrid-read assembly. By taking advantage of the strengths of both long and short reads, hybrid-read assembly with StringTie is more accurate than long-read only or short-read only assembly, and on some datasets it can more than double the number of correctly assembled transcripts, while obtaining substantially higher precision than the long-read data assembly alone. Here we demonstrate the improved accuracy on simulated data and real data from Arabidopsis thaliana, Mus musculus, and human. We also show that hybrid-read assembly is more accurate than correcting long reads prior to assembly while also being substantially faster. StringTie is freely available as open source software at https://github.com/gpertea/stringtie.
1024.
颜林林
(2022-06-25 20:26):
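#paper doi:10.3390/s22124409 Sensors, 2022, Deep Neural Networks Applied to Stock Market Sentiment Analysis. This paper from Portugal on applying deep learning came to me through a PubMed alert (PMID:35746192). It describes how to use deep neural networks to infer the sentiment class (positive or negative) of text from social sites (Twitter, Reddit and others), and then uses the sentiment results in a simulated investment to evaluate the return on investment. The paper does not offer much novelty, but it carefully explains the principles, implementation and evaluation of the DL techniques, so it reads somewhat like a tutorial. The stock market and investment content, by contrast, feels disconnected, as if bolted on, because the performance evaluation of the deep models is still carried out only on sentiment classification. In the outlook at the end, the authors mention that they plan to introduce data streaming technology so that the analysis can run in real time, which may point to more suitable new application scenarios.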
Abstract:
The volume of data is growing exponentially and becoming more valuable to organizations that collect it, from e-commerce data, shipping, audio and video logs, text messages, internet search queries, stock market activity, financial transactions, the Internet of Things, and various other sources. The major challenges are related with the way to extract insights from such a rich data environment and whether Deep Learning can be successful with Big Data. To get some insight on these topics, social network data are employed as a case study on how sentiments can affect decisions in stock market environments. In this paper, we propose a generalized Deep Learning-based classification framework for Stock Market Sentiment Analysis. This work comprises the study, the development, and implementation of an automatic classification system based on Deep Learning and the validation of its adequacy and efficiency in any scenario, particularly Stock Market Sentiment Analysis. Distinct datasets and several Deep Learning approaches with different layers and embedded techniques are used, and their performances are evaluated. These developments show how Deep Learning reacts to distinct contexts. The results also give context on how different techniques with different parameter combinations react to certain types of data. Convolution obtained the best results when dealing with complex data inputs, and long short-term layers kept a memory of data, allowing inputs which are not as common to still be considered for decisions. The models that resulted from Stock Market Sentiment Analysis datasets were applied with some success to real-life problems. The best models reached accuracies of 73% in training and 69% in certain test datasets. In a simulation, a model was able to provide a Return on Investment of 4.4%. The results contribute to understanding how to process Big Data efficiently using Deep Learning and specialized hardware techniques.
1025.
张浩彬
(2022-06-25 15:38):
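#paper doi:10.1007/s11356-021-17442-1, A systematic literature review of deep learning neural network for time series air quality forecasting
A 2021 paper on deep learning for air pollutant forecasting. It gives a fairly complete summary, from a deep learning perspective, of methods for air pollution forecasting, covering four areas: single models, hybrid models, spatiotemporal networks, and deep learning forecasts combined with series decomposition. It discusses and summarizes the relevant papers in each area in reasonable detail. The one shortcoming is that the authors spend little space comparing the four areas with one another.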
Environmental science and pollution research international,
2022-Jan.
DOI: 10.1007/s11356-021-17442-1
PMID: 34807385
Abstract:
Rapid progress of industrial development, urbanization and traffic has caused air quality reduction that negatively affects human health and environmental sustainability, especially among developed countries. Numerous studies on the development of air quality forecasting model using machine learning have been conducted to control air pollution. As such, there are significant numbers of reviews on the application of machine learning in air quality forecasting. Shallow architectures of machine learning exhibit several limitations and yield lower forecasting accuracy than deep learning architecture. Deep learning is a new technology in computational intelligence; thus, its application in air quality forecasting is still limited. This study aims to investigate the deep learning applications in time series air quality forecasting. Owing to this, literature search is conducted thoroughly from all scientific databases to avoid unnecessary clutter. This study summarizes and discusses different types of deep learning algorithms applied in air quality forecasting, including the theoretical backgrounds, hyperparameters, applications and limitations. Hybrid deep learning with data decomposition, optimization algorithm and spatiotemporal models are also presented to highlight those techniques' effectiveness in tackling the drawbacks of individual deep learning models. It is clearly stated that hybrid deep learning was able to forecast future air quality with higher accuracy than individual models. At the end of the study, some possible research directions are suggested for future model development. The main objective of this review study is to provide a comprehensive literature summary of deep learning applications in time series air quality forecasting that may benefit interested researchers for subsequent research.
1026.
白义民
(2022-06-25 14:36):
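#paper 邓晓芒, 《语言的形上学原理》 (Metaphysical Principles of Language). Language is an everyday tool, and philosophical inquiry into language helps us handle that tool well. By pointing out two negative properties of language, self-negation and self-deception, the paper uses the perspective of language games to explain the cognitive limits of the conceptual, phenomenal world and the ineffability of absolute truth (ultimate truth, 胜义谛). It then turns to the poetic nature of language, where words fall short of meaning and meaning lies beyond words, to show how those cognitive limits can be transcended and how the meaning of the imagistic world of absolute truth can be grasped.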
清华大学学报(哲学社会科学版),
2022.
DOI: 10.13613/j.cnki.qhdz.003157
Abstract:
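In the metaphysics of language, three basic principles deserve the most attention. First, the self-negating essence of language: language is dialectical in essence, its "is" already implies "is not", and every "true statement" conceals a "lie"; otherwise it would not be language. Second, the self-deceiving function of language: conscious self-deception, or make-believe play, is the soul and life of language; it rests on the self-deceiving structure of human self-consciousness and at the same time gives that structure real confirmation. Third, the rhetorical or poetic character of language: all language arises from poetic nature, which is also the origin of the grammatical and logical functions of language. The metaphysics "after linguistics" that I envision sets out to examine the relation between the poetic function and the logical function of language; the two are not only in a "unity of opposites" but also in a dialectical progression of "self-negation", and this constitutes the most basic principle of "after linguistics".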
1027.
颜林林
(2022-06-24 21:32):
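#paper doi:10.1038/s41587-022-01294-2 Nature Biotechnology, 2022, The clinical progress of mRNA vaccines and immunotherapies. This is a long review of mRNA vaccines. The concept of using mRNA as a vehicle for vaccine development dates back to 1990. Instead of injecting the (inactivated or attenuated) pathogen or the target protein itself, it uses the recipient's own protein translation machinery to produce the target protein. This brings a series of advantages, such as simple design, inherent immunogenicity, and rapid mass production. It also has drawbacks or challenges, such as poor stability and difficulty delivering the vaccine to the target site in the body. In the three years since the COVID-19 outbreak, increased funding and emergency use authorizations have greatly accelerated the development, production and use of mRNA vaccines. The paper reviews these developments in some detail, including delivery methods, the development, use and optimization of vaccines against infectious diseases, vaccine approaches to cancer treatment, and the use of mRNA in protein and cellular immunotherapies, and on that basis discusses current problems and future research directions. Reading it through gives a fairly deep understanding of mRNA vaccines and their technical routes, and makes clear that this is an important technology with great potential that is worth further exploration and development.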
Abstract:
The emergency use authorizations (EUAs) of two mRNA-based severe acute respiratory syndrome coronavirus (SARS-CoV)-2 vaccines approximately 11 months after publication of the viral sequence highlights the transformative potential of this nucleic acid technology. Most clinical applications of mRNA to date have focused on vaccines for infectious disease and cancer for which low doses, low protein expression and local delivery can be effective because of the inherent immunostimulatory properties of some mRNA species and formulations. In addition, work on mRNA-encoded protein or cellular immunotherapies has also begun, for which minimal immune stimulation, high protein expression in target cells and tissues, and the need for repeated administration have led to additional manufacturing and formulation challenges for clinical translation. Building on this momentum, the past year has seen clinical progress with second-generation coronavirus disease 2019 (COVID-19) vaccines, Omicron-specific boosters and vaccines against seasonal influenza, Epstein-Barr virus, human immunodeficiency virus (HIV) and cancer. Here we review the clinical progress of mRNA therapy as well as provide an overview and future outlook of the transformative technology behind these mRNA-based drugs.
1028.
张德祥
(2022-06-23 09:27):
#paper https://doi.org/10.1016/j.biosystems.2022.104714 Neurons as hierarchies of quantum reference frames
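Concepts and models of neurons have lagged behind empirical data for decades, and the ideas that inspire today's neural networks come from models that are decades old. Artificial intelligence now badly needs inspiration from efficient biological neuron models. This paper uses tools from quantum information theory to extend current neuron models; in this representation, hierarchies of quantum reference frames act as hierarchical active-inference systems. Whether biological computation has anything to do with quantum effects is still much debated, and the paper also lists some supporting evidence. I look forward to efficient, biologically inspired neuron models.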
Abstract:
Conceptual and mathematical models of neurons have lagged behind empirical understanding for decades. Here we extend previous work in modeling biological systems with fully scale-independent quantum information-theoretic tools to develop a uniform, scalable representation of synapses, dendritic and axonal processes, neurons, and local networks of neurons. In this representation, hierarchies of quantum reference frames act as hierarchical active-inference systems. The resulting model enables specific predictions of correlations between synaptic activity, dendritic remodeling, and trophic reward. We summarize how the model may be generalized to nonneural cells and tissues in developmental and regenerative contexts.
1029.
颜林林
(2022-06-23 07:02):
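#paper doi:10.1186/s12859-022-04768-x BMC Bioinformatics, 2022, Using BERT to identify drug-target interactions from whole PubMed. This paper uses BERT, a natural language processing model, to analyze the entire PubMed and PMC databases in bulk, identifying drug and protein information in articles and extracting drug-target interaction (DTI) data, including important details such as the type of assay used. The method newly identified 0.6 million articles that are not included in public DTI databases. Its accuracy (about 99%) was confirmed by manual spot checks and cross-validation. Literature mining and curation of this kind of data usually depend on manual work, which severely limits efficiency. AI methods like this one will help drug discovery and repurposing and speed up drug development.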
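The abstract reports a micro-averaged F1 for assay-format prediction. As a rough illustration of how that metric is computed, the sketch below pools true positives, false positives and false negatives over all classes. The label names and predictions are made up for the example and are not taken from the paper.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1
            elif truth == label:
                fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical assay-format labels for five articles.
y_true = ["binding", "binding", "cell", "enzyme", "cell"]
y_pred = ["binding", "cell", "cell", "enzyme", "enzyme"]
score = micro_f1(y_true, y_pred)  # 3 correct out of 5
```

For single-label classification like this, micro-F1 coincides with plain accuracy, which is why the toy score equals 3/5.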
Abstract:
BACKGROUND: Drug-target interactions (DTIs) are critical for drug repurposing and elucidation of drug mechanisms, and are manually curated by large databases, such as ChEMBL, BindingDB, DrugBank and DrugTargetCommons. However, the number of curated articles likely constitutes only a fraction of all the articles that contain experimentally determined DTIs. Finding such articles and extracting the experimental information is a challenging task, and there is a pressing need for systematic approaches to assist the curation of DTIs. To this end, we applied Bidirectional Encoder Representations from Transformers (BERT) to identify such articles. Because DTI data intimately depends on the type of assays used to generate it, we also aimed to incorporate functions to predict the assay format. RESULTS: Our novel method identified 0.6 million articles (along with drug and protein information) which are not previously included in public DTI databases. Using 10-fold cross-validation, we obtained ~99% accuracy for identifying articles containing quantitative drug-target profiles. The F1 micro for the prediction of assay format is 88%, which leaves room for improvement in future studies. CONCLUSION: The BERT model in this study is robust and the proposed pipeline can be used to identify previously overlooked articles containing quantitative DTIs. Overall, our method provides a significant advancement in machine-assisted DTI extraction and curation. We expect it to be a useful addition to drug mechanism discovery and repurposing.
1030.
张德祥
(2022-06-22 20:03):
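#paper https://doi.org/10.1016/j.bbrc.2020.10.077 Life, death, and self: Fundamental questions of primitive cognition viewed through the lens of body plasticity and synthetic organisms How does a change of body form affect cognition? Think of a caterpillar turning into a butterfly, or a tadpole into a frog; as brain-computer interfaces develop, humans will gradually experience how changes of body form affect us too. This is a new interdisciplinary field at the intersection of cognitive science, regenerative biology, synthetic bioengineering and neuroscience. It unravels the plasticity of body and mind across a continuous life history. As artificial life advances, this field will grow further.
IF: 2.500, Q3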
Biochemical and biophysical research communications,
2021-07-30.
DOI: 10.1016/j.bbrc.2020.10.077
PMID: 33162026
Abstract:
Central to the study of cognition is being able to specify the Subject that is making decisions and owning memories and preferences. However, all real cognitive agents are made of parts (such as brains made of cells). The integration of many active subunits into a coherent Self appearing at a larger scale of organization is one of the fundamental questions of evolutionary cognitive science. Typical biological model systems, whether basal or advanced, have a static anatomical structure which obscures important aspects of the mind-body relationship. Recent advances in bioengineering now make it possible to assemble, disassemble, and recombine biological structures at the cell, organ, and whole organism levels. Regenerative biology and controlled chimerism reveal that studies of cognition in intact, "standard", evolved animal bodies are just a narrow slice of a much bigger and as-yet largely unexplored reality: the incredible plasticity of dynamic morphogenesis of biological forms that house and support diverse types of cognition. The ability to produce living organisms in novel configurations makes clear that traditional concepts, such as body, organism, genetic lineage, death, and memory are not as well-defined as commonly thought, and need considerable revision to account for the possible spectrum of living entities. Here, I review fascinating examples of experimental biology illustrating that the boundaries demarcating somatic and cognitive Selves are fluid, providing an opportunity to sharpen inquiries about how evolution exploits physical forces for multi-scale cognition. Developmental (pre-neural) bioelectricity contributes a novel perspective on how the dynamic control of growth and form of the body evolved into sophisticated cognitive capabilities. 
Most importantly, the development of functional biobots - synthetic living machines with behavioral capacity - provides a roadmap for greatly expanding our understanding of the origin and capacities of cognition in all of its possible material implementations, especially those that emerge de novo, with no lengthy evolutionary history of matching behavioral programs to bodyplan. Viewing fundamental questions through the lens of new, constructed living forms will have diverse impacts, not only in basic evolutionary biology and cognitive science, but also in regenerative medicine of the brain and in artificial intelligence.
1031.
张德祥
(2022-06-22 19:56):
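#paper https://doi.org/10.1093/nc/niab013 Minimal physicalism as a scale-free substrate for cognition and consciousness The paper proposes a scale-free characterization of consciousness and cognition, arguing that cognition and consciousness in higher organisms can be traced back to the cognitive responses of basal systems such as bacteria. It draws on advances in quantum biology and related fields. Minimal physicalism treats information and energy as formally equivalent; it is scale-free and applies at the molecular, cellular, organismal and other scales; it is built from information exchange, Markov chains and reference frames. The paper ends with 17 predictions; excerpts can be found at https://mp.weixin.qq.com/s/JoajaXP0plzmfmBHSuzYaQ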
Abstract:
Theories of consciousness and cognition that assume a neural substrate automatically regard phylogenetically basal, nonneural systems as nonconscious and noncognitive. Here, we advance a scale-free characterization of consciousness and cognition that regards basal systems, including synthetic constructs, as not only informative about the structure and function of experience in more complex systems but also as offering distinct advantages for experimental manipulation. Our "minimal physicalist" approach makes no assumptions beyond those of quantum information theory, and hence is applicable from the molecular scale upwards. We show that standard concepts including integrated information, state broadcasting via small-world networks, and hierarchical Bayesian inference emerge naturally in this setting, and that common phenomena including stigmergic memory, perceptual coarse-graining, and attention switching follow directly from the thermodynamic requirements of classical computation. We show that the self-representation that lies at the heart of human autonoetic awareness can be traced as far back as, and serves the same basic functions as, the stress response in bacteria and other basal systems.
1032.
张德祥
(2022-06-22 19:35):
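#paper https://doi.org/10.1016/j.pbiomolbio.2022.05.006 A free energy principle for generic quantum systems With free energy as a scale-free conceptual framework, this paper extends the scope of the FEP to generic quantum systems and suggests that the domain of quantum biology is much larger than currently understood. It predicts that molecular-level dynamics implement quantum information processing. I lack the background for the technical details and personally find it quite difficult; friends are welcome to study it together.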
Progress in biophysics and molecular biology,
2022-09.
DOI: 10.1016/j.pbiomolbio.2022.05.006
PMID: 35618044
Abstract:
The Free Energy Principle (FEP) states that under suitable conditions of weak coupling, random dynamical systems with sufficient degrees of freedom will behave so as to minimize an upper bound, formalized as a variational free energy, on surprisal (a.k.a., self-information). This upper bound can be read as a Bayesian prediction error. Equivalently, its negative is a lower bound on Bayesian model evidence (a.k.a., marginal likelihood). In short, certain random dynamical systems evince a kind of self-evidencing. Here, we reformulate the FEP in the formal setting of spacetime-background free, scale-free quantum information theory. We show how generic quantum systems can be regarded as observers, which with the standard freedom of choice assumption become agents capable of assigning semantics to observational outcomes. We show how such agents minimize Bayesian prediction error in environments characterized by uncertainty, insufficient learning, and quantum contextuality. We show that in its quantum-theoretic formulation, the FEP is asymptotically equivalent to the Principle of Unitarity. Based on these results, we suggest that biological systems employ quantum coherence as a computational resource and - implicitly - as a communication resource. We summarize a number of problems for future research, particularly involving the resources required for classical communication and for detecting and responding to quantum context switches.
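The statement that variational free energy is an upper bound on surprisal can be checked numerically in the ordinary classical, discrete setting. The sketch below does only that; it is not the paper's quantum formulation, and the joint probabilities are arbitrary toy values chosen for illustration.

```python
import math
from typing import Sequence


def variational_free_energy(q: Sequence[float], joint: Sequence[float]) -> float:
    """F = sum_s q(s) * (ln q(s) - ln p(o, s)) for one fixed observation o."""
    return sum(
        qs * (math.log(qs) - math.log(pj))
        for qs, pj in zip(q, joint)
        if qs > 0
    )


# Toy joint probabilities p(o, s) over three hidden states s, with o fixed.
joint = [0.3, 0.1, 0.1]
evidence = sum(joint)                        # p(o), the marginal likelihood
surprisal = -math.log(evidence)              # -ln p(o)
posterior = [p / evidence for p in joint]    # exact p(s | o)

uniform_guess = [1 / 3, 1 / 3, 1 / 3]
f_guess = variational_free_energy(uniform_guess, joint)
f_posterior = variational_free_energy(posterior, joint)
```

Any belief `q` gives a free energy at or above the surprisal, and the bound becomes tight when `q` equals the exact posterior, since the gap is the KL divergence between the two.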
1033.
张德祥
(2022-06-22 19:20):
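#paper DOI https://doi.org/10.1007/s11229-022-03480-w Active inference models do not contradict folk psychology As a broad framework, can active inference be reconciled with the high-level concepts of traditional psychology? This paper compares the two and concludes that the active inference framework is compatible with the high-level concepts of folk psychology: belief, desire and intention. Although the mathematics of active inference does not define these concepts directly, the terms in its equations can be mapped onto these high-level psychological concepts.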
Synthese,
2022.
DOI: 10.1007/s11229-022-03480-w
Abstract:
Active inference offers a unified theory of perception, learning, and decision-making at computational and neural levels of description. In this article, we address the worry that active inference may be in tension with the belief–desire–intention (BDI) model within folk psychology because it does not include terms for desires (or other conative constructs) at the mathematical level of description. To resolve this concern, we first provide a brief review of the historical progression from predictive coding to active inference, enabling us to distinguish between active inference formulations of motor control (which need not have desires under folk psychology) and active inference formulations of decision processes (which do have desires within folk psychology). We then show that, despite a superficial tension when viewed at the mathematical level of description, the active inference formalism contains terms that are readily identifiable as encoding both the objects of desire and the strength of desire at the psychological level of description. We demonstrate this with simple simulations of an active inference agent motivated to leave a dark room for different reasons. Despite their consistency, we further show how active inference may increase the granularity of folk-psychological descriptions by highlighting distinctions between drives to seek information versus reward, and how it may also offer more precise, quantitative folk-psychological predictions. Finally, we consider how the implicitly conative components of active inference may have partial analogues (i.e., “as if” desires) in other systems describable by the broader free energy principle to which it conforms.
1034.
颜林林
(2022-06-22 00:43):
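#paper doi:10.1038/s41591-022-01768-5 Nature Medicine, 2022, Swarm learning for decentralized artificial intelligence in cancer histopathology. I recently read about swarm learning in a Nature paper (doi:10.1038/s41586-021-03583-3), which described a way to share clinical data without violating privacy regulations, helping precision medicine research on diseases with widespread heterogeneity. This paper applies swarm learning to tumor pathology image analysis. Pathology image analysis is a typical field that depends on large, high-quality datasets, and swarm learning lets partner institutions train an AI model together while avoiding data transfer and data monopoly. The models were trained on three colorectal cancer patient cohorts from Northern Ireland, Germany and the United States. They analyze patients' H&E-stained slides to predict driver gene mutations (BRAF in the abstract), dMMR and microsatellite instability (MSI) status, among others, and their performance was validated on two independent cohorts from the United Kingdom. The three data nodes (research centers) that train the model do not exchange raw data directly; instead, at each iteration step, model parameters are synchronized through decentralized blockchain technology. As a result, the nodes are peers, and there is no special central node that has to aggregate the others. This mode makes it possible to extend the collaboration to more institutions over a wider range, and should bring further progress in pathology image analysis models.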
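The peer-to-peer parameter synchronization described above can be illustrated with a deliberately tiny sketch. It assumes a one-parameter linear model, made-up data per site and plain parameter averaging among peers; the blockchain-based coordination of real swarm learning is omitted, and the names `local_step` and `sync` are hypothetical.

```python
from typing import List, Tuple

Data = List[Tuple[float, float]]  # (x, y) pairs held privately by one site


def local_step(w: float, data: Data, lr: float = 0.05) -> float:
    """One gradient-descent step for y = w * x under mean squared error."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def sync(weights: List[float]) -> List[float]:
    """Every peer replaces its weight with the mean of all peers' weights.

    Only parameters are exchanged; no site sees another site's raw data.
    """
    mean = sum(weights) / len(weights)
    return [mean for _ in weights]


# Three sites with private toy data drawn from y = 2x.
sites: List[Data] = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0)],
    [(3.0, 6.0)],
]
weights = [0.0, 0.0, 0.0]
for _ in range(200):
    weights = [local_step(w, d) for w, d in zip(weights, sites)]
    weights = sync(weights)
```

After synchronization every peer holds the same weight, and here all of them converge to the shared slope of 2 even though no site ever sees the others' data.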
IF: 58.700, Q1
Nature medicine,
2022-06.
DOI: 10.1038/s41591-022-01768-5
PMID: 35469069
PMCID:PMC9205774
Abstract:
Artificial intelligence (AI) can predict the presence of molecular alterations directly from routine histopathology slides. However, training robust AI systems requires large datasets for which data collection faces practical, ethical and legal obstacles. These obstacles could be overcome with swarm learning (SL), in which partners jointly train AI models while avoiding data transfer and monopolistic data governance. Here, we demonstrate the successful use of SL in large, multicentric datasets of gigapixel histopathology images from over 5,000 patients. We show that AI models trained using SL can predict BRAF mutational status and microsatellite instability directly from hematoxylin and eosin (H&E)-stained pathology slides of colorectal cancer. We trained AI models on three patient cohorts from Northern Ireland, Germany and the United States, and validated the prediction performance in two independent datasets from the United Kingdom. Our data show that SL-trained AI models outperform most locally trained models, and perform on par with models that are trained on the merged datasets. In addition, we show that SL-based AI models are data efficient. In the future, SL can be used to train distributed AI models for any histopathology image analysis task, eliminating the need for data transfer.
1035.
颜林林
(2022-06-21 00:03):
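#paper doi:10.1016/j.jmoldx.2022.05.003 The Journal of Molecular Diagnostics, 2022, Comprehensive Validation of Diagnostic Next-Generation Sequencing Panels for Acute Myeloid Leukemia (AML) Patients. This paper from Switzerland and Germany is about validating gene panels for hematological cancers. Cancer is commonly regarded as a genetic disease, that is, a disease caused by mutations in genetic material. Diagnosis and treatment decisions therefore require testing of specific genes. In clinical practice this can be done by using a panel to enrich specific DNA fragments for sequencing, which is also the basic model of current commercial cancer gene testing services. Before such a test can reach the market it must be thoroughly validated, and this paper is an example of that validation process. The panels validated here are for diagnosing AML (acute myeloid leukemia). The validation included 33 DNA samples (bone marrow or peripheral blood) from 26 AML patients, with the Acrometrix Oncology Hotspot Control DNA as a control. AML-related mutations carried by these samples were detected and the performance evaluated, and mutations in the clinical samples were also confirmed by qPCR, Sanger sequencing and other methods. From four different panels and several analysis tools, the evaluation selected the panel and software combination with the best performance for hematological diseases.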
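Precision and recall, two of the metrics named in the abstract, reduce to simple set arithmetic once the expected variants of a control sample are known. The sketch below assumes variants are compared as exact (contig, position, ref, alt) tuples; every variant listed is a placeholder, not data from the study.

```python
from typing import Set, Tuple

Variant = Tuple[str, int, str, str]  # (contig, position, ref allele, alt allele)


def precision_recall(called: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Precision = TP / calls made; recall = TP / expected variants."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Placeholder variants expected in a control sample (hypothetical).
truth: Set[Variant] = {
    ("ctgA", 100, "A", "G"),
    ("ctgA", 250, "C", "T"),
    ("ctgB", 40, "G", "A"),
    ("ctgB", 90, "T", "C"),
    ("ctgC", 15, "A", "T"),
}
# Placeholder calls from one panel/software combination (hypothetical).
called: Set[Variant] = {
    ("ctgA", 100, "A", "G"),
    ("ctgA", 250, "C", "T"),
    ("ctgB", 40, "G", "A"),
    ("ctgC", 77, "G", "C"),  # a false positive
}
precision, recall = precision_recall(called, truth)
```

Running the same comparison for each panel and software combination gives a like-for-like basis for ranking them.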
The Journal of molecular diagnostics : JMD,
2022-08.
DOI: 10.1016/j.jmoldx.2022.05.003
PMID: 35718092
Abstract:
Next-generation sequencing has greatly advanced the molecular diagnostics of malignant hematological diseases and provides useful information for clinical decision making. Studies have shown that certain mutations are associated with prognosis and have a direct impact on treatment of affected patients. Therefore, reliable detection of pathogenic variants is critically important. Here, we compared four sequencing panels with different characteristics, from number of genes covered to technical aspects of library preparation and data analysis workflows, to find the panel with the best clinical utility for myeloid neoplasms with a special focus on acute myeloid leukemia. Using the Acrometrix Oncology Hotspot Control DNA and DNA from acute myeloid leukemia patients, panel performance was evaluated in terms of coverage, precision, recall, and reproducibility and different bioinformatics tools that can be used for the evaluation of any next-generation sequencing panel were tested. Taken together, our results support the reliability of the Acrometrix Oncology Hotspot Control to validate and compare sequencing panels for hematological diseases and show which panel-software combination (platform) has the best performance.
1036.
张浩彬
(2022-06-20 08:27):
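#paper doi:10.1145/3219819.3219822, Deep Distributed Fusion Network for Air Quality Prediction
A KDD 2018 paper. Whether judged from today's perspective or from that of the time, the network the authors built is not complicated. It has two main components: a spatial transformation component and a deep distributed fusion component.
1. A spatial transformation component converts the spatially sparse air quality data into a consistent input that simulates secondary pollutant sources. With signals from spatial neighbors (grouped by distance, near or far, and by compass direction, 16 groups in total), DeepAir performs better in both general and sudden-change cases.
2. Because direct and indirect factors affect air quality differently, a sub-network is built for each indirect factor paired with the direct factors, plus one overall sub-network, and the outputs are fused at the end.
3. Although the network structure is not complicated, it stays close to the application: the network was designed on the basis of domain knowledge. On three years of data from 9 Chinese cities, the results show that DeepAir has an advantage over 10 baselines. Compared with the previous online approach in the AirPollutionPrediction system, relative accuracy improved by 2.4%, 12.2% and 63.2% for short-term, long-term and sudden-change prediction, respectively.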
Abstract:
Accompanying the rapid urbanization, many developing countries are suffering from serious air pollution problem. The demand for predicting future air quality is becoming increasingly more important to government's policy-making and people's decision making. In this paper, we predict the air quality of next 48 hours for each monitoring station, considering air quality data, meteorology data, and weather forecast data. Based on the domain knowledge about air pollution, we propose a deep neural network (DNN)-based approach (entitled DeepAir), which consists of a spatial transformation component and a deep distributed fusion network. Considering air pollutants' spatial correlations, the former component converts the spatial sparse air quality data into a consistent input to simulate the pollutant sources. The latter network adopts a neural distributed architecture to fuse heterogeneous urban data for simultaneously capturing the factors affecting air quality, e.g. meteorological conditions. We deployed DeepAir in our AirPollutionPrediction system, providing fine-grained air quality forecasts for 300+ Chinese cities every hour. The experimental results on the data from three-year nine Chinese-city demonstrate the advantages of DeepAir beyond 10 baseline methods. Comparing with the previous online approach in AirPollutionPrediction system, we have 2.4%, 12.2%, 63.2% relative accuracy improvements on short-term, long-term and sudden changes prediction, respectively.
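The spatial transformation step turns a scattered set of neighboring stations into a fixed-length input. A minimal sketch of one way to do this follows, assuming the 16 regions are 2 distance rings times 8 compass sectors of 45 degrees each, and that readings in a region are simply averaged. The functions `region_index` and `aggregate`, the 50 km ring boundary, and the sample readings are all hypothetical; the paper's actual partition and aggregation may differ.

```python
import math
from collections import defaultdict


def region_index(dx: float, dy: float, near_radius: float) -> int:
    """Map an offset (dx east, dy north) to one of 16 regions.

    Regions 0-7 form the near ring and 8-15 the far ring. Within a ring,
    sector 0 is centred on east and sectors increase counter-clockwise.
    """
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    sector = int(((angle + 22.5) % 360.0) // 45.0)
    ring = 0 if math.hypot(dx, dy) <= near_radius else 1
    return ring * 8 + sector


def aggregate(target, neighbours, near_radius=50.0):
    """Average neighbour readings per region; regions with no station stay None."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    tx, ty = target
    for (x, y), value in neighbours:
        idx = region_index(x - tx, y - ty, near_radius)
        sums[idx] += value
        counts[idx] += 1
    return [sums[i] / counts[i] if counts[i] else None for i in range(16)]


# Hypothetical stations: ((x_km, y_km), pollutant reading).
neighbours = [
    ((10.0, 0.0), 80.0),    # east, near ring
    ((0.0, 30.0), 60.0),    # north, near ring
    ((0.0, 40.0), 100.0),   # north, near ring
    ((-100.0, 0.0), 40.0),  # west, far ring
]
features = aggregate((0.0, 0.0), neighbours)
print(features)
```

The point of the design is that every target station gets a vector of the same length regardless of how many neighbors it has or where they sit. Empty regions are left as `None` here; a real model would need some fill strategy for them.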
1037.
颜林林
(2022-06-20 07:48):
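#paper doi:10.1016/j.gpb.2022.03.002 Genomics, Proteomics & Bioinformatics, 2022, Cancer is a survival process under persistent microenvironmental and cellular stresses. This review concerns the mechanisms of cancer initiation and progression and proposes a new framework. In contrast to the traditional mutation-centered understanding, the key point of the new view is that the continuous division of cancer cells is something they "must" do to survive, rather than merely a passive result directed by mutations in the genetic material. The article develops this view from several angles: changes in metabolic patterns, cytosolic pH, chronic inflammation, excess iron accumulation, and the Fenton reaction. The view also offers new explanations for why the incidence of some cancer types declines with age and why some species rarely or almost never develop cancer.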
Genomics, proteomics & bioinformatics,
2023-12.
DOI: 10.1016/j.gpb.2022.03.002
PMID: 35728722
PMCID:PMC11082257
Abstract:
No abstract available.
1038.
颜林林
(2022-06-19 00:14):
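#paper doi:10.1186/s13073-022-01069-z Genome Medicine, 2022, Reanalysis of exome negative patients with rare disease: a pragmatic workflow for diagnostic applications. Over the past years, whole-exome sequencing has often been performed on patients with rare genetic diseases to identify the causal gene and reach a diagnosis. However, limited by technology and accumulated knowledge, most patients still cannot be diagnosed after sequencing. This paper from Radboud University in the Netherlands revisited 150 pediatric patients with suspected complex neurological genetic disorders who visited its medical center between November 2011 and January 2015. The 103 patients who had remained undiagnosed were followed up: their phenotype information was reviewed again, their whole-exome sequencing data were reanalyzed, and those still undiagnosed were resequenced and analyzed (using a new laboratory workflow and exome panel). These steps yielded diagnoses for 32 previously undiagnosed patients, raising the diagnostic rate from 31% (47/150) to 53% (79/150). The results support reanalysis and systematic re-evaluation of undiagnosed patients during clinical care and follow-up: new clinical evidence, new technologies, and new analysis methods all help improve diagnosis and treatment and benefit patients.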
Abstract:
BACKGROUND: Approximately two third of patients with a rare genetic disease remain undiagnosed after exome sequencing (ES). As part of our post-test counseling procedures, patients without a conclusive diagnosis are advised to recontact their referring clinician to discuss new diagnostic opportunities in due time. We performed a systematic study of genetically undiagnosed patients 5 years after their initial negative ES report to determine the efficiency of diverse reanalysis strategies. METHODS: We revisited a cohort of 150 pediatric neurology patients originally enrolled at Radboud University Medical Center, of whom 103 initially remained genetically undiagnosed. We monitored uptake of physician-initiated routine clinical and/or genetic re-evaluation (ad hoc re-evaluation) and performed systematic reanalysis, including ES-based resequencing, of all genetically undiagnosed patients (systematic re-evaluation). RESULTS: Ad hoc re-evaluation was initiated for 45 of 103 patients and yielded 18 diagnoses (including 1 non-genetic). Subsequent systematic re-evaluation identified another 14 diagnoses, increasing the diagnostic yield in our cohort from 31% (47/150) to 53% (79/150). New genetic diagnoses were established by reclassification of previously identified variants (10%, 3/31), reanalysis with enhanced bioinformatic pipelines (19%, 6/31), improved coverage after resequencing (29%, 9/31), and new disease-gene associations (42%, 13/31). Crucially, our systematic study also showed that 11 of the 14 further conclusive genetic diagnoses were made in patients without a genetic diagnosis that did not recontact their referring clinician. CONCLUSIONS: We find that upon re-evaluation of undiagnosed patients, both reanalysis of existing ES data as well as resequencing strategies are needed to identify additional genetic diagnoses. Importantly, not all patients are routinely re-evaluated in clinical care, prolonging their diagnostic trajectory, unless systematic reanalysis is facilitated. We have translated our observations into considerations for systematic and ad hoc reanalysis in routine genetic care.
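The counts reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative.

```python
cohort = 150
initially_diagnosed = 47
ad_hoc = 18       # diagnoses from ad hoc re-evaluation, including 1 non-genetic
systematic = 14   # further diagnoses from systematic re-evaluation

new_diagnoses = ad_hoc + systematic
total_diagnosed = initially_diagnosed + new_diagnoses

yield_before = initially_diagnosed / cohort
yield_after = total_diagnosed / cohort

# Breakdown of the new genetic diagnoses by how they were established.
genetic_by_route = {
    "variant reclassification": 3,
    "enhanced bioinformatic pipelines": 6,
    "improved coverage after resequencing": 9,
    "new disease-gene associations": 13,
}
new_genetic = sum(genetic_by_route.values())

print(f"diagnostic yield: {yield_before:.0%} -> {yield_after:.0%}")
for route, n in genetic_by_route.items():
    print(f"{route}: {n}/{new_genetic} = {n / new_genetic:.0%}")
```

The figures are mutually consistent: 18 + 14 = 32 new diagnoses, 47 + 32 = 79 in total, and the four routes sum to 31, which is the 32 new diagnoses minus the single non-genetic one.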
1039.
颜林林
(2022-06-18 14:39):
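#paper doi:10.1021/acssynbio.2c00120 ACS Synthetic Biology, 2022, Graph Computation Using Algorithmic Self-Assembly of DNA Molecules. Computing with biomolecules such as DNA dates back to the early 1990s; the field has advanced steadily since, and this paper is one such case. It takes a different route, using a technique called DNA origami (carefully designing DNA sequences so that they fold into a specific shape) to solve a graph-theoretic computing problem: three-coloring a graph with six vertices. A macroscopically tiny amount of biological material actually contains an enormous number of molecules, so computing with these molecules naturally provides massive parallelism and makes brute-force enumeration of large numbers of combinations easy; this is one of the basic motivations behind biological computing. Although it is called "DNA computing", it is still far from the general-purpose electronic computers we use every day. For a specific graph problem, the study manually enumerates all possible colors of each vertex to be solved and uses the complementarity of DNA strands to design sequences that encode the rules for which vertices may connect to each other. Large numbers of such molecules are then synthesized and allowed to assemble freely under specific experimental conditions, and finally AFM (atomic force microscopy) scanning is used to find the assemblies that match a specific structural shape. Because DNA origami is used, AFM can directly observe and identify each vertex's "color" and the connections between vertices, which gives the solution. The problem solved is restricted to a specific scope and is only a proof of concept; extending it to more application scenarios and making it general-purpose, even to a limited degree, is still a long way off.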
Abstract:
DNA molecules have been used as novel computing tools, by which Synthetic DNA was designed to execute computing processes with a programmable sequence. Here, we proposed a parallel computing method using DNA origamis as agents to solve the three-color problem, an example of the graph problem. Each agent was fabricated with a DNA origami of ∼50 nm diameter and contained DNA probes with programmable sticky ends that execute preset computing processes. With the interaction of different nanoagents, DNA molecules self-assemble into spatial nanostructures, which embody the computation results of the three-color problem with polynomial numbers of computing nanoagents in a one-pot annealing step. The computing results were confirmed by atomic force microscopy. Our method is completely different from existing DNA computing methods in its computing algorithm, and it has an advantage in terms of computational complexity and results detection for solving graph problems.
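For comparison with the molecular approach, the same kind of problem can be solved on a conventional computer by brute-force enumeration: a six-vertex graph with three colors has only 3^6 = 729 candidate colorings. The sketch below is an electronic analogue, not the DNA method. The edge list is a hypothetical example (a triangular prism), since the graph used in the paper is not given in the text above.

```python
from itertools import product

COLORS = ("red", "green", "blue")

# Hypothetical 6-vertex graph: two triangles (0-1-2 and 3-4-5) joined by
# three edges. This is an illustrative choice, not the paper's graph.
EDGES = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (0, 3), (1, 4), (2, 5)]
N_VERTICES = 6


def is_proper(coloring, edges):
    """A coloring is proper when no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[v] for u, v in edges)


def three_colorings(n_vertices, edges):
    """Enumerate every color assignment and keep the proper ones."""
    return [c for c in product(COLORS, repeat=n_vertices) if is_proper(c, edges)]


solutions = three_colorings(N_VERTICES, EDGES)
print(len(solutions), "proper colorings out of", len(COLORS) ** N_VERTICES)
print("one solution:", solutions[0])
```

In the DNA version, the enumeration loop is replaced by many molecules assembling in parallel, and the `is_proper` check is replaced by sticky-end complementarity that only lets compatible vertices bind.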
1040.
颜林林
(2022-06-17 22:10):
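#paper doi:10.1101/2022.06.12.495839 bioRxiv, 2022, Accurate Estimation of Molecular Counts from Amplicon Sequence Data with Unique Molecular Identifiers. High-throughput sequencing data are full of errors introduced by PCR amplification and by sequencing. To address this, unique molecular identifiers (UMIs) are commonly used: a random sequence marks which reads come from the same original template molecule and which do not. Many tools handle UMIs crudely by simply merging reads that share the same UMI; because UMI sequences themselves also acquire errors, the reconstruction of the original template molecules in the sample can go wrong. This is especially pronounced in amplicon sequencing (amplicon-seq). This paper builds a one-step hidden Markov model (one-step HMM) to handle errors from PCR and sequencing, and implements an EM algorithm in C to estimate the true number of original template molecules from UMI sequencing data. Evaluated on both simulated and real data, the tool developed here (DAUMI) effectively identifies UMI collisions and outperforms other similar existing tools.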
bioRxiv,
2022.
DOI: 10.1101/2022.06.12.495839
Abstract:
Motivation: Amplicon sequencing is widely applied to explore heterogeneity and rare variants in genetic populations. Resolving true biological variants and quantifying their abundance is crucial for downstream analyses, but measured abundances are distorted by stochasticity and bias in amplification, plus errors during Polymerase Chain Reaction (PCR) and sequencing. One solution attaches Unique Molecular Identifiers (UMIs) to sample sequences before amplification eliminating amplification bias by clustering reads on UMI and counting clusters to quantify abundance. While modern methods improve over naive clustering by UMI identity, most do not account for UMI reuse, or collision, and they do not adequately model PCR and sequencing errors in the UMIs and sample sequences. Results: We introduce Deduplication and accurate Abundance estimation with UMIs (DAUMI), a probabilistic framework to detect true biological sequences and accurately estimate their deduplicated abundance from amplicon sequence data. DAUMI recognizes UMI collision, even on highly similar sequences, and detects and corrects most PCR and sequencing errors in the UMI and sampled sequences. We demonstrate DAUMI performs better on simulated and real data compared to other UMI-aware clustering methods. Availability: Source code is available at https://github.com/xiyupeng/AmpliCI-UMI.
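The two failure modes of naive UMI counting, errors in the UMI and UMI collision, can be seen on a toy example. The sketch below is not DAUMI's probabilistic model; it only counts distinct UMIs the naive way and flags the reads that would mislead that count. The read data, the equal-length UMI assumption, and the Hamming-distance-1 heuristic are all illustrative choices.

```python
from collections import Counter, defaultdict

# Hypothetical reads as (UMI, sequence) pairs.
reads = [
    ("AACG", "ACGTAC"),
    ("AACG", "ACGTAC"),
    ("AACG", "ACGTAC"),
    ("AACT", "ACGTAC"),  # possibly AACG with one PCR or sequencing error
    ("GGTA", "ACGTAC"),
    ("GGTA", "TTGCAA"),  # same UMI, different sequence: possible collision
    ("GGTA", "TTGCAA"),
]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions; assumes equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


umi_reads = Counter(umi for umi, _ in reads)
seqs_per_umi = defaultdict(set)
for umi, seq in reads:
    seqs_per_umi[umi].add(seq)

# Naive estimate: every distinct UMI is one original molecule.
naive_count = len(umi_reads)

# UMIs attached to more than one distinct sequence may be collisions.
collision_candidates = sorted(u for u, s in seqs_per_umi.items() if len(s) > 1)

# Low-count UMIs one mismatch away from a higher-count UMI may be errors.
error_candidates = sorted(
    (low, high)
    for low in umi_reads
    for high in umi_reads
    if umi_reads[low] < umi_reads[high] and hamming(low, high) == 1
)

print("naive molecule count:", naive_count)
print("collision candidates:", collision_candidates)
print("error candidates:", error_candidates)
```

Here the naive count treats `AACT` as its own molecule and treats both `GGTA` sequences as one molecule, so it can be wrong in both directions. A heuristic like this cannot tell a collision from an error in the sample sequence, which is the kind of ambiguity a probabilistic model is meant to resolve.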