Found 1086 paper shares in total; this page shows entries 1021 - 1040.
1021.
白云飞 (2022-03-31 16:24):
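#paper 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA. Paper: https://proceedings.neurips.cc/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf Project docs: https://lightgbm.readthedocs.io/en/latest/Features.html

Before LightGBM was proposed, the best-known GBDT tool was XGBoost, a decision-tree learner based on the pre-sorted method. The basic idea of that method is: first, pre-sort all features by feature value; second, when scanning candidate split points, find the best split on a feature at O(#data) cost; finally, once the best split on a feature is found, split the data into left and right child nodes.

The advantage of the pre-sorted algorithm is that it finds split points exactly. Its drawbacks are just as clear. First, it uses a lot of memory: it must store the feature values and also the sorting results (for example, the sorted indices kept so that split points can be evaluated quickly later), which takes roughly twice the memory of the training data. Second, it is slow: the split gain has to be computed at every candidate split point. Third, it is cache-unfriendly: after pre-sorting, each feature accesses the gradients in a random order, and that order differs from feature to feature, so the cache cannot be exploited. In addition, growing each tree level requires random access to an array that maps row indices to leaf indices, again in a different order for each feature, which causes many cache misses.

To avoid these XGBoost weaknesses and to speed up GBDT training without hurting accuracy, LightGBM makes the following optimizations to the conventional GBDT algorithm:
Histogram-based decision tree algorithm.
Gradient-based One-Side Sampling (GOSS): GOSS drops a large share of the data instances that have only small gradients, so the information gain is estimated from the remaining instances only, which saves considerable time and memory compared with XGBoost's scan over all feature values.
Exclusive Feature Bundling (EFB): EFB bundles many mutually exclusive features into a single feature, which reduces the dimensionality.
Leaf-wise tree growth with a depth limit: most GBDT tools grow trees level-wise, which is inefficient because every leaf on a level is treated alike, even though many leaves have low split gain and need not be searched or split. LightGBM instead grows trees leaf-wise with a limit on depth.
Direct support for categorical features.
Efficient parallel training.
Cache hit-rate optimization.

Two of these are the algorithms that accelerate GBDT training: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). Without noticeably affecting accuracy, they reduce the amount of data and the number of features needed for training, respectively.

GOSS: before each iteration, it uses the relationship between a sample's gradient and its error to sample the training set. Instances with large error (large absolute gradient) are all kept; from the instances with small error a subset is sampled, and that subset is given a weight so that it approximates the full set of small-error instances. The sampled data loses none of the large-error instances and, while shrinking the training set, leaves the data distribution largely unchanged, so training gets faster with almost no loss of accuracy.

EFB: in data with very many features, the feature space is usually sparse. This makes it possible to reduce, almost without loss, the number of features GBDT has to scan, or more precisely the number scanned when building feature histograms (the main time cost of GBDT training). In a sparse feature space many features are exclusive, meaning that within one sample at most one feature of the group is nonzero. Each group of exclusive features can be merged into one "big feature". When building histograms, scanning one big feature yields the histograms of the whole group, so scanning only the big features recovers the histograms of all features. What remains is the problem of grouping the exclusive features, which is NP-hard; it can be reduced to graph coloring and solved approximately with a greedy method.

It is worth noting that XGBoost has also implemented the histogram algorithm, which is much faster than its original pre-sorted algorithm. Compared with LightGBM, though, it is still somewhat slower and still uses more memory.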
Abstract:
Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm, and has quite a few effective implementations such as XGBoost and pGBRT. Although many engineering optimizations have been adopted in these implementations, the efficiency and scalability are still unsatisfactory when the feature dimension is high and data size is large. A major reason is that for each feature, they need to scan all the data instances to estimate the information gain of all possible split points, which is very time consuming. To tackle this problem, we propose two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). With GOSS, we exclude a significant proportion of data instances with small gradients, and only use the rest to estimate the information gain. We prove that, since the data instances with larger gradients play a more important role in the computation of information gain, GOSS can obtain quite accurate estimation of the information gain with a much smaller data size. With EFB, we bundle mutually exclusive features (i.e., they rarely take nonzero values simultaneously), to reduce the number of features. We prove that finding the optimal bundling of exclusive features is NP-hard, but a greedy algorithm can achieve quite good approximation ratio (and thus can effectively reduce the number of features without hurting the accuracy of split point determination by much). We call our new GBDT implementation with GOSS and EFB LightGBM. Our experiments on multiple public datasets show that, LightGBM speeds up the training process of conventional GBDT by up to over 20 times while achieving almost the same accuracy.
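The GOSS step described in this entry can be sketched in a few lines. This is a minimal illustration, not LightGBM's implementation: the function name `goss_sample` and the parameters `top_rate`, `other_rate`, and `seed` are chosen here for the example, and the re-weighting factor (1 - a) / b for the sampled small-gradient instances follows the description in the paper.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) picked by a GOSS-style sampling step.

    Hypothetical helper for illustration only.
    """
    n = len(gradients)
    top_k = int(n * top_rate)      # number of large-gradient instances kept
    rand_k = int(n * other_rate)   # number of small-gradient instances sampled

    # Sort instance indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_idx = order[:top_k]
    rest = order[top_k:]

    # Uniformly sample from the remaining small-gradient instances.
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(rand_k, len(rest)))

    # Up-weight the sampled instances so they stand in for all small-gradient data.
    amplify = (1.0 - top_rate) / other_rate
    indices = top_idx + sampled
    weights = [1.0] * len(top_idx) + [amplify] * len(sampled)
    return indices, weights
```

With 100 instances and the default rates, 30 instances are used per iteration, and their weights sum to about 100, so gradient sums computed on the sample stay on the same scale as sums over the full data.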
1022.
小擎子 (2022-03-31 15:29):
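#paper doi: 10.1126/science.aah5043 Science, 2017, Potential role of intratumor bacteria in mediating tumor resistance to the chemotherapeutic drug gemcitabine. Earlier work had found that nucleoside-catabolizing enzymes in mycoplasma-infected tumor cell cultures impair the cytostatic activity of the anticancer drug gemcitabine. This paper explores the mechanism further. Antibiotic treatment of human dermal fibroblasts (HDF) abolished the gemcitabine-metabolizing activity, and reinfecting the same HDFs with M. hyorhinis (Mycoplasma hyorhinis) restored the ability of the cell-conditioned medium to metabolize gemcitabine. To find out whether bacteria other than mycoplasma can confer gemcitabine resistance, the authors extended the analysis to 27 bacterial species, 13 of which abolished the effect of gemcitabine on RKO human colorectal cancer cells. Experiments identified CDD (cytidine deaminase) as the key bacterial gene for gemcitabine resistance, and the CDD isoform determines how well a bacterium metabolizes gemcitabine: the long isoform (CDD L) confers marked resistance, while the short isoform (CDD S) metabolizes gemcitabine only partially. The bacteria with the potential for CDD L-mediated gemcitabine resistance belong mainly to Gammaproteobacteria. Gemcitabine is commonly used to treat pancreatic ductal adenocarcinoma (PDAC), so the authors hypothesized that intratumor bacteria might contribute to drug resistance in these tumors, and they sampled and tested PDACs: of the 113 human PDACs tested, 86 (76%) were positive for bacteria, mainly Gammaproteobacteria.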
Abstract:
Growing evidence suggests that microbes can influence the efficacy of cancer therapies. By studying colon cancer models, we found that bacteria can metabolize the chemotherapeutic drug gemcitabine (2',2'-difluorodeoxycytidine) into its inactive form, 2',2'-difluorodeoxyuridine. Metabolism was dependent on the expression of a long isoform of the bacterial enzyme cytidine deaminase (CDD), seen primarily in Gammaproteobacteria. In a colon cancer mouse model, gemcitabine resistance was induced by intratumor Gammaproteobacteria, dependent on bacterial CDD expression, and abrogated by cotreatment with the antibiotic ciprofloxacin. Gemcitabine is commonly used to treat pancreatic ductal adenocarcinoma (PDAC), and we hypothesized that intratumor bacteria might contribute to drug resistance of these tumors. Consistent with this possibility, we found that of the 113 human PDACs that were tested, 86 (76%) were positive for bacteria, mainly Gammaproteobacteria.
1023.
cellsarts (2022-03-31 14:25):
#paper https://doi.org/10.1016/j.csbj.2021.03.019 Title: Computational prediction of secreted proteins in gram-negative bacteria. Gram-negative bacteria run several protein secretion systems and secrete a large share of their proteome; proteins can be exported to the periplasmic space, integrated into the membrane, transported into the extracellular environment, or translocated into the cytoplasm of contacting cells. Accurate prediction and classification of these secreted proteins matters both for interpreting bacterial genomes and for studying the molecular mechanisms behind important phenotypes such as virulence and drug resistance. The review classifies the secreted proteins of Gram-negative bacteria by secretion system type, summarizes their known features, and surveys the algorithms and bioinformatics tools for predicting them. Its discussion makes several points. First, there is often a gap between computational scientists and experimental biologists: although developers report high accuracy for their software, non-homology-based effector predictors (especially for T3SEs, T4SEs and T6SEs) have rarely been used successfully by wet-lab researchers to identify new effectors. More enthusiasm goes into new algorithms than into biological aspects such as new features. Most effector prediction tools are generic and ignore specific biological prior information such as species, secretion system subtype, and regulatory-pathway specificity.
Abstract:
Gram-negative bacteria harness multiple protein secretion systems and secrete a large proportion of the proteome. Proteins can be exported to periplasmic space, integrated into membrane, transported into extracellular milieu, or translocated into cytoplasm of contacting cells. It is important for accurate, genome-wide annotation of the secreted proteins and their secretion pathways. In this review, we systematically classified the secreted proteins according to the types of secretion systems in Gram-negative bacteria, summarized the known features of these proteins, and reviewed the algorithms and tools for their prediction.
1024.
龙海晨 (2022-03-31 12:25):
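#paper doi: 10.1186/s12935-022-02506-0 Cancer Cell International (2022) 22:94 HPV16 E6 gene polymorphisms and the functions of the mutation site in cervical cancer among Uygur ethnic and Han nationality women in Xinjiang, China. March has been busy, so I am sharing a paper from my own group, which I hope still meets the requirement. It combines bioinformatics with cell biology and molecular biology experiments. The paper investigates the HPV genotype distribution among infected Uygur and Han women, and analyzes polymorphic sites in the E6 gene of the high-risk type HPV16 and their relationship with the development of cervical cancer. A phylogenetic tree of HPV16 E6 sequences was built against the European standard prototype. HPV16 E6-T295/T350, G295/G350, and T295/G350 GV230 vectors were stably transfected into cervical cancer C33A cells, and cell proliferation, migration and invasion, and apoptosis were analyzed with CCK8 and clonogenic assays, transwell and scratch assays, and flow cytometry.
Results: 1. Among 2879 women the overall HPV infection rate was 26.390% (760/2879): 22.87% (196/857) in Uygur women and 27.89% (564/2022) in Han women (P<0.05). 2. Among 110 mutations, 65 cases had an E6 mutation at nucleotide 350 (T350G), changing leucine to valine (L83V). Another 7 cases had an E6 mutation at nucleotide 295 (T295G), changing aspartic acid to glutamic acid (D64E). 3. When E6 vectors carrying the mutation sites were transfected into C33A cells, they promoted proliferation, migration and invasion and inhibited apoptosis. T295/G350 had the strongest effect on C33A cells, significantly stronger than G295/G350 and T295/T350, and G295/G350 was significantly stronger than T295/T350 (P<0.05).
Conclusions: 1. HPV infection rates differ between Uygur and Han women in Xinjiang, China, and so does the genotype distribution of infection. 2. After transfecting C33A cells with the different eukaryotic expression vectors, the T295/G350 mutation site promoted proliferation, migration and invasion of C33A cells more than G295/G350 did, and G295/G350 had a stronger effect than T295/T350.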
IF: 5.300 (Q1). Cancer Cell International, 2022-Feb-22. DOI: 10.1186/s12935-022-02506-0 PMID: 35193568 PMCID: PMC8862000
Abstract:
BACKGROUND: To investigate the genotype distribution of human papillomavirus (HPV) in infected Uygur and Han women in Xinjiang, China; analyze the HPV16 E6 gene polymorphism site and relationship with the development of cervical cancer. METHODS: The HPV16 E6 sequence was analyzed using the European standard prototype to perform an evolutionary tree. HPV16 E6-T295/T350, G295/G350, and T295/G350 GV230 vectors were stably transfected into cervical cancer C33A cells to analyze the cell proliferation, migration and invasion, apoptosis by CCK8 and clonogenic assays, transwell and cell scratch assays, FACS experiments. RESULTS: The total HPV infection rate was 26.390% (760/2879), whereas the Uygur 22.87% (196/857) and the Han was 27.89% (564/2022) (P < 0.05). Among 110 mutations, 65 cases of E6 genes were mutated at nucleotide 350 (T350G) with the leucine changing to valine (L83V). Moreover, there were 7 cases of E6 gene mutated at nucleotide 295 (T295G) with aspartic changing to glutamic (D64E). When E6 vector(s) of mutations sites were transfected into C33A cells, they were found to promote cellular proliferation, migration, invasion, and inhibit apoptosis. T295/G350-E6 was significantly stronger than G295/G350 and T295/T350, G295/G350 was significantly stronger than T295/T350 (P < 0.05). The T295/G350 had the strongest effect on C33A cells and G295/G350 was significantly stronger than T295/T350 (P < 0.05). CONCLUSIONS: The positive HPV infection rates differed between the Uygur and Han in Xinjiang, China, and the genotype distribution of infection was different. After transfecting C33A cells with different eukaryotic expression vectors, the T295/G350 mutation site promoted the proliferation, migration, and invasion of C33A cells to a greater extent than G295/G350; however, G295/G350 had a stronger effect than T295/T350.
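The difference in infection rates reported in this entry (196/857 versus 564/2022, P < 0.05) can be checked with a short calculation. The entry does not say which test the authors used, so the pooled two-proportion z-test below is an assumption made for illustration, and the function name `two_proportion_z_test` is hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts taken from the entry: Uygur 196/857, Han 564/2022.
z, p = two_proportion_z_test(196, 857, 564, 2022)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Under this assumed test the p-value falls below 0.05, which is consistent with the significance level stated in the entry.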
1025.
Vincent (2022-03-31 11:11):
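#paper doi: 10.1186/s13059-021-02443-7 Genome Biol 2021 Technology dictates algorithms: recent developments in read alignment. Read alignment is a basic step in analyzing sequencing data. This paper reviews 107 alignment tools in detail and benchmarks 11 of them for computational efficiency and speed. It argues that alignment algorithms and sequencing technologies co-evolve: a new technology triggers the development of a series of tools, while the core algorithms underneath rarely change in a revolutionary way (they are tailored for the new technology). The survey finds that indexing the genome with a hash table is the most common approach, at the cost of a large storage footprint, while suffix-tree-based indexing tends to be fast as well and is increasingly widely used. The paper also finds that local alignment methods usually rely on Hamming distance and the Smith-Waterman algorithm to pin down the exact position of a read in the genome. It further reviews how long reads have influenced the development of alignment methods, among other topics.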
IF: 10.100 (Q1). Genome Biology, 2021-08-26. DOI: 10.1186/s13059-021-02443-7 PMID: 34446078
Abstract:
Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today's diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.
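The combination mentioned in this entry, a hash-table index of the genome plus Hamming distance to verify candidate positions, can be sketched as a toy seed-and-verify aligner. It is not taken from any of the surveyed tools; the function names, the choice of the read's first k-mer as the only seed, and the `max_mismatches` threshold are all assumptions made for the example.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def align_read(read, reference, index, k, max_mismatches=2):
    """Seed with the read's first k-mer, then verify candidates by Hamming distance."""
    hits = []
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) == len(read):          # skip candidates that run off the end
            distance = hamming(read, window)
            if distance <= max_mismatches:
                hits.append((pos, distance))
    # Best hit first: fewest mismatches, then leftmost position.
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


reference = "GATTACAGATTACCGATTGCA"
index = build_kmer_index(reference, k=4)
print(align_read("GATTACCG", reference, index, k=4))
```

The index trades memory for lookup speed, which matches the storage cost noted in the entry: every k-mer position in the reference is stored. Hamming distance only handles substitutions; tolerating insertions and deletions is where Smith-Waterman style dynamic programming comes in.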
1026.
李翛然 (2022-03-31 00:44):
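#paper doi: 10.1038/s41586-022-04654-9 Nature, 2022 Design of protein binding proteins from target structure alone. I really have to vent about this paper! It was accepted by Nature last Friday?! That is what I find hardest to understand. We have been following this work since DeepMind posted the preprint last year. We have reproduced all of its methods and improved on them, yet nobody on our team thinks it should have been accepted by Nature. The reasons are as follows: 1. The principle is very simple: take some existing amino acid sequences, progressively analyze the target structure, and piece together a new amino acid sequence. 2. Use AlphaFold to decompose the target structure, find suitable sites, and then retrieve suitable short amino acid sequences by search (nothing wrong with this step; an AI generative model would do the same). 3. But the next step is too much! The most critical step is how to judge the docking affinity between the retrieved or generated amino acids and the target, and how to evaluate docking strength. That is also the key question in reinforcement learning: what exactly is the Q function? They actually used Rosetta, the legacy toolset of DeepMind and the University of Washington! The most critical scoring function comes from their own team's old open-source toolset (whose accuracy on macromolecules tops out below 20%). Unbelievable! A paper with no experimental validation or support at all was accepted by the main Nature journal?! This is nowhere near as groundbreaking as AlphaFold, which brought in entirely new mathematical tools and ways of approaching the problem; this paper is just riding the wave. All I can say is that the academic PR and networks behind Google, DeepMind and the University of Washington are enormous! On the other hand, biology has developed too slowly in the past; now that the fierce competition inside the AI field is spilling over, biology really is being outclassed.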
IF: 50.500 (Q1). Nature, 2022-05. DOI: 10.1038/s41586-022-04654-9 PMID: 35332283
Abstract:
The design of proteins that bind to a specific site on the surface of a target protein using no information other than the three-dimensional structure of the target remains a challenge. Here we describe a general solution to this problem that starts with a broad exploration of the vast space of possible binding modes to a selected region of a protein surface, and then intensifies the search in the vicinity of the most promising binding modes. We demonstrate the broad applicability of this approach through the de novo design of binding proteins to 12 diverse protein targets with different shapes and surface properties. Biophysical characterization shows that the binders, which are all smaller than 65 amino acids, are hyperstable and, following experimental optimization, bind their targets with nanomolar to picomolar affinities. We succeeded in solving crystal structures of five of the binder-target complexes, and all five closely match the corresponding computational design models. Experimental data on nearly half a million computational designs and hundreds of thousands of point mutants provide detailed feedback on the strengths and limitations of the method and of our current understanding of protein-protein interactions, and should guide improvements of both. Our approach enables the targeted design of binders to sites of interest on a wide variety of proteins for therapeutic and diagnostic applications.
1027.
张贝 (2022-03-31 00:05):
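#paper doi: 10.1038/s41586-021-03828-1 Nature, 2021, Highly accurate protein structure prediction for the human proteome. AlphaFold2 is an AI system developed by DeepMind that accurately predicts a protein's 3D structure from its amino acid sequence. Its accuracy is comparable to 3D structures solved by cryo-EM, X-ray diffraction and similar methods. Compared with the base version, AlphaFold2 is about 16 times faster at structure prediction. This paper applies AlphaFold2 to 98.5% of human proteins and makes the predictions freely available to the public. AlphaFold2 gives confident predictions for the structural positions of 58% of the residues in the human proteome and also predicts multi-domain structures well; low-confidence predictions may correspond to disordered regions of the protein. The arrival of AlphaFold marks the start of an era of AI-driven biological research.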
IF: 50.500 (Q1). Nature, 2021-08. DOI: 10.1038/s41586-021-03828-1 PMID: 34293799 PMCID: PMC8387240
Abstract:
Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure. Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold, at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective.
1028.
魏魏魏 (2022-03-30 00:08):
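#paper doi:10.1080/02796015.2008.12087908 School Psychology Review, (2008), Looking beyond psychopathology: The dual-factor model of mental health in youth. This is a fairly old paper on mental health, a study built on a then-new concept of mental health. Previously, mental health was mostly judged by psychopathology indicators of internalizing or externalizing problems, such as depression, anxiety or behavioral problems. That way of assessing mental health tends to either overstate or overlook people's mental health problems. Under the influence of positive psychology, the dual-factor model proposed a new view: when assessing mental health, look both at the traditional psychopathology indicators and at the level of positive functioning, such as subjective well-being. Mental health then has a positive and a negative factor, and the two factors split a population into four groups with different health profiles: the complete mental health group (low symptoms, high well-being), the partially mentally healthy group (low symptoms, low well-being), the partially ill group (high symptoms, high well-being), and the completely ill group (high symptoms, low well-being). The partially mentally healthy group is usually called the vulnerable group. The study was run on students aged 10 to 16 in the southeastern United States and found the four groups in different proportions, with the complete mental health group making up only 57%. The four groups differed in academic performance, physical health and social functioning, and the complete mental health group did better than the other three in academic performance, school attendance, reading skills and academic-related goals. This concept of mental health and the way of assessing it matter for mental health assessment and for school education.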
Abstract:
In a dual-factor model of mental health (cf. Greenspoon & Saklofske, 2001), assessments of positive indicators of wellness (i.e., subjective well-being, SWB) are coupled with traditional negative indicators of illness (i.e., psychopathology) to comprehensively measure mental health. The current study examined the existence and utility of a dual-factor model in early adolescence. The SWB, psychopathology, academic functioning, social adjustment, and physical health of a general sample of 349 middle school students was assessed via self-report scales, school records, and teacher reports regarding students' externalizing psychopathology. The existence of a dual-factor model was supported through the identification of four mental health groups: 57% of the sample had complete mental health, 13% was vulnerable, 13% was symptomatic but content, and 17% was troubled. The means of the four groups differed significantly in terms of academic outcomes, physical health, and social functioning. Results support the importance of high SWB to optimal functioning during adolescence, as students with complete mental health (i.e., high SWB, low psychopathology) had better reading skills, school attendance, academic self-perceptions, academic-related goals, social support from classmates and parents, self-perceived physical health, and fewer social problems than their vulnerable peers also without clinical levels of mental illness but with low SWB. Among students with clinical levels of psychopathology, students with high SWB (symptomatic but content youth) perceived better social functioning and physical health.
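The four groups in this entry come from crossing two yes/no questions, which a small function makes explicit. The group labels are the ones used in the abstract; the cutoff parameters and the function name `dual_factor_group` are hypothetical, since the entry does not say how "high" and "low" were defined.

```python
def dual_factor_group(swb, psychopathology, swb_cutoff, psychopathology_cutoff):
    """Assign one of the four dual-factor groups from two scores.

    The cutoffs are placeholders; the study's actual thresholds are not given here.
    """
    high_swb = swb >= swb_cutoff
    high_symptoms = psychopathology >= psychopathology_cutoff
    if high_swb and not high_symptoms:
        return "complete mental health"
    if not high_swb and not high_symptoms:
        return "vulnerable"
    if high_swb and high_symptoms:
        return "symptomatic but content"
    return "troubled"


# Illustrative scores on an arbitrary 0-10 scale with cutoffs at 5.
print(dual_factor_group(swb=8, psychopathology=2, swb_cutoff=5, psychopathology_cutoff=5))
```

A symptom-only screen would merge the first two groups, which is exactly the distinction the dual-factor model is meant to recover.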
1029.
na na na (2022-03-29 23:30):
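#paper Zhu T, Liu J, Beck S, Pan S, Capper D, Lechner M, Thirlwell C, Breeze CE, Teschendorff AE. A pan-tissue DNA methylation atlas enables in silico decomposition of human tissue methylomes at cell-type resolution. Nat Methods. 2022 Mar;19(3):296-306. doi: 10.1038/s41592-022-01412-7. Epub 2022 Mar 11. PMID: 35277705; PMCID: PMC8916958. This paper has just come out in Nat Methods this March. It uses the high resolution of tissue-specific single-cell RNA sequencing datasets to build a DNA methylation atlas defined for 13 solid tissue types and 40 cell types; put simply, it builds a resource for resolving the cell types in many tissues from DNA methylation variation. Single-cell sequencing is still dominated by RNA expression profiling, so how to predict the cell types in a tissue accurately from methylation data remains an open question. Some algorithms already exist, for example MethylCIBERSORT, which as the name suggests borrows the deconvolution algorithm of CIBERSORT, but by design it can only estimate the methylation profiles of fibroblasts and 7 immune cell types, whereas the real situation in tissues of different tumor types is more complex. Starting from single-cell sequencing data of several different tissues, the authors define methylation markers as the sites where promoter methylation is significantly anti-correlated with the mRNA expression of a cell type's marker genes. This yields a high-resolution DNA methylation atlas covering 13 tissue types and 40 cell types. The authors validated the tissue-specific cell-type results separately for different tissues, and the atlas also performs well on concrete clinical questions (new prognostic associations in olfactory neuroblastoma and stage 2 melanoma). Finally, the authors provide an R package for the computation, which can also build tissue-specific cell types for a new tissue from user-supplied data: https://github.com/ww880412/RPresto ; unfortunately I have not managed to install it yet because the dependency "presto" is missing. I could not find that package, only one called RPresto, and after installing it the error persists; still working on it.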
IF: 36.100 (Q1). Nature Methods, 2022-03. DOI: 10.1038/s41592-022-01412-7 PMID: 35277705 PMCID: PMC8916958
Abstract:
Bulk-tissue DNA methylomes represent an average over many different cell types, hampering our understanding of cell-type-specific contributions to disease development. As single-cell methylomics is not scalable to large cohorts of individuals, cost-effective computational solutions are needed, yet current methods are limited to tissues such as blood. Here we leverage the high-resolution nature of tissue-specific single-cell RNA-sequencing datasets to construct a DNA methylation atlas defined for 13 solid tissue types and 40 cell types. We comprehensively validate this atlas in independent bulk and single-nucleus DNA methylation datasets. We demonstrate that it correctly predicts the cell of origin of diverse cancer types and discovers new prognostic associations in olfactory neuroblastoma and stage 2 melanoma. In brain, the atlas predicts a neuronal origin for schizophrenia, with neuron-specific differential DNA methylation enriched for corresponding genome-wide association study risk loci. In summary, the DNA methylation atlas enables the decomposition of 13 different human tissue types at a high cellular resolution, paving the way for an improved interpretation of epigenetic data.
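Reference-based deconvolution, the idea behind the tools discussed in this entry, models a bulk methylation profile as a mixture of cell-type reference profiles and solves for the mixing fractions. The sketch below uses non-negative least squares solved by projected gradient descent, which is a simple stand-in chosen for illustration; it is not the algorithm of the paper, of CIBERSORT, or of MethylCIBERSORT, and the function name and toy reference values are made up.

```python
def deconvolve(reference, bulk, n_iter=2000):
    """Estimate cell-type fractions w >= 0 such that reference @ w is close to bulk.

    reference: list of rows, one per CpG site, each with one value per cell type.
    bulk: list with one methylation value per CpG site.
    """
    n_sites, n_types = len(reference), len(reference[0])
    # A safe step size: 1 / trace(R^T R), which never exceeds 1 / largest eigenvalue.
    step = 1.0 / sum(v * v for row in reference for v in row)
    w = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        residual = [sum(r * x for r, x in zip(row, w)) - b
                    for row, b in zip(reference, bulk)]
        grad = [sum(reference[i][j] * residual[i] for i in range(n_sites))
                for j in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, wj - step * gj) for wj, gj in zip(w, grad)]
    total = sum(w)
    return [wj / total for wj in w] if total > 0 else w


# Toy reference: 6 CpG sites x 3 cell types (made-up methylation levels).
reference = [
    [0.9, 0.1, 0.1], [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.1], [0.2, 0.8, 0.2],
    [0.1, 0.1, 0.9], [0.1, 0.2, 0.8],
]
true_fractions = [0.5, 0.3, 0.2]
bulk = [sum(r * f for r, f in zip(row, true_fractions)) for row in reference]
print(deconvolve(reference, bulk))
```

The quality of such an estimate depends on the reference matrix: the marker sites need to separate the cell types, which is why the entry's point about how the methylation markers are defined matters.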
1030.
白义民 (2022-03-28 12:52):
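#paper 《法界宝藏论》 is one of the core essentials among the Seven Treasuries of Dzogchen (Great Perfection) teaching, and mainly expounds the realm of the dharmakaya. Since it was published in an open academic journal, I am sharing it here so everyone can get a basic idea; if you have questions, we can discuss them together.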
Abstract:
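《法界宝藏论》 is the essence of the Dzogchen (Great Perfection) teaching of Tibetan Buddhism. In language that does not dwell on appearances, the treatise expounds the essentials of the dharmadhatu through the ultimate bodhicitta, which is pure, free, luminous and without characteristics. It defines the dharmadhatu as the primordial source of all dharmas, the cause of all causes and the fruit of all fruits, and on that basis establishes a path of non-cultivation. Through the practice of "spontaneous accomplishment" it reveals the Buddhist views of freedom and of the middle way; through "natural wisdom" it vividly presents the Buddhist view of wisdom; and through the Dzogchen fruition, naturally perfect and with samsara and nirvana of one taste, it presents the Buddhist views of life, liberation and accomplishment.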
1031.
小W (2022-03-26 17:10):
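#paper https://doi.org/10.1038/s41573-021-00337-8 Nat Rev Drug Discov 21, 201–223 (2022) This is a review of anti-obesity drugs. Its main points: 1. Obesity is a chronic degenerative disease and does not stem merely from a lack of self-discipline. The review treats obesity as a chronic disease in its own right rather than as a risk factor for other diseases; lifestyle and behavioral interventions give moderate efficacy, and adding drugs and/or surgery gives better results. 2. The history of anti-obesity drug development, and the safety concerns that have accompanied these drugs, linked to patient heterogeneity, polygenic causes and metabolic targets. 3. An introduction to new anti-obesity drugs, covering drug treatments approved by the FDA up to the second half of 2020. 4. Precision prescribing based on the needs of heterogeneous patients. Watch what you eat and keep moving. If you are ill, get treated; if you are not, work out.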
Abstract:
Enormous progress has been made in the last half-century in the management of diseases closely integrated with excess body weight, such as hypertension, adult-onset diabetes and elevated cholesterol. However, the treatment of obesity itself has proven largely resistant to therapy, with anti-obesity medications (AOMs) often delivering insufficient efficacy and dubious safety. Here, we provide an overview of the history of AOM development, focusing on lessons learned and ongoing obstacles. Recent advances, including increased understanding of the molecular gut-brain communication, are inspiring the pursuit of next-generation AOMs that appear capable of safely achieving sizeable and sustained body weight loss.
1032.
尹志 (2022-03-25 14:10):
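#paper doi:10.1109/CVPR.2015.7298682, 2015, FaceNet: A unified embedding for face recognition and clustering. This is a classic paper in face recognition, written by Google and published at CVPR 2015. It reached 99.63% on the LFW dataset and 95.12% on YouTube Faces DB, the SOTA at the time. Although the paper is about faces, the idea carries over to many settings, including image recognition problems of all kinds and natural language processing. The paper introduces an end-to-end training scheme that models the embedding space directly. The idea is very direct: map each face image to a point in the embedding space so that faces with the same id lie close together and faces with different ids lie far apart. Such an embedding can be seen as a feature extractor that does efficient precomputation for later face verification, recognition and clustering. The network architecture is fairly simple, mainly the Inception network, which was still new at the time. The interesting part is the loss: the paper introduces the triplet loss, which computes distances for an anchor-pos pair and an anchor-neg pair. Here the anchor is an image of some id, pos is another face image of the same id, and neg is a face image of a different id. The idea is simple: train so that the anchor-pos distance is small and the anchor-neg distance is large. Mathematically the loss is the anchor-pos distance minus the anchor-neg distance plus alpha, clipped at zero. The alpha can be read as a margin constraint: it lets the faces of one id live on a manifold while still enforcing a distance to the faces of other ids. In practice the choice of triplets during training also matters a great deal; see the paper if you are interested. Although the paper is old and its network architecture is dated, its simple idea and strong results have inspired a great deal of later recognition work, in research and in industrial practice alike. Anyone who has worked with word2vec will surely feel a sense of recognition.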
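The triplet loss described in this entry is short enough to write out. The sketch below follows the form used in the FaceNet paper, squared Euclidean distances with a hinge at zero and a margin alpha; the helper names and the toy 2-D embeddings are made up for the example, and a real model would apply this to L2-normalized network outputs over batches.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Hinge-style triplet loss: zero once the negative is at least alpha farther away."""
    return max(0.0, squared_l2(anchor, positive) - squared_l2(anchor, negative) + alpha)


anchor = [1.0, 0.0]
positive = [0.8, 0.6]      # same identity as the anchor
easy_negative = [-1.0, 0.0]  # far from the anchor
hard_negative = [0.8, -0.6]  # as close to the anchor as the positive is

print(triplet_loss(anchor, positive, easy_negative))
print(triplet_loss(anchor, positive, hard_negative))
```

The easy negative already satisfies the margin, so it contributes no loss and no gradient. That is why the choice of triplets matters: only triplets that violate the margin teach the network anything.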
1033.
十年 (2022-03-25 12:29):
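#paper 10.1038/s41587-020-0740-8 Systematic classification of unknown metabolites using high-resolution fragmentation mass spectra. Another tool for metabolite identification in metabolomics data processing. It belongs to the SIRIUS family and still relies mainly on the fragmentation-tree strategy. This time the model is a DNN, with a claimed cross-validation accuracy of 99.7%. Many strong groups work on predicting mass spectrometry fragments, but accuracy has never been as high as one would hope; with machine learning taking off in recent years, hopefully it can get better.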
IF: 33.100 (Q1). Nature Biotechnology, 2021-04. DOI: 10.1038/s41587-020-0740-8 PMID: 33230292
Abstract:
Metabolomics using nontargeted tandem mass spectrometry can detect thousands of molecules in a biological sample. However, structural molecule annotation is limited to structures present in libraries or databases, restricting analysis and interpretation of experimental data. Here we describe CANOPUS (class assignment and ontology prediction using mass spectrometry), a computational tool for systematic compound class annotation. CANOPUS uses a deep neural network to predict 2,497 compound classes from fragmentation spectra, including all biologically relevant classes. CANOPUS explicitly targets compounds for which neither spectral nor structural reference data are available and predicts classes lacking tandem mass spectrometry training data. In evaluation using reference data, CANOPUS reached very high prediction performance (average accuracy of 99.7% in cross-validation) and outperformed four baseline methods. We demonstrate the broad utility of CANOPUS by investigating the effect of microbial colonization in the mouse digestive system, through analysis of the chemodiversity of different Euphorbia plants and regarding the discovery of a marine natural product, revealing biological insights at the compound class level.
1034.
张浩彬 (2022-03-25 11:30):
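#paper 10.1360/SCM-2019-0368 气象调整下的区域空气质量评估 (Regional air quality assessment under meteorological adjustment). Effective assessment of air quality is an important part of air quality management. The Blue Sky campaign has wrapped up successfully, and the 14th Five-Year Plan still sets air improvement targets for PM2.5 and for the number of days with good air quality. I have taken part in some air pollution control work with local environmental protection bureaus. Air quality is affected by pollutant emissions on the one hand and by meteorological factors on the other. If the interference of meteorology can be removed effectively, policy can be designed more precisely. Back to the paper: traditionally, one could analyze the covariates separately, but bias arises when the covariates follow different baseline distributions in the settings being compared (after all, weather conditions cannot be randomly assigned). The paper proposes a new nonparametric method that adjusts the observed concentrations for meteorology in both time and space. Compared with the simple averages used traditionally, the adjustment combines temporal and spatial factors (what I find new is mainly the spatial treatment; the authors also show that the treatment with trend analysis is a special case of their method), so that the adjusted means can be compared across years. The authors apply the method to the Beijing region (the coverage extends into Hebei) and find that SO2 fell significantly, PM2.5 and NO2 improved very little, and O3 actually rose (the paper is from 2020, but the data end in 2017). Empirically this seems broadly right. One reason for the rise in ozone is probably that ozone is not a primary problem in the north (the opposite of the south); another is that with more blue-sky days, ozone forms more easily. A final remark: even with a nonparametric method, a statistics paper (the authors are Academician Song Xi Chen and his student) differs from a CS paper. Everyone builds models, but statistics pays more attention to estimation and theoretical properties.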
Abstract:
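Although air pollution is caused by pollutants emitted into the atmosphere, meteorological conditions affect how pollutants disperse, so the observed pollution level is influenced by meteorology. Effective air quality management therefore requires assessment indicators and statistical methods that are free of meteorological interference and that reflect changes in pollutant concentration accurately and objectively. To assess changes in the underlying pollutant emissions in the Beijing region, this paper proposes a spatio-temporal adjustment method that removes meteorological interference. By controlling for meteorological conditions, the adjusted spatio-temporal average concentration of a pollutant can capture changes in the underlying emissions. The paper gives a concrete method for computing the adjusted mean, analyzes it theoretically and numerically, and applies it to air quality assessment in the Beijing region, revealing some interesting patterns and trends. The results can be used for air quality assessment and management.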
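The core idea in this entry, comparing years under the same weather rather than under whatever weather each year happened to have, can be illustrated with a much simpler device than the paper's method. The sketch below is plain direct standardization over discrete weather strata, offered as an assumption-laden stand-in: the paper uses a nonparametric spatio-temporal adjustment, and the function name, stratum labels and numbers here are invented.

```python
from collections import defaultdict


def adjusted_mean(records, baseline_weights):
    """Average concentration re-weighted to a fixed baseline weather distribution.

    records: list of (weather_stratum, concentration) observations for one year.
    baseline_weights: {stratum: weight}, weights summing to 1; every stratum
    listed must appear at least once in records.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for stratum, value in records:
        sums[stratum] += value
        counts[stratum] += 1
    return sum(weight * sums[stratum] / counts[stratum]
               for stratum, weight in baseline_weights.items())


# Same pollution per weather type in both years, but different weather mixes.
year_a = [("calm", 100.0)] * 3 + [("windy", 20.0)]
year_b = [("calm", 100.0)] + [("windy", 20.0)] * 3
baseline = {"calm": 0.5, "windy": 0.5}

raw_a = sum(v for _, v in year_a) / len(year_a)
raw_b = sum(v for _, v in year_b) / len(year_b)
print(raw_a, raw_b)                               # raw means differ
print(adjusted_mean(year_a, baseline), adjusted_mean(year_b, baseline))
```

In this toy case the raw annual means suggest a large improvement that is entirely due to windier weather, while the adjusted means are identical, which is the kind of distortion the entry says simple averages suffer from.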
1035.
张德祥 (2022-03-24 23:05):
#paper https://doi.org/10.48550/arXiv.2112.14045 Learning from What’s Right and Learning from What’s Wrong. A recent paper on Bayesian inference; for details see the write-up: https://mp.weixin.qq.com/s/OEcXvyqxYNTCbTK7KUrEjw
Abstract:
The concept of updating (or conditioning or revising) a probability distribution is fundamental in (machine) learning and in predictive coding theory. The two main approaches for doing so are called Pearl's rule and Jeffrey's rule. Here we make, for the first time, mathematically precise what distinguishes them: Pearl's rule increases validity (expected value) and Jeffrey's rule decreases (Kullback-Leibler) divergence. This forms an instance of a more general distinction between learning from what's right and learning from what's wrong. The difference between these two approaches is illustrated in a mock cognitive scenario.
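The two update rules named in this entry are easy to compare on a toy diagnostic example. The sketch below uses the standard formulations for a finite prior and a channel P(y | x): Pearl's rule multiplies the prior by the likelihood of the soft evidence and renormalizes, while Jeffrey's rule mixes the ordinary Bayesian posteriors P(x | y) with the evidence as weights. The function names and the numbers are made up for illustration and do not come from the paper.

```python
def normalize(scores):
    """Scale non-negative scores so they sum to 1."""
    total = sum(scores.values())
    return {key: value / total for key, value in scores.items()}


def pearl_update(prior, channel, likelihood):
    """Pearl's rule: posterior(x) is proportional to prior(x) * sum_y P(y|x) * q(y)."""
    return normalize({
        x: px * sum(channel[x][y] * qy for y, qy in likelihood.items())
        for x, px in prior.items()
    })


def jeffrey_update(prior, channel, target):
    """Jeffrey's rule: posterior(x) = sum_y target(y) * P(x|y)."""
    posterior = {x: 0.0 for x in prior}
    for y, weight in target.items():
        # Ordinary Bayesian inversion for a sharp observation y.
        inverted = normalize({x: prior[x] * channel[x][y] for x in prior})
        for x in prior:
            posterior[x] += weight * inverted[x]
    return posterior


prior = {"disease": 0.1, "healthy": 0.9}
channel = {
    "disease": {"pos": 0.9, "neg": 0.1},
    "healthy": {"pos": 0.2, "neg": 0.8},
}
evidence = {"pos": 0.8, "neg": 0.2}   # the test result is only 80% certain

print(pearl_update(prior, channel, evidence))
print(jeffrey_update(prior, channel, evidence))
```

With uncertain evidence the two rules give different posteriors for the same inputs, while with a fully certain observation (all weight on one outcome) both reduce to ordinary Bayesian conditioning.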
1036.
洪媛媛 (2022-03-22 18:43):
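#paper Hepatocellular Carcinoma Detection by Plasma Methylated DNA: Discovery, Phase I Pilot, and Phase II Clinical Validation. Hepatology. 2019 March ; 69(3): 1180–1192. doi:10.1002/hep.30244. This paper describes how methylation markers for early liver cancer screening were selected. Techniques: RRBS, Q-RealTime PCR, and an upgraded Q-RealTime PCR (TELQAS). The marker selection process: 1. Discovery/technical validation: RRBS was used to find methylation markers that differ between liver cancer tissue and normal liver tissue, and between liver cancer tissue and blood cells, as candidate markers, and the candidates were then tested on the QPCR platform. 2. Biological tissue validation as an independent validation set, using liver cancer tissue versus normal liver tissue, by QPCR. 3. Phase I plasma study: the markers from the previous step were tested in cfDNA from liver cancer patients and from healthy people, using TELQAS. 4. Phase II plasma study as an independent plasma validation set, using TELQAS. The final result is a 6-marker panel (HOXA1, EMX1, AK055957, ECE1, PFKP and CLEC11A) with an AUC of 0.96 (95% CI, 0.93–0.99); at a specificity of 92% (86–96%) in healthy people, the sensitivity for liver cancer patients is 95% (88–98%).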
Abstract:
Early detection improves hepatocellular carcinoma (HCC) outcomes, but better noninvasive surveillance tools are needed. We aimed to identify and validate methylated DNA markers (MDMs) for HCC detection. Reduced representation bisulfite … >>>
Early detection improves hepatocellular carcinoma (HCC) outcomes, but better noninvasive surveillance tools are needed. We aimed to identify and validate methylated DNA markers (MDMs) for HCC detection. Reduced representation bisulfite sequencing was performed on DNA extracted from 18 HCC and 35 control tissues. Candidate MDMs were confirmed by quantitative methylation-specific PCR in DNA from independent tissues (74 HCC, 29 controls). A phase I plasma pilot incorporated quantitative allele-specific real-time target and signal amplification assays on independent plasma-extracted DNA from 21 HCC cases and 30 controls with cirrhosis. A phase II plasma study was then performed in 95 HCC cases, 51 controls with cirrhosis, and 98 healthy controls using target enrichment long-probe quantitative amplified signal (TELQAS) assays. Recursive partitioning identified best MDM combinations. The entire MDM panel was statistically cross-validated by randomly splitting the data 2:1 for training and testing. Random forest (rForest) regression models performed on the training set predicted disease status in the testing set; median areas under the receiver operating characteristics curve (AUCs; and 95% confidence interval [CI]) were reported after 500 iterations. In phase II, a six-marker MDM panel (homeobox A1 [HOXA1], empty spiracles homeobox 1 [EMX1], AK055957, endothelin-converting enzyme 1 [ECE1], phosphofructokinase [PFKP], and C-type lectin domain containing 11A [CLEC11A]) normalized by beta-1,3-galactosyltransferase 6 (B3GALT6) level yielded a best-fit AUC of 0.96 (95% CI, 0.93-0.99) with HCC sensitivity of 95% (88%-98%) at specificity of 92% (86%-96%). The panel detected 3 of 4 (75%) stage 0, 39 of 42 (93%) stage A, 13 of 14 (93%) stage B, 28 of 28 (100%) stage C, and 7 of 7 (100%) stage D HCCs. The AUC value for alpha-fetoprotein (AFP) was 0.80 (0.74-0.87) compared to 0.94 (0.9-0.97) for the cross-validated MDM panel (P < 0.0001). 
Conclusion: MDMs identified in this study proved to accurately detect HCC by plasma testing. Further optimization and clinical testing of this promising approach are indicated.
1037.
笑对人生 (2022-03-21 23:28):
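#paper Transcriptional census of epithelial-mesenchymal plasticity in cancer. Sci Adv. 2022 Jan 7;8(1):eabi7640. doi: 10.1126/sciadv.abi7640 Epithelial-mesenchymal plasticity (EMP) is the ability of a cell to switch between epithelial and mesenchymal states; it describes a dynamic, hybrid state that cells adopt in response to a complex surrounding microenvironment. EMP comprises two key processes, EMT and MET: EMT is associated with distant metastasis of primary tumor cells and with tumor cell stemness, whereas MET is associated with the colonization of tumor cells at metastatic sites. The paper first collects single-cell transcriptome sequencing data from 17 published studies, covering 8 cancer types, 266 tumor tissue samples, and 223,501 cells. It then uses these data to re-evaluate 328 genes previously reported as EMP-related, narrows them down to a 128-gene set, and builds a tumor-cell-specific EMP signature from that set. It also finds that EMP is one source of intratumoral heterogeneity and is highly context dependent. Using TCGA pan-cancer RNA-seq data, the authors find that activation of the EMP signature is associated with a shorter progression-free interval and with a more strongly immunosuppressive microenvironment. Finally, the authors explore how EMP relates to interacting transcription factors, MEK inhibitors, and TGF-βR1 inhibitors. The whole paper is pure data mining, yet it involves no elaborate formulas and no machine learning models; it reasons and explores purely from biological principles, which makes it a useful reference.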
IF: 11.700 (Q1). Science Advances, 2022-Jan-07. DOI: 10.1126/sciadv.abi7640. PMID: 34985957
Abstract:
Epithelial-mesenchymal plasticity (EMP) contributes to tumor progression, promoting therapy resistance and immune cell evasion. Definitive molecular features of this plasticity have largely remained elusive due to the limited scale of most studies. Leveraging single-cell RNA sequencing data from 266 tumors spanning eight different cancer types, we identify expression patterns associated with intratumoral EMP. Integrative analysis of these programs confirmed a high degree of diversity among tumors. These diverse programs are associated with combinations of various common regulatory mechanisms initiated from cues within the tumor microenvironment. We show that inferring regulatory features can inform effective therapeutics to restrict EMP.
1038.
思考问题的熊 (2022-03-20 16:35):
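#paper Li, Yumei, Xinzhou Ge, Fanglue Peng, Wei Li, and Jingyi Jessica Li. “Exaggerated False Positives by Popular Differential Expression Methods When Analyzing Human Population Samples.” Genome Biology 23, no. 1 (March 15, 2022): 79. https://doi.org/10.1186/s13059-022-02648-4. This paper, published in Genome Biology a few days ago, makes a fairly rigorous case that for differential expression analysis of large-sample RNA-seq data we should from now on abandon DESeq2 and edgeR in favor of the plain Wilcoxon rank-sum test, even with speed left out of the picture. I have written up the details in a separate post; take a look if you are interested.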
IF: 10.100 (Q1). Genome Biology, 2022-03-15. DOI: 10.1186/s13059-022-02648-4. PMID: 35292087. PMCID: PMC8922736
Abstract:
When identifying differentially expressed genes between two conditions using human population RNA-seq samples, we found a phenomenon by permutation analysis: two popular bioinformatics methods, DESeq2 and edgeR, have unexpectedly high false discovery rates. Expanding the analysis to limma-voom, NOISeq, dearseq, and Wilcoxon rank-sum test, we found that FDR control is often failed except for the Wilcoxon rank-sum test. Particularly, the actual FDRs of DESeq2 and edgeR sometimes exceed 20% when the target FDR is 5%. Based on these results, for population-level RNA-seq studies with large sample sizes, we recommend the Wilcoxon rank-sum test.
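The recommendation in the abstract above (a per-gene Wilcoxon rank-sum test with FDR control) can be sketched in plain Python. This is only an illustration, not the paper's code: the function names `rank_sum_pvalue` and `benjamini_hochberg` are hypothetical, the p-value uses the normal approximation (an assumption that is reasonable for the large sample sizes the paper targets), and Benjamini-Hochberg is assumed as the FDR procedure.

```python
import math
from itertools import groupby


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation.

    Uses midranks for ties and the tie-corrected variance, without a
    continuity correction. Both groups must be non-empty.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the values, remembering which group each one came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda t: t[0])
    rank_sum_x = 0.0
    tie_term = 0.0
    pos = 0  # number of pooled values already ranked
    for _, grp in groupby(pooled, key=lambda t: t[0]):
        grp = list(grp)
        t = len(grp)
        midrank = pos + (t + 1) / 2.0  # average of ranks pos+1 .. pos+t
        rank_sum_x += midrank * sum(1 for _, g in grp if g == 0)
        tie_term += t ** 3 - t
        pos += t
    u = rank_sum_x - n1 * (n1 + 1) / 2.0  # Mann-Whitney U for group x
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # all values identical: no evidence of a shift
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one would call `rank_sum_pvalue` once per gene on the normalized expression values of the two conditions, pass the resulting list to `benjamini_hochberg`, and report the genes whose adjusted p-value falls below the target FDR (5% in the abstract).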
1039.
颜林林 (2022-03-20 16:16):
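#paper doi:10.1101/2022.03.14.22272390 medRxiv, 2022, AI-Augmented Clinical Decision Support in a Patient-Centric Precision Oncology Registry. This paper describes an effort to use artificial intelligence to support precision diagnosis and treatment in oncology. The authors are from xCures (founded in 2018); the corresponding author, Jeff Shrager, is a founder of the company and also a visiting professor at Stanford. Medical expert systems with similar aims date back at least to the 1970s. Even today, when AI seems to have permeated many areas of medicine, there is still no effective general-purpose solution for clinical decision-making, and the setbacks of IBM Watson made the practical difficulties facing this grand vision all the more apparent. The paper presents a clinical trial run by xCures (NCT03793088), which set up an online platform, XCELSIOR, to register cancer patients' information and uses NLP and related techniques to structure and standardize their medical records. The platform is billed as "patient-centric", which presumably means assembling something like a comprehensive health record for each patient, combining it with public databases and other information to recommend and rank candidate treatment options for that patient, having a virtual tumor board (VTB) of molecular pharmacologists and oncologists review the results manually, and then delivering them to physicians and patients to guide subsequent treatment decisions. The trial targets patients with refractory or advanced cancer and plans to enroll 10,000 people (it started in 2019 and is expected to finish in 2024); more than 2,000 have been enrolled so far, and enrollment continues at about 15 patients per week. Its goal, as stated on the company website, is that it “uses artificial intelligence (A.I.) and predictive modeling to identify and rank the most promising treatment options for people with cancer who have exhausted the standard of care”. Personally, I believe efforts like this are far from rare around the world and will prove immensely valuable in the future, but how that value will materialize is still unclear. This paper does not show where its distinctive advantage lies either, and the trial's evaluation endpoints are described only vaguely, so more information will have to wait. Judging from the detailed descriptions in the paper and its supplementary materials, though, the work is more significant as engineering than as science, and the preprint probably serves mainly as publicity. Whether the work will generate enough profit for the company is hard to say, but it is being run as a formal clinical trial, with a system that appears to be built and executed seriously, so its details are still worth following and learning from.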
1040.
大象城南 (2022-03-14 11:32):
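#paper doi:10.1186/s12938-020-00786-z This paper presents algorithms for the automatic inter-subject clustering and labeling of superficial short association fiber bundles. Diffusion-weighted MRI is currently the only technique that can detect the course of white matter fiber bundles in the living brain. Most earlier white matter tractography focused on deep white matter tracts, which are anatomically fairly consistent and vary little between subjects, so mature tractography and clustering algorithms can segment deep white matter well into bundles at distinct anatomical locations. A series of studies based on deep white matter (for example, on brain development and on abnormalities in brain disease) has already produced many breakthroughs. Superficial fiber bundles, however, have a more complex anatomy (the cortex has many sulci and gyri), and cortical morphology differs considerably between individuals. Applying algorithms designed for deep white matter directly to superficial bundles is therefore often inappropriate and yields a high false-positive rate. Using the Hungarian algorithm and the QuickBundles algorithm, this paper performs superficial fiber bundle tracking on dMRI data from 20 subjects and establishes an automatic fiber clustering method, offering a more accurate tool for future studies of superficial white matter fibers. Their results show that the Hungarian algorithm yields high-quality clusters but poor reproducibility, whereas QuickBundles is more reproducible and captures the group-level anatomy of superficial fiber bundles well.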
Abstract:
BACKGROUND: Diffusion MRI is the preferred non-invasive in vivo modality for the study of brain white matter connections. Tractography datasets contain 3D streamlines that can be analyzed to study the main brain white matter tracts. Fiber clustering methods have been used to automatically group similar fibers into clusters. However, due to inter-subject variability and artifacts, the resulting clusters are difficult to process for finding common connections across subjects, specially for superficial white matter. METHODS: We present an automatic method for labeling of short association bundles on a group of subjects. The method is based on an intra-subject fiber clustering that generates compact fiber clusters. Posteriorly, the clusters are labeled based on the cortical connectivity of the fibers, taking as reference the Desikan-Killiany atlas, and named according to their relative position along one axis. Finally, two different strategies were applied and compared for the labeling of inter-subject bundles: a matching with the Hungarian algorithm, and a well-known fiber clustering algorithm, called QuickBundles. RESULTS: Individual labeling was executed over four subjects, with an execution time of 3.6 min. An inspection of individual labeling based on a distance measure showed good correspondence among the four tested subjects. Two inter-subject labeling were successfully implemented and applied to 20 subjects and compared using a set of distance thresholds, ranging from a conservative value of 10 mm to a moderate value of 21 mm. Hungarian algorithm led to a high correspondence, but low reproducibility for all the thresholds, with 96 s of execution time. QuickBundles led to better correspondence, reproducibility and short execution time of 9 s.
Hence, the whole processing for the inter-subject labeling over 20 subjects takes 1.17 h. CONCLUSION: We implemented a method for the automatic labeling of short bundles in individuals, based on an intra-subject clustering and the connectivity of the clusters with the cortex. The labels provide useful information for the visualization and analysis of individual connections, which is very difficult without any additional information. Furthermore, we provide two fast inter-subject bundle labeling methods. The obtained clusters could be used for performing manual or automatic connectivity analysis in individuals or across subjects.
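The Hungarian matching step mentioned in the abstract can be illustrated with a small plain-Python sketch. It assumes a precomputed matrix of distances (in mm) between the clusters of two subjects and keeps only matched pairs that fall under a distance threshold. The function names `min_cost_assignment` and `match_clusters`, and the way the threshold is applied after matching, are assumptions for illustration; the abstract does not describe how the paper computes distances or applies its thresholds.

```python
def min_cost_assignment(cost):
    """Minimum-cost one-to-one assignment (Hungarian algorithm).

    `cost` is a non-empty rectangular list of lists. Returns sorted
    (row, col) pairs covering every index of the smaller dimension.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    if n_rows > n_cols:
        # The core loop needs rows <= cols, so solve the transpose.
        transposed = [list(col) for col in zip(*cost)]
        return sorted((r, c) for c, r in min_cost_assignment(transposed))

    inf = float("inf")
    u = [0.0] * (n_rows + 1)       # row potentials (1-based)
    v = [0.0] * (n_cols + 1)       # column potentials (1-based)
    match = [0] * (n_cols + 1)     # match[j] = row assigned to column j
    way = [0] * (n_cols + 1)       # back-pointers along the augmenting path
    for i in range(1, n_rows + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (n_cols + 1)
        used = [False] * (n_cols + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, n_cols + 1):
                if used[j]:
                    continue
                cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                if cur < minv[j]:
                    minv[j], way[j] = cur, j0
                if minv[j] < delta:
                    delta, j1 = minv[j], j
            # Shift potentials so the next-cheapest edge becomes tight.
            for j in range(n_cols + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:  # reached a free column
                break
        # Flip assignments back along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    return sorted((match[j] - 1, j - 1)
                  for j in range(1, n_cols + 1) if match[j] != 0)


def match_clusters(dist, max_dist):
    """Keep only the optimally matched cluster pairs within `max_dist`."""
    return [(r, c) for r, c in min_cost_assignment(dist)
            if dist[r][c] <= max_dist]
```

For example, `match_clusters(dist, 10.0)` would correspond to the conservative end of the 10 mm to 21 mm threshold range quoted in the abstract, under the assumptions stated above.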