Papers shared by user 颜林林.
127 shared papers found in total; this page shows items 81 - 100.
81.
颜林林
(2022-07-10 09:00):
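#paper doi:10.1109/TR.2022.3171220 IEEE Transactions on Reliability, 2022, Detecting C++ Compiler Front-End Bugs via Grammar Mutation and Differential Testing. This paper from Dalian University of Technology presents a software framework named Ccoft that automatically identifies bugs in the front-end of C++ compilers. A compiler's internals are usually divided along the processing pipeline into two parts, a front-end and a back-end: the front-end analyzes the C++ source code and translates it into an intermediate representation, and the back-end generates machine code from that representation. This paper targets only the front-end. The framework first converts the C++ grammar into a structured format, then uses mutation to generate large numbers of diverse C++ programs, both grammatically valid and invalid, with the aim of covering as many code scenarios as possible and checking whether the compiler handles each one as expected. The programs are then fed to the compilers, and the compilers' outputs are used to judge whether each program was handled correctly. This exposes several kinds of bugs: wrongly rejecting valid code, wrongly accepting invalid code, mishandling code semantics, crashing during compilation, and timing out on excessively long compilations. Testing the mainstream compilers GCC and Clang turned up 136 compiler bugs within three months, and Ccoft detected 135% and 111% more bugs than the state-of-the-art tools Dharma and Grammarinator, respectively.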
Abstract:
C++ is a widely used programming language and the C++ front-end is a critical part of a C++ compiler. Although many techniques have been proposed to test compilers, few studies are devoted to detecting bugs in C++ compiler. In this study, we take the first step to detect bugs in C++ compiler front-ends. To do so, two main challenges need to be addressed, namely, the acquisition of test programs that are more likely to trigger bugs in compiler front-ends and the bug identification from complicated compiler outputs. In this article, we propose a novel framework named Ccoft to detect bugs in C++ compiler front-ends. To address the first challenge, Ccoft implements a practical program generator. The generator first transforms C++ grammars into a flexible structured format and then utilizes an equal-chance selection (ECS) strategy to conduct structure-aware grammar mutation to generate diverse C++ programs. Next, Ccoft employs a set of differential testing strategies to identify various kinds of bugs in C++ compiler front-ends by comparing complex outputs emitted by C++ compilers, thus tackling the second challenge. Empirical evaluation results over two mainstream compilers (i.e., GCC and Clang) show that Ccoft greatly improves two state-of-the-art approaches (i.e., Dharma and Grammarinator) by 135% and 111% in terms of the numbers of detected bugs, respectively. By running Ccoft for three months, we have successfully reported 136 bugs for two C++ compilers, of which 78 (57 confirmed, assigned, or fixed) for GCC and 58 (10 confirmed or fixed) for Clang.
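The bug categories listed in the commentary (valid code rejected, invalid code accepted, crashes, timeouts) suggest a simple way to triage compiler outputs. Below is a minimal Python sketch, assuming each compiler run has already been reduced to one of four statuses; `CompileResult` and `classify` are hypothetical names used for illustration and are not part of Ccoft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompileResult:
    """Outcome of one compiler run on one test program (hypothetical record)."""
    compiler: str
    status: str  # one of: "accept", "reject", "crash", "timeout"


def classify(results):
    """Return a list of findings for one test program.

    A crash or timeout is suspicious on its own. An accept/reject split
    between compilers means at least one of them mishandled the program.
    """
    findings = []
    for r in results:
        if r.status in ("crash", "timeout"):
            findings.append(f"{r.compiler}: {r.status}")
    verdicts = {r.status for r in results if r.status in ("accept", "reject")}
    if len(verdicts) > 1:
        findings.append("accept/reject disagreement")
    return findings
```

A disagreement only shows that at least one compiler is wrong; deciding which one still requires the language standard or manual inspection.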
82.
颜林林
(2022-07-09 07:36):
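#paper doi:10.1186/s13073-022-01079-x Genome Medicine, 2022, Identification of a cytokine-dominated immunosuppressive class in squamous cell lung carcinoma with implications for immunotherapy resistance. This is a pure data-mining paper that tries to explain the mechanism of resistance to immune checkpoint inhibitors in lung squamous cell carcinoma (LUSC). The authors collected transcriptome data for 624 LUSC samples from TCGA and GEO, applied unsupervised clustering, identified expression patterns associated with T cell exhaustion signatures, immunosuppressive cells, clinical characteristics, and immunotherapy response, and defined an immunosuppressive group of patients called the exhausted immune class (EIC). These patients make up 28% to 36% of the cohort. Although they show a high density of tumor-infiltrating lymphocytes, they appear resistant to ICB, with significant enrichment of exhaustion signatures, a high fraction of immunosuppressive cells, and simultaneous upregulation of multiple immune checkpoint genes. The corresponding expression signature was also corroborated in melanoma patients resistant to ICB therapy. The paper further examined genomic and epigenomic data and found that these patients have a lower chromosomal alteration burden and a distinctive methylation pattern. The authors also built a website that integrates the data and analysis methods so that researchers can explore potential associations with ICB resistance using multi-omics data. Methodologically, the approach is fairly routine and hardly innovative, but integrating data for a single disease and packaging the analysis as a website so that others can generalize and reuse it is arguably one viable kind of "original" bioinformatics work.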
Abstract:
BACKGROUND: Immune checkpoint blockade (ICB) therapy has revolutionized the treatment of lung squamous cell carcinoma (LUSC). However, a significant proportion of patients with high tumour PD-L1 expression remain resistant to immune checkpoint inhibitors. To understand the underlying resistance mechanisms, characterization of the immunosuppressive tumour microenvironment and identification of biomarkers to predict resistance in patients are urgently needed. METHODS: Our study retrospectively analysed RNA sequencing data of 624 LUSC samples. We analysed gene expression patterns from tumour microenvironment by unsupervised clustering. We correlated the expression patterns with a set of T cell exhaustion signatures, immunosuppressive cells, clinical characteristics, and immunotherapeutic responses. Internal and external testing datasets were used to validate the presence of exhausted immune status. RESULTS: Approximately 28 to 36% of LUSC patients were found to exhibit significant enrichments of T cell exhaustion signatures, high fraction of immunosuppressive cells (M2 macrophage and CD4 Treg), co-upregulation of 9 inhibitory checkpoints (CTLA4, PDCD1, LAG3, BTLA, TIGIT, HAVCR2, IDO1, SIGLEC7, and VISTA), and enhanced expression of anti-inflammatory cytokines (e.g. TGFβ and CCL18). We defined this immunosuppressive group of patients as exhausted immune class (EIC). Although EIC showed a high density of tumour-infiltrating lymphocytes, these were associated with poor prognosis. EIC had relatively elevated PD-L1 expression, but showed potential resistance to ICB therapy. The signature of 167 genes for EIC prediction was significantly enriched in melanoma patients with ICB therapy resistance. EIC was characterized by a lower chromosomal alteration burden and a unique methylation pattern. We developed a web application (http://lilab2.sysu.edu.cn/tex & http://liwzlab.cn/tex) for researchers to further investigate potential association of ICB resistance based on our multi-omics analysis data. CONCLUSIONS: We introduced a novel LUSC immunosuppressive class which expressed high PD-L1 but showed potential resistance to ICB therapy. This comprehensive characterization of immunosuppressive tumour microenvironment in LUSC provided new insights for further exploration of resistance mechanisms and optimization of immunotherapy strategies.
83.
颜林林
(2022-07-08 07:19):
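#paper doi:10.1038/s41540-022-00233-w npj Systems Biology and Applications, 2022, Adaptive coding for DNA storage with high storage density and low coverage. Large-scale data storage built on biological macromolecules such as DNA is one of the directions I am personally interested in. Many excellent papers have appeared in this field in the last few years, probably driven by advances in high-throughput sequencing and related progress in synthetic biology. This paper from Dalian University of Technology is one such case. It proposes an adaptive-coding DNA storage system that applies different coding schemes to different coding regions, stores 698 KB of image, video, and PDF files in DNA, and then losslessly decodes them back into the original data. Compared with earlier work of this kind, the encoding accounts for DNA molecular properties and constraints in finer detail, so that it maintains base balance and avoids non-specific hybridization while tolerating sequencing errors at the lowest possible sequencing depth. The original content is split into fragments, each with an index segment attached, so that stored content can be read at random through specific amplification and sequencing. Regrettably, the paper offers only theoretical simulation and discussion and has not yet performed actual DNA synthesis and sequencing, which greatly weakens its persuasiveness.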
IF: 3.500 (Q1)
NPJ systems biology and applications,
2022-07-04.
DOI: 10.1038/s41540-022-00233-w
PMID: 35788589
Abstract:
The rapid development of information technology has generated substantial data, which urgently requires new storage media and storage methods. DNA, as a storage medium with high density, high durability, and ultra-long storage time characteristics, is promising as a potential solution. However, DNA storage is still in its infancy and suffers from low space utilization of DNA strands, high read coverage, and poor coding coupling. Therefore, in this work, an adaptive coding DNA storage system is proposed to use different coding schemes for different coding region locations, and the method of adaptively generating coding constraint thresholds is used to optimize at the system level to ensure the efficient operation of each link. Images, videos, and PDF files of size 698 KB were stored in DNA using adaptive coding algorithms. The data were sequenced and losslessly decoded into raw data. Compared with previous work, the DNA storage system implemented by adaptive coding proposed in this paper has high storage density and low read coverage, which promotes the development of carbon-based storage systems.
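The constraints mentioned in the commentary (base balance, avoiding problematic sequences) can be made concrete with a toy encoder. The sketch below is a deliberately naive illustration, not the paper's adaptive coding: it assumes a fixed 2-bits-per-base mapping, and the function names and the thresholds (`gc_range`, `max_run`) are hypothetical.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_dna; expects a length that is a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base in a non-empty sequence."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def satisfies_constraints(seq: str, gc_range=(0.4, 0.6), max_run=3) -> bool:
    """Check the two illustrative constraints: GC balance and run length."""
    if not seq:
        return False
    low, high = gc_range
    return low <= gc_content(seq) <= high and longest_homopolymer(seq) <= max_run
```

A fixed mapping like this will often violate the constraints (a file of zero bytes encodes to a long run of `A`), which is one reason practical schemes constrain or adapt the code rather than map bits to bases directly.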
84.
颜林林
(2022-07-07 07:41):
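#paper doi:10.1186/s13059-022-02699-7 Genome Biology, 2022, Storing and analyzing a genome on a blockchain. Years ago I heard many people talk about applying blockchain technology to genomics, but those efforts ended poorly and seemed to go nowhere. With the crypto market falling steadily and to much wailing, it is surprising and eye-catching that Genome Biology would publish a paper like this now. The paper comes from Yale, and both the full text and the source code are open access, so it is worth a careful read for anyone interested in blockchain technology. It envisions a network of sequencers, data owners, clinicians, and researchers, each participating in synchronizing VCFchain or SAMchain, which yields distributed data sharing, with data analysis interleaved with the extension of the chain. Storing huge genomic data in the limited extra bytes a blockchain offers does require some techniques, such as splitting the data and reassembling it at query time, and the paper does real work on that. Overall, though, it still feels like "blockchain for blockchain's sake". Permission management and tamper resistance may be the technology's distinguishing strengths, but the paper does not bring them out fully. That sets it apart from two blockchain-related papers shared earlier (DOIs 10.1038/s41591-022-01768-5 and 10.1038/s41586-021-03583-3, published in Nature Medicine and Nature, respectively), which focus more on AI algorithms and the value of data sharing, whereas this paper concentrates on the implementation details of the blockchain software. With this paper paving the way, perhaps similar work could really make a run at NBT (Nature Biotechnology) later.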
IF: 10.100 (Q1)
Genome biology,
2022-06-29.
DOI: 10.1186/s13059-022-02699-7
PMID: 35765079
PMCID: PMC9241283
Abstract:
There are major efforts underway to make genome sequencing a routine part of clinical practice. A critical barrier to these is achieving practical solutions for data ownership and integrity. Blockchain provides solutions to these challenges in other realms, such as finance. However, its use in genomics is stymied due to the difficulty in storing large-scale data on-chain, slow transaction speeds, and limitations on querying. To overcome these roadblocks, we developed a private blockchain network to store genomic variants and reference-aligned reads on-chain. It uses nested database indexing with an accompanying tool suite to rapidly access and analyze the data.
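The tamper-resistance property mentioned in the commentary can be illustrated with a toy hash chain. This is only a sketch of the integrity idea under simplifying assumptions (a single in-memory list, no network, no consensus, no access control); it is not how VCFchain or SAMchain are implemented, and all names and the variant records are hypothetical.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # assumed placeholder hash for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a payload (for example one variant record) to the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def is_valid(chain):
    """Recompute every hash; any edited payload or broken link fails."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each block's hash covers the previous hash, changing one stored variant invalidates that block and every block after it unless all of them are recomputed.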
85.
颜林林
(2022-07-06 00:02):
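#paper doi:10.1186/s12864-022-08717-z BMC Genomics, 2022, The effects of sequencing depth on the assembly of coding and noncoding transcripts in the human genome. It is well known that sequencing depth affects analysis results. How large the effect is and how it arises, however, depends on the research goal and the object of study, has to be analyzed case by case, and is worth studying in its own right. This paper systematically studies how sequencing depth affects transcript assembly from transcriptome data. It includes RNA-seq data from different cells and tissues across 150 human stem cell samples, covering short-read platforms as well as four PacBio long-read datasets. Two of the samples were sequenced to as many as 200M reads of NGS data, so different proportions of those reads could be drawn to simulate different sequencing yields. The results show a difference between coding and noncoding transcripts: detection of the former saturates quickly as depth increases, while the latter almost never reached saturation in the data analyzed. This may be related to how hard each class is to assemble. In addition, long reads help assemble transcripts that contain transposable elements. The most interesting part concerns single-cell RNA-seq (scRNA-seq): the low expression level of noncoding transcripts there is due to their being expressed in fewer cells, while within the cells that do express them their levels are similar to those of coding transcripts. This finding was made possible by long-read sequencing, so the paper concludes that long-read sequencing is better suited to scRNA-seq. Personally, I still suspect these conclusions may depend heavily on the analysis and evaluation methods, and the paper's analysis may be worth reproducing.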
Abstract:
Investigating the functions and activities of genes requires proper annotation of the transcribed units. However, transcript assembly efforts have produced a surprisingly large variation in the number of transcripts, and especially so for noncoding transcripts. This heterogeneity in assembled transcript sets might be partially explained by sequencing depth. Here, we used real and simulated short-read sequencing data as well as long-read data to systematically investigate the impact of sequencing depths on the accuracy of assembled transcripts. We assembled and analyzed transcripts from 671 human short-read data sets and four long-read data sets. At the first level, there is a positive correlation between the number of reads and the number of recovered transcripts. However, the effect of the sequencing depth varied based on cell or tissue type, the type of read and the nature and expression levels of the transcripts. The detection of coding transcripts saturated rapidly with both short and long-reads, however, there was no sign of early saturation for noncoding transcripts at any sequencing depth. Increasing long-read sequencing depth specifically benefited transcripts containing transposable elements. Finally, we show how single-cell RNA-seq can be guided by transcripts assembled from bulk long-read samples, and demonstrate that noncoding transcripts are expressed at similar levels to coding transcripts but are expressed in fewer cells. This study highlights the impact of sequencing depth on transcript assembly.
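The subsampling idea described above (drawing different proportions of a deeply sequenced sample to simulate lower depths) can be sketched as follows. This is a simplified stand-in, assuming each read is already assigned to a transcript ID and that a transcript counts as "detected" once it has at least `min_reads` reads; real transcript assembly is far more involved, and `saturation_curve` is a hypothetical helper.

```python
import random


def saturation_curve(read_to_transcript, fractions, min_reads=2, seed=0):
    """Count detected transcripts at increasing sequencing depth.

    read_to_transcript: one transcript ID per read.
    fractions: subsampling fractions in (0, 1].
    Returns a list of (fraction, detected_transcripts) in ascending order.
    """
    rng = random.Random(seed)
    reads = list(read_to_transcript)
    # Shuffle once and take prefixes, so each smaller subsample is nested
    # inside every larger one and the curve cannot decrease.
    rng.shuffle(reads)
    curve = []
    for frac in sorted(fractions):
        n = int(len(reads) * frac)
        counts = {}
        for transcript_id in reads[:n]:
            counts[transcript_id] = counts.get(transcript_id, 0) + 1
        detected = sum(1 for c in counts.values() if c >= min_reads)
        curve.append((frac, detected))
    return curve
```

In a curve like this, highly expressed transcripts are detected at small fractions, while transcripts with few reads only appear near full depth, which is the qualitative pattern the paper reports for coding versus noncoding transcripts.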
86.
颜林林
(2022-07-05 00:03):
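#paper doi:10.1093/database/baac049 Database, 2022, dbBIP: a comprehensive bipolar disorder database for genetic research. As the journal name suggests, this paper describes a database. Its subject is bipolar disorder (BIP, also known as manic depression). It integrates earlier large-scale omics data on the disease, including two array-based GWAS cohorts (PGC2 and PGC3, contributing 20,352 BIP cases with 31,358 controls and 41,917 BIP cases with 371,549 controls, respectively), WGS/WES data from several follow-up studies, and large-scale transcriptome sequencing (expression profile) data from brain tissue. Using a range of omics analysis methods, it provides functional annotation, linkage and association, protein-protein interaction, spatiotemporal expression patterns, and other information for these data. All of this is offered through a website with query and online analysis functions. This is a typical bioinformatics type of study and an effective way to open up a research direction in depth.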
Database : the journal of biological databases and curation,
2022-07-02.
DOI: 10.1093/database/baac049
PMID: 35779245
Abstract:
Bipolar disorder (BIP) is one of the most common hereditary psychiatric disorders worldwide. Elucidating the genetic basis of BIP will play a pivotal role in mechanistic delineation. Genome-wide association studies (GWAS) have successfully reported multiple susceptibility loci conferring BIP risk, thus providing insight into the effects of its underlying pathobiology. However, difficulties remain in the extrication of important and biologically relevant data from genetic discoveries related to psychiatric disorders such as BIP. There is an urgent need for an integrated and comprehensive online database with unified access to genetic and multi-omics data for in-depth data mining. Here, we developed the dbBIP, a database for BIP genetic research based on published data. The dbBIP consists of several modules, i.e.: (i) single nucleotide polymorphism (SNP) module, containing large-scale GWAS genetic summary statistics and functional annotation information relevant to risk variants; (ii) gene module, containing BIP-related candidate risk genes from various sources and (iii) analysis module, providing a simple and user-friendly interface to analyze one's own data. We also conducted extensive analyses, including functional SNP annotation, integration (including summary-data-based Mendelian randomization and transcriptome-wide association studies), co-expression, gene expression, tissue expression, protein-protein interaction and brain expression quantitative trait loci analyses, thus shedding light on the genetic causes of BIP. Finally, we developed a graphical browser with powerful search tools to facilitate data navigation and access. The dbBIP provides a comprehensive resource for BIP genetic research as well as an integrated analysis platform for researchers and can be accessed online at http://dbbip.xialab.info. Database URL: http://dbbip.xialab.info.
87.
颜林林
(2022-07-04 20:59):
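#paper doi:10.1038/s41467-022-31236-0 Nature Communications, 2022, A convolutional neural network highlights mutations relevant to antimicrobial resistance in Mycobacterium tuberculosis. This paper builds convolutional neural network (CNN) models on sequencing data from more than 20,000 Mycobacterium tuberculosis isolates. It takes 18 loci chosen from prior knowledge for their association with drug resistance and feeds the full sequence of each locus as input to predict resistance. The results show that the CNN models outperform other current methods based on traditional machine learning and on conventional non-convolutional neural networks. Moreover, because the deep learning approach extracts latent feature information from the sequences, it can effectively help predict how previously unseen mutations affect resistance.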
IF: 14.700 (Q1)
Nature communications,
2022-07-02.
DOI: 10.1038/s41467-022-31236-0
PMID: 35780211
PMCID: PMC9250494
Abstract:
Long diagnostic wait times hinder international efforts to address antibiotic resistance in M. tuberculosis. Pathogen whole genome sequencing, coupled with statistical and machine learning models, offers a promising solution. However, generalizability and clinical adoption have been limited by a lack of interpretability, especially in deep learning methods. Here, we present two deep convolutional neural networks that predict antibiotic resistance phenotypes of M. tuberculosis isolates: a multi-drug CNN (MD-CNN), that predicts resistance to 13 antibiotics based on 18 genomic loci, with AUCs 82.6-99.5% and higher sensitivity than state-of-the-art methods; and a set of 13 single-drug CNNs (SD-CNN) with AUCs 80.1-97.1% and higher specificity than the previous state-of-the-art. Using saliency methods to evaluate the contribution of input sequence features to the SD-CNN predictions, we identify 18 sites in the genome not previously associated with resistance. The CNN models permit functional variant discovery, biologically meaningful interpretation, and clinical applicability.
88.
颜林林
(2022-07-03 00:04):
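#paper doi:10.1002/ajmg.c.31987 American Journal of Medical Genetics, 2022, Genetic testing and glomerular hematuria - A nephrologist's perspective. This review describes how the diagnosis and early treatment of Alport syndrome (a hereditary nephritis) have evolved. The disease presents with hematuria that is not caused by acute trauma but is associated with chronic inflammation, and it is heritable. It was identified in 1920, but drug treatment was not reported until 2003 (before that, the only options were dialysis and kidney transplantation). Long-term accumulation of clinical cases and observational studies established the heritability of the disease and mapped it to three genes: COL4A3, COL4A4, and COL4A5. Because hematuria has many causes and Alport syndrome spans a spectrum of symptom severity, diagnosis also requires mutation testing of these three genes. Genetic testing first used Sanger (first-generation) sequencing and later moved to NGS (next-generation, or second-generation, sequencing); either way, high cost and other problems remain. From the clinical nephrologist's perspective, examining the morphology and other features of blood cells in urine under the microscope helps determine whether the hematuria is glomerular in origin, and this is weighed together with individual patient factors to decide between genetic testing and kidney biopsy. No testing method is perfect, and the methods need to complement one another to confirm the diagnosis. For example, in the NGS era, testing of the three genes can be done by whole-exome sequencing, which may reveal previously unreported variants in these genes whose pathogenicity is hard to judge, as well as variants in other genes related to the disease; interpreting these variants depends on support from genetic counselors. The clinical diagnosis and treatment pathway presented in the review, and its evolution, reflects the integrated use of all this information and how to order and combine the testing methods for the patient's benefit.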
Abstract:
Alport syndrome is an inherited disorder of the kidneys that results from variants in three collagen IV genes-COL4A3, COL4A4, and COL4A5. Early diagnosis and pharmacologic intervention can delay the progression of chronic kidney disease and the onset of kidney failure in patients with Alport syndrome. This article describes the evolution of approaches to the diagnosis and early treatment of Alport syndrome.
89.
颜林林
(2022-07-02 00:24):
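#paper doi:10.1186/s12859-022-04798-5 BMC Bioinformatics, 2022, DeepPN: a deep parallel neural network based on convolutional neural network and graph convolutional network for predicting RNA-protein binding sites. Identifying the binding sites of RNA-binding proteins (RBPs) is an important part of studying gene regulation. Traditionally, immunoprecipitation and similar methods are used for high-throughput screening and measurement, but experimental methods have many limitations, so many computational tools have been developed to predict RBP binding sites, most of them mathematical methods that work from sequence and structure information. Deep learning, which can automatically learn important and complex hidden features from data, is gradually being applied to this problem as well. For its deep learning component, this study adopts ChebNet, a type of graph convolutional network (GCN). ChebNet is a spectral graph convolution method, and recent studies have reported good results with it in areas such as fMRI and semantic image segmentation. The authors built a parallel deep neural network named DeepPN from a CNN and ChebNet and tested it on 24 real datasets, where it reportedly outperformed other methods of its kind, although the abstract claims only comparable performance. The gain is presumably because the method uses statistical frequencies to supplement the features.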
Abstract:
BACKGROUND: Addressing the laborious nature of traditional biological experiments by using an efficient computational approach to analyze RNA-binding proteins (RBPs) binding sites has always been a challenging task. RBPs play a vital role in post-transcriptional control. Identification of RBPs binding sites is a key step for the anatomy of the essential mechanism of gene regulation by controlling splicing, stability, localization and translation. Traditional methods for detecting RBPs binding sites are time-consuming and computationally-intensive. Recently, the computational method has been incorporated in researches of RBPs. Nevertheless, lots of them not only rely on the sequence data of RNA but also need additional data, for example the secondary structural data of RNA, to improve the performance of prediction, which needs the pre-work to prepare the learnable representation of structural data. RESULTS: To reduce the dependency of those pre-work, in this paper, we introduce DeepPN, a deep parallel neural network that is constructed with a convolutional neural network (CNN) and graph convolutional network (GCN) for detecting RBPs binding sites. It includes a two-layer CNN and GCN in parallel to extract the hidden features, followed by a fully connected layer to make the prediction. DeepPN discriminates the RBP binding sites on learnable representation of RNA sequences, which only uses the sequence data without using other data, for example the secondary or tertiary structure data of RNA. DeepPN is evaluated on 24 datasets of RBPs binding sites with other state-of-the-art methods. The results show that the performance of DeepPN is comparable to the published methods. CONCLUSION: The experimental results show that DeepPN can effectively capture potential hidden features in RBPs and use these features for effective prediction of binding sites.
90.
颜林林
(2022-07-01 07:57):
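#paper doi:10.1101/2022.06.27.497710 bioRxiv, 2022, PaliDIS: A tool for fast discovery of novel insertion sequences. This paper describes a bioinformatics tool; the corresponding author is at the Wellcome Sanger Institute. From metagenomic data, the tool finds sequences that share identical repeat fragments, aligns them to assembled microbial genomes, selects repeat pairs that lie linked on the same assembled sequence and are reverse complements of each other, and applies a series of quality-control filters. In this way it identifies mobile elements bounded by inverted repeats in microbial genomes, which supports research on antimicrobial resistance genes and their spread between bacterial species. Similar pipelines are not uncommon in human genome analysis, and they are mostly implemented directly from the genomic event and its sequence features, so the method itself is not particularly innovative. Applying it to a specific dataset in a specific setting (here, data from the HMP, the Human Microbiome Project) and then describing and analyzing the results statistically with respect to these mobile elements is, however, a feasible and common research recipe.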
bioRxiv,
2022.
DOI: 10.1101/2022.06.27.497710
Abstract:
The diversity of microbial insertion sequences, crucial mobile genetic elements in generating diversity in microbial genomes, needs to be better represented in current microbial databases. Identification of these sequences in microbiome communities presents some significant problems that have led to their underrepresentation. Here, we present a software tool called PaliDIS that recognises insertion sequences in metagenomic sequence data rapidly by identifying inverted terminal repeat regions from mixed microbial community genomes. Applying this software to 266 human metagenomes identifies 11,681 unique insertion sequences. Querying this catalogue against a large database of isolate genomes reveals evidence of horizontal gene transfer events of clinically relevant antimicrobial resistance genes between classes of bacteria. We will continue to apply this tool more widely, building the Insertion Sequence Catalogue, a valuable resource for researchers wishing to query their microbial genomes for insertion sequences.
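The core signal the tool looks for, inverted terminal repeats, can be shown on a single sequence. The sketch below checks only for an exact reverse-complement match between the two ends of a candidate element; it assumes uppercase A/C/G/T input and is not PaliDIS's algorithm, which works on reads and assemblies and would need to tolerate mismatches. Function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(seq: str, max_len: int = 50) -> int:
    """Length of the longest prefix whose reverse complement is the suffix.

    Base k from the start must pair with base k from the end, so the
    scan can stop at the first position where the pairing fails.
    """
    limit = min(max_len, len(seq) // 2)
    k = 0
    while k < limit and seq[k].translate(_COMPLEMENT) == seq[-1 - k]:
        k += 1
    return k
```

For a candidate such as `GGCTA` + payload + `TAGCC`, the function returns 5 because `TAGCC` is the reverse complement of `GGCTA`.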
91.
颜林林
(2022-06-30 00:17):
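#paper doi:10.1038/s41597-022-01450-y Scientific Data, 2022, HunCRC: annotated pathological slides to enhance deep learning applications in colorectal cancer screening. Scientific Data, a Nature-family journal, really is a treasure trove. This paper from Hungary shares a very useful dataset. The authors took 200 H&E-stained colorectal cancer tissue slides, scanned each whole slide at high resolution (40x), and had pathologists annotate them; image patches of multiple classes were then cut out for use in later pathology image analysis of colorectal cancer. Commendably, the whole process from sample collection to data processing is described in detail, and the data-processing code, the annotated original images, and the processed, class-labeled patches are all openly available for direct download.
Code:
https://github.com/qbeer/qupath-binarymask-extension
https://github.com/patbaa/crc_data_paper
Raw image data:
https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=91357370
Processed data:
https://figshare.com/articles/dataset/patches_and_local_annotations_slide_200_zoom_124x124_um2/19500266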
Abstract:
Histopathology is the gold standard method for staging and grading human tumors and provides critical information for the oncoteam's decision making. Highly-trained pathologists are needed for careful microscopic analysis of the slides produced from tissue taken from biopsy. This is a time-consuming process. A reliable decision support system would assist healthcare systems that often suffer from a shortage of pathologists. Recent advances in digital pathology allow for high-resolution digitalization of pathological slides. Digital slide scanners combined with modern computer vision models, such as convolutional neural networks, can help pathologists in their everyday work, resulting in shortened diagnosis times. In this study, 200 digital whole-slide images are published which were collected via hematoxylin-eosin stained colorectal biopsy. Alongside the whole-slide images, detailed region level annotations are also provided for ten relevant pathological classes. The 200 digital slides, after pre-processing, resulted in 101,389 patches. A single patch is a 512 × 512 pixel image, covering 248 × 248 μm tissue area. Versions at higher resolution are available as well. Hopefully, HunCRC, this widely accessible dataset will aid future colorectal cancer computer-aided diagnosis and research.
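The patch geometry quoted in the abstract (a 512 × 512 pixel patch covering 248 × 248 μm) implies a resolution of 248 / 512 = 0.484375 μm per pixel, and 101,389 patches from 200 slides is roughly 507 patches per slide on average. A short sketch of that arithmetic and of non-overlapping tiling follows; the tiling function is a hypothetical illustration and does not reproduce the authors' pre-processing, which also filters and labels patches.

```python
PATCH_PX = 512    # patch side in pixels, from the abstract
PATCH_UM = 248.0  # tissue covered by one patch side in micrometres, from the abstract


def microns_per_pixel() -> float:
    """Physical size of one pixel implied by the patch geometry."""
    return PATCH_UM / PATCH_PX


def tile_origins(width_px: int, height_px: int, patch_px: int = PATCH_PX):
    """Top-left corners of non-overlapping patches that fit fully inside the image."""
    return [
        (x, y)
        for y in range(0, height_px - patch_px + 1, patch_px)
        for x in range(0, width_px - patch_px + 1, patch_px)
    ]
```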
92.
颜林林
(2022-06-29 22:30):
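#paper doi:10.1002/humu.24424 Human Mutation, 2022, Screening of potential novel candidate genes in schwannomatosis patients. This paper studies schwannomatosis, a condition in which tumors form from the sheaths of peripheral nerves. The disease has a strong hereditary component, and patients are usually screened for germline variants in three genes: NF2, SMARCB1, and LZTR1. A considerable proportion of patients carry no pathogenic variant in any of the three, which suggests that other causal genes exist, and this paper looks for them. The study enrolled 75 unrelated patients who had screened negative for pathogenic variants in those three genes and widened the screen using NGS, MLPA, PCR plus Sanger sequencing, and other methods to cover DGCR8, COQ6, CDKN2A, and CDKN2B, candidate genes suggested by earlier reports. It identified a novel CDKN2A duplication in one patient with multiple schwannomas and melanomas; the authors conclude that none of these genes is a major contributor to schwannomatosis risk, though they may still play a role in more complex predisposition mechanisms. The results offer leads for later work on the disease's pathogenesis. The logic and methods here follow the standard recipe for expanding the set of causal genes for a hereditary disease.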
Abstract:
Schwannomatosis comprises a group of hereditary tumor predisposition syndromes characterized by, usually benign, multiple nerve sheath tumors, which frequently cause severe pain that does not typically respond to drug treatments. The most common schwannomatosis-associated gene is NF2, but SMARCB1 and LZTR1 are also associated. There are still many cases in which no pathogenic variants (PVs) have been identified, suggesting the existence of as yet unidentified genetic risk factors. In this study, we performed extended genetic screening of 75 unrelated schwannomatosis patients without identified germline PVs in NF2, LZTR1, or SMARCB1. Screening of the coding region of DGCR8, COQ6, CDKN2A, and CDKN2B was carried out, based on previous reports that point to these genes as potential candidate genes for schwannomatosis. Deletions or duplications in CDKN2A, CDKN2B, and adjacent chromosome 9 region were assessed by multiplex ligation-dependent probe amplification analysis. Sequencing analysis of a patient with multiple schwannomas and melanomas identified a novel duplication in the coding region of CDKN2A, disrupting both p14ARF and p16INK4a. Our results suggest that none of these genes are major contributors to schwannomatosis risk but the possibility remains that they may have a role in more complex mechanisms for tumor predisposition.
93.
颜林林
(2022-06-28 07:39):
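#paper doi:10.1101/2022.06.22.497216 bioRxiv, 2022, Intratumoral mregDC and CXCL13 T helper niches enable local differentiation of CD8 T cells following PD-1 blockade. This paper comes from the Icahn School of Medicine at Mount Sinai. Its patient cohort is drawn from a multicenter phase II clinical trial of preoperative neoadjuvant anti-PD-1 immunotherapy (cemiplimab) in non-small cell lung cancer (NSCLC), hepatocellular carcinoma (HCC), and head and neck squamous cell carcinoma (HNSCC) (NCT03916627; the trial began in 2019, is still ongoing, and is expected to finish in 2024). The paper covers only the HCC patients. On tissue sampled at surgery after neoadjuvant treatment, the authors performed TCR sequencing, whole-exome sequencing, single-cell transcriptome sequencing, multiplex immunohistochemistry, and other assays to look for specific cell populations associated with the efficacy of neoadjuvant treatment. Immunohistochemistry and immunofluorescence confirmed that, even among patients whose tumors were indeed rich in infiltrating T cells, some did not respond to the PD-1 drug. Comparing cell population composition between responders and non-responders revealed a combination of cell populations: dendritic cells enriched in maturation and regulatory molecules (mregDC, LAMP3+) and CXCL13+ CD4+ T helper cells. These associate with PD-1-high CD8+ T cell progenitors to form triads that drive the progenitors to become PD-1-high GZMK+ effector T cells. Without these two cell types, the progenitors instead become exhausted CD8+ T cells. This leads to the different outcomes of the neoadjuvant treatment. The study also adds new evidence toward uncovering the mechanisms of immunotherapy.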
bioRxiv,
2022.
DOI: 10.1101/2022.06.22.497216
Abstract:
Here, we leveraged a large neoadjuvant PD-1 blockade trial in patients with hepatocellular carcinoma (HCC) to search for correlates of response to immune checkpoint blockade (ICB) within T cell-rich tumors. We show that ICB response correlated with the clonal expansion of intratumoral CXCL13+ CH25H+ IL-21+ PD-1+ CD4 T helper cells (CXCL13+ Th) and Granzyme K+ PD-1+ effector-like CD8 T cells, whereas terminally exhausted CD39hi TOXhi PD-1hi CD8 T cells dominated in non-responders. Strikingly, most T cell receptor (TCR) clones that expanded post-treatment were found in pre-treatment biopsies. Notably, PD-1+ TCF-1+ progenitor-like CD8 T cells were present in tumors of responders and non-responders and shared clones mainly with effector-like cells in responders or terminally differentiated cells in non-responders, suggesting that local CD8 T cell differentiation occurs upon ICB. We found that these progenitor CD8 T cells interact with CXCL13+ Th cells within cellular triads around dendritic cells enriched in maturation and regulatory molecules, or "mregDC". Receptor-ligand analysis revealed unique interactions within these triads that may promote the differentiation of progenitor CD8 T cells into effector-like cells upon ICB. These results suggest that discrete intratumoral niches that include mregDC and CXCL13+ Th cells control the differentiation of tumor-specific progenitor CD8 T cell clones in patients treated with ICB.
94.
颜林林
(2022-06-27 00:24):
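#paper doi:10.3390/diagnostics12061493 Diagnostics, 2022, MixPatch: A New Method for Training Histopathology Image Classifiers. In pathology image analysis, a raw whole-slide image is too large (typically about 50,000 x 50,000 pixels) to feed directly into a DNN, so it is usually cut into a large number of patches that are analyzed one by one, for training or for prediction. Each patch is normally annotated by a pathologist with a clinical label, such as whether it is a malignant tumor region, and that label is usually a binary yes or no. In practice, however, many patches contain both benign and malignant regions, and such "uncertain" patches cause misjudgments and performance loss in the model. This study uses minimal patches (128 x 128 pixels, which pathologists regard as the smallest recognizable region) to obtain a "clean" gold-standard dataset. On that basis it merges adjacent minimal patches (typically 9 or 16, that is, 3 x 3 or 4 x 4) into a "mixed patch" and, from the labels of the original components, assigns a confidence estimate to the mixed patch's label. This is essentially a fuzzy-set idea. With this procedure, classification performance improved, and slide-level prediction also gave better results.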
Abstract:
CNN-based image processing has been actively applied to histopathological analysis to detect and classify cancerous tumors automatically. However, CNN-based classifiers generally predict a label with overconfidence, which becomes a serious problem in the medical domain. The objective of this study is to propose a new training method, called MixPatch, designed to improve a CNN-based classifier by specifically addressing the prediction uncertainty problem and examine its effectiveness in improving diagnosis performance in the context of histopathological image analysis. MixPatch generates and uses a new sub-training dataset, which consists of mixed-patches and their predefined ground-truth labels, for every single mini-batch. Mixed-patches are generated using a small size of clean patches confirmed by pathologists while their ground-truth labels are defined using a proportion-based soft labeling method. Our results obtained using a large histopathological image dataset shows that the proposed method performs better and alleviates overconfidence more effectively than any other method examined in the study. More specifically, our model showed 97.06% accuracy, an increase of 1.6% to 12.18%, while achieving 0.76% of expected calibration error, a decrease of 0.6% to 6.3%, over the other models. By specifically considering the mixed-region variation characteristics of histopathology images, MixPatch augments the extant mixed image methods for medical image analysis in which prediction uncertainty is a crucial issue. The proposed method provides a new way to systematically alleviate the overconfidence problem of CNN-based classifiers and improve their prediction accuracy, contributing toward more calibrated and reliable histopathology image analysis.
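The "proportion-based soft labeling" mentioned in the abstract can be illustrated with a few lines of Python. This sketch assumes the simplest reading, that a mixed patch's label is the fraction of its component clean patches in each class; the paper's exact rule may differ, and `soft_label` and the class names are hypothetical.

```python
def soft_label(patch_labels, classes=("benign", "malignant")):
    """Soft label for a mixed patch from the labels of its clean component patches.

    patch_labels: one class name per component patch (for example 9 for a 3 x 3 mix).
    Returns a dict mapping each class to its proportion among the components.
    """
    total = len(patch_labels)
    if total == 0:
        raise ValueError("a mixed patch needs at least one component patch")
    return {c: patch_labels.count(c) / total for c in classes}
```

A 3 x 3 mixed patch built from 6 benign and 3 malignant components would get the target `{"benign": 6/9, "malignant": 3/9}` instead of a hard 0-or-1 label, which discourages the classifier from being fully confident on mixed regions.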
95.
颜林林
(2022-06-26 22:13):
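#paper doi:10.1371/journal.pcbi.1009730 PLOS Computational Biology, 2022, Improved transcriptome assembly using a hybrid of long and short reads with StringTie. This paper from Johns Hopkins presents a tool that assembles transcriptomes from a mix of long-read and short-read sequencing data. Among high-throughput sequencing platforms, short-read platforms are highly accurate, but their reads are short and rarely cover a complete transcript. Long-read platforms can span multiple exons and help determine how a transcript is spliced, but their lower base accuracy easily causes alignment errors that interfere with determining the transcript. The paper shows the "noisy" alignments caused by sequencing errors and the resulting large increase in the search space. By solving a maximum flow problem from graph theory and using the more accurate short reads locally where the alignments are noisy, the tool determines the correct splice sites and so combines the strengths of both platforms, while running no slower than earlier tools that use a single data type. To evaluate the tool, the authors used simulated data as well as several real datasets from Arabidopsis thaliana, mouse, and human; in both assembly precision and the number of output transcripts that could be correctly annotated, it performed better, as expected.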
Abstract:
Short-read RNA sequencing and long-read RNA sequencing each have their strengths and weaknesses for transcriptome assembly. While short reads are highly accurate, they are rarely able to span multiple exons. Long-read technology can capture full-length transcripts, but its relatively high error rate often leads to mis-identified splice sites. Here we present a new release of StringTie that performs hybrid-read assembly. By taking advantage of the strengths of both long and short reads, hybrid-read assembly with StringTie is more accurate than long-read only or short-read only assembly, and on some datasets it can more than double the number of correctly assembled transcripts, while obtaining substantially higher precision than the long-read data assembly alone. Here we demonstrate the improved accuracy on simulated data and real data from Arabidopsis thaliana, Mus musculus, and human. We also show that hybrid-read assembly is more accurate than correcting long reads prior to assembly while also being substantially faster. StringTie is freely available as open source software at https://github.com/gpertea/stringtie.
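The reading note for this entry mentions a maximum-flow formulation from graph theory. As a rough illustration only (this is not StringTie's implementation), the sketch below runs the Edmonds-Karp algorithm on a toy splice graph. The node names and the read-coverage capacities are invented for the example.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths until none remain."""
    # Residual capacities, with reverse edges initialised to 0.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for a path that still has spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Walk back from the sink to collect the path edges.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        # Push the bottleneck amount and update the residual graph.
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        total += bottleneck


# Hypothetical splice graph: edge capacities stand in for read coverage.
splice_graph = {
    "source": {"exon1": 10},
    "exon1": {"exon2": 6, "exon3": 4},
    "exon2": {"exon3": 6},
    "exon3": {"sink": 9},
}
flow = max_flow(splice_graph, "source", "sink")
```

Here the edge from `exon3` to `sink` is the bottleneck, so the total flow is limited by its capacity of 9 even though 10 units enter the graph.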
96.
颜林林
(2022-06-25 20:26):
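#paper doi:10.3390/s22124409 Sensors, 2022, Deep Neural Networks Applied to Stock Market Sentiment Analysis. This paper from Portugal on applying deep learning reached me through a PubMed alert (PMID:35746192). It describes how to use deep neural networks to infer a sentiment class (positive or negative) from text posted on social platforms such as Twitter and Reddit, and then uses that sentiment in a simulated investment to estimate the return on investment. The paper offers little real novelty, but its careful walk through the principles, implementation and evaluation of the deep learning techniques makes it read somewhat like a tutorial. The stock market and investment material, by contrast, feels disconnected and tacked on, because the performance evaluation of the deep models still covers only sentiment classification. In the outlook at the end, the authors say they plan to introduce data streaming technology so that the analysis can run in real time, which may point to more suitable new applications.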
Abstract:
The volume of data is growing exponentially and becoming more valuable to organizations that collect it, from e-commerce data, shipping, audio and video logs, text messages, internet search queries, stock market activity, financial transactions, the Internet of Things, and various other sources. The major challenges are related with the way to extract insights from such a rich data environment and whether Deep Learning can be successful with Big Data. To get some insight on these topics, social network data are employed as a case study on how sentiments can affect decisions in stock market environments. In this paper, we propose a generalized Deep Learning-based classification framework for Stock Market Sentiment Analysis. This work comprises the study, the development, and implementation of an automatic classification system based on Deep Learning and the validation of its adequacy and efficiency in any scenario, particularly Stock Market Sentiment Analysis. Distinct datasets and several Deep Learning approaches with different layers and embedded techniques are used, and their performances are evaluated. These developments show how Deep Learning reacts to distinct contexts. The results also give context on how different techniques with different parameter combinations react to certain types of data. Convolution obtained the best results when dealing with complex data inputs, and long short-term layers kept a memory of data, allowing inputs which are not as common to still be considered for decisions. The models that resulted from Stock Market Sentiment Analysis datasets were applied with some success to real-life problems. The best models reached accuracies of 73% in training and 69% in certain test datasets. In a simulation, a model was able to provide a Return on Investment of 4.4%. The results contribute to understanding how to process Big Data efficiently using Deep Learning and specialized hardware techniques.
97.
颜林林
(2022-06-24 21:32):
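#paper doi:10.1038/s41587-022-01294-2 Nature Biotechnology, 2022, The clinical progress of mRNA vaccines and immunotherapies. This is a long review of mRNA vaccines. The idea of using mRNA as a vehicle for vaccine development dates back to 1990: it relies on the recipient's own protein translation machinery to produce the target protein, rather than directly injecting the (inactivated or attenuated) pathogen or the target protein itself. This brings a number of advantages, such as simple design, inherent immunogenicity and rapid scale-up of production. It also has drawbacks and challenges, such as poor stability and the difficulty of delivering the vaccine to its target site in the body. In the three years since the COVID-19 outbreak, greatly increased funding and emergency use authorizations have sharply accelerated the development and deployment of mRNA vaccines. The review covers these developments in detail, including delivery methods; the development, use and optimization of vaccines against infectious diseases; vaccine approaches to cancer treatment; and the use of mRNA in protein and cellular immunotherapies. On that basis it discusses current problems and future research directions. Reading it through gives a fairly deep understanding of mRNA vaccines and their technical approaches, and makes clear that this is an important technology platform with great potential that merits further exploration and development.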
Abstract:
The emergency use authorizations (EUAs) of two mRNA-based severe acute respiratory syndrome coronavirus (SARS-CoV)-2 vaccines approximately 11 months after publication of the viral sequence highlights the transformative potential of this nucleic acid technology. Most clinical applications of mRNA to date have focused on vaccines for infectious disease and cancer for which low doses, low protein expression and local delivery can be effective because of the inherent immunostimulatory properties of some mRNA species and formulations. In addition, work on mRNA-encoded protein or cellular immunotherapies has also begun, for which minimal immune stimulation, high protein expression in target cells and tissues, and the need for repeated administration have led to additional manufacturing and formulation challenges for clinical translation. Building on this momentum, the past year has seen clinical progress with second-generation coronavirus disease 2019 (COVID-19) vaccines, Omicron-specific boosters and vaccines against seasonal influenza, Epstein-Barr virus, human immunodeficiency virus (HIV) and cancer. Here we review the clinical progress of mRNA therapy as well as provide an overview and future outlook of the transformative technology behind these mRNA-based drugs.
98.
颜林林
(2022-06-23 07:02):
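#paper doi:10.1186/s12859-022-04768-x BMC Bioinformatics, 2022, Using BERT to identify drug-target interactions from whole PubMed. This paper uses the BERT model from natural language processing to batch-analyze the full PubMed and PMC databases, identifying drug and protein information in articles and extracting drug-target interaction (DTI) data, including key details such as the type of assay used. The method newly identified 600,000 articles, none of which were previously included in public DTI databases. Manual spot checks and cross-validation confirmed the method's accuracy (about 99%). Literature mining and curation of this kind of data usually depends on manual work, which severely limits efficiency. AI methods such as this one will support drug discovery and repurposing and help speed up drug development.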
Abstract:
BACKGROUND: Drug-target interactions (DTIs) are critical for drug repurposing and elucidation of drug mechanisms, and are manually curated by large databases, such as ChEMBL, BindingDB, DrugBank and DrugTargetCommons. However, the number of curated articles likely constitutes only a fraction of all the articles that contain experimentally determined DTIs. Finding such articles and extracting the experimental information is a challenging task, and there is a pressing need for systematic approaches to assist the curation of DTIs. To this end, we applied Bidirectional Encoder Representations from Transformers (BERT) to identify such articles. Because DTI data intimately depends on the type of assays used to generate it, we also aimed to incorporate functions to predict the assay format. RESULTS: Our novel method identified 0.6 million articles (along with drug and protein information) which are not previously included in public DTI databases. Using 10-fold cross-validation, we obtained ~99% accuracy for identifying articles containing quantitative drug-target profiles. The F1 micro for the prediction of assay format is 88%, which leaves room for improvement in future studies. CONCLUSION: The BERT model in this study is robust and the proposed pipeline can be used to identify previously overlooked articles containing quantitative DTIs. Overall, our method provides a significant advancement in machine-assisted DTI extraction and curation. We expect it to be a useful addition to drug mechanism discovery and repurposing.
99.
颜林林
(2022-06-22 00:43):
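#paper doi:10.1038/s41591-022-01768-5 Nature Medicine, 2022, Swarm learning for decentralized artificial intelligence in cancer histopathology. I recently read about swarm learning in a Nature paper (doi:10.1038/s41586-021-03583-3), which described a way to share clinical data without violating privacy regulations, in support of precision medicine research on diseases where heterogeneity is widespread. The present paper applies swarm learning to the analysis of tumor pathology images. Pathology image analysis is a typical field that depends on large, high-quality datasets, and swarm learning lets partner institutions train AI models jointly while avoiding data transfer and data monopoly. The authors trained a model on three colorectal cancer patient cohorts from Northern Ireland, Germany and the United States. The model analyzes H&E-stained slides to predict driver gene mutations, mismatch repair deficiency (dMMR) and microsatellite instability (MSI) status, and its performance was validated on two independent cohorts from the United Kingdom. The three data nodes (research centers) that train the model never exchange raw data directly. Instead, at each iteration step the model parameters are synchronized through decentralized blockchain technology. The data nodes are therefore peers, and no special central node is needed to aggregate the others. This setup makes it possible to extend the collaboration to a wider range of institutions in the future, which should further advance pathology image analysis models.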
IF: 58.700 (Q1)
Nature medicine,
2022-06.
DOI: 10.1038/s41591-022-01768-5
PMID: 35469069
PMCID:PMC9205774
Abstract:
Artificial intelligence (AI) can predict the presence of molecular alterations directly from routine histopathology slides. However, training robust AI systems requires large datasets for which data collection faces practical, ethical and legal obstacles. These obstacles could be overcome with swarm learning (SL), in which partners jointly train AI models while avoiding data transfer and monopolistic data governance. Here, we demonstrate the successful use of SL in large, multicentric datasets of gigapixel histopathology images from over 5,000 patients. We show that AI models trained using SL can predict BRAF mutational status and microsatellite instability directly from hematoxylin and eosin (H&E)-stained pathology slides of colorectal cancer. We trained AI models on three patient cohorts from Northern Ireland, Germany and the United States, and validated the prediction performance in two independent datasets from the United Kingdom. Our data show that SL-trained AI models outperform most locally trained models, and perform on par with models that are trained on the merged datasets. In addition, we show that SL-based AI models are data efficient. In the future, SL can be used to train distributed AI models for any histopathology image analysis task, eliminating the need for data transfer.
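The idea of peers that train locally and synchronize only model parameters can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's system: the model is a single weight fitted by gradient descent, the merge is a plain unweighted mean, the blockchain coordination layer is omitted entirely, and the three datasets are invented.

```python
def local_step(weight, data, lr=0.1):
    """One gradient-descent step for y = weight * x on a peer's private data."""
    grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    return weight - lr * grad


def swarm_round(weights, datasets):
    """Each peer trains locally, then every peer adopts the mean parameter."""
    updated = [local_step(w, d) for w, d in zip(weights, datasets)]
    merged = sum(updated) / len(updated)  # only parameters are shared, never data
    return [merged] * len(updated)


# Hypothetical private datasets held by three peers, all following y = 3x.
datasets = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5)],
    [(1.0, 3.0), (3.0, 9.0)],
]
weights = [0.0, 0.0, 0.0]
for _ in range(50):
    weights = swarm_round(weights, datasets)
```

No peer plays a special aggregating role here: every peer ends each round holding the same merged weight, which approaches the shared slope of 3 without any raw data leaving its owner.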
100.
颜林林
(2022-06-21 00:03):
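#paper doi:10.1016/j.jmoldx.2022.05.003 The Journal of Molecular Diagnostics, 2022, Comprehensive Validation of Diagnostic Next-Generation Sequencing Panels for Acute Myeloid Leukemia (AML) Patients. This paper from Switzerland and Germany covers the validation of gene testing panels for hematological malignancies. Cancer is generally regarded as a genetic disease, that is, a disease caused by mutations in genetic material, so diagnosis and treatment decisions often call for testing specific genes. In clinical practice this can be done by using a panel to enrich specific DNA fragments for sequencing, which is also the basic model behind current commercial cancer gene testing services. Before such a test can reach the market, it must be thoroughly validated, and this paper is a worked example of that process. The panels validated here are intended for diagnosing AML (acute myeloid leukemia). The validation included 33 DNA samples (bone marrow or peripheral blood) from 26 AML patients, with Acrometrix Oncology Hotspot Control DNA as the control. The AML-related mutations carried by these samples were tested and the detection performance was evaluated. Mutations in the clinical samples were also confirmed with methods such as qPCR and Sanger sequencing. From four different panels and several analysis software packages, the evaluation selected the panel and software combination that performs best for hematological diseases.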
The Journal of molecular diagnostics : JMD,
2022-08.
DOI: 10.1016/j.jmoldx.2022.05.003
PMID: 35718092
Abstract:
Next-generation sequencing has greatly advanced the molecular diagnostics of malignant hematological diseases and provides useful information for clinical decision making. Studies have shown that certain mutations are associated with prognosis and have a direct impact on treatment of affected patients. Therefore, reliable detection of pathogenic variants is critically important. Here, we compared four sequencing panels with different characteristics, from number of genes covered to technical aspects of library preparation and data analysis workflows, to find the panel with the best clinical utility for myeloid neoplasms with a special focus on acute myeloid leukemia. Using the Acrometrix Oncology Hotspot Control DNA and DNA from acute myeloid leukemia patients, panel performance was evaluated in terms of coverage, precision, recall, and reproducibility and different bioinformatics tools that can be used for the evaluation of any next-generation sequencing panel were tested. Taken together, our results support the reliability of the Acrometrix Oncology Hotspot Control to validate and compare sequencing panels for hematological diseases and show which panel-software combination (platform) has the best performance.
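Precision and recall, two of the metrics named in the abstract, can be computed by comparing a panel's variant calls with a reference set of known variants. The sketch below is a generic illustration and does not reproduce the paper's workflow. The variant tuples (chromosome, position, reference allele, alternate allele) are hypothetical placeholders, not real AML mutations.

```python
def precision_recall(called, truth):
    """Compare called variants against a truth set of known variants."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical control variants and panel calls: (chrom, pos, ref, alt).
truth = [
    ("chrA", 100, "A", "T"),
    ("chrA", 250, "G", "C"),
    ("chrB", 75, "C", "T"),
    ("chrB", 310, "T", "G"),
    ("chrC", 42, "G", "A"),
]
called = [
    ("chrA", 100, "A", "T"),
    ("chrA", 250, "G", "C"),
    ("chrB", 75, "C", "T"),
    ("chrC", 999, "A", "G"),  # a false positive absent from the truth set
]
precision, recall = precision_recall(called, truth)
```

With three of four calls correct and three of five known variants found, precision is 0.75 and recall is 0.6. A control sample with known variants, such as the one the abstract describes, is what makes the truth set available.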