Vincent
(2022-11-30 19:09):
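#paper https://doi.org/10.1038/s41467-020-15298-6 Nature Communications, 2020, Integrative differential expression and gene set enrichment analysis using summary statistics for scRNA-seq studies. Differential expression (DE) analysis and gene set enrichment (GSE) analysis are the two most commonly used analyses in the single-cell field, but they are usually carried out independently. Because single-cell data are noisy, analyzing them separately reduces statistical power and leads to inconsistent results across data sets (or across different methods applied to the same data). At the same time, the two analyses are closely linked: DE results are the basis for GSE analysis, and GSE analysis can in turn inform DE analysis (genes are not independent; if one gene is differentially expressed, related genes may be too). Combining the two in a single analysis can therefore improve statistical power and make the results more robust and reproducible. This paper proposes a new method, iDEA, which uses a hierarchical Bayesian model to integrate DE and GSE analysis. Through simulations and real-data analyses, the authors find that iDEA has higher statistical power than existing DE or GSE methods, gives more consistent DE results, and yields more accurate GSE conclusions.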
Integrative differential expression and gene set enrichment analysis using summary statistics for scRNA-seq studies
Abstract:
Differential expression (DE) analysis and gene set enrichment (GSE) analysis are commonly applied in single cell RNA sequencing (scRNA-seq) studies. Here, we develop an integrative and scalable computational method, iDEA, to perform joint DE and GSE analysis through a hierarchical Bayesian framework. By integrating DE and GSE analyses, iDEA can improve the power and consistency of DE analysis and the accuracy of GSE analysis. Importantly, iDEA uses only DE summary statistics as input, enabling effective data modeling through complementing and pairing with various existing DE methods. We illustrate the benefits of iDEA with extensive simulations. We also apply iDEA to analyze three scRNA-seq data sets, where iDEA achieves up to five-fold power gain over existing GSE methods and up to 64% power gain over existing DE methods. The power gain brought by iDEA allows us to identify many pathways that would not be identified by existing approaches in these data.
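To make the "gene set membership informs DE calls" idea concrete, here is a toy Python sketch. It is not iDEA's model or code: the paper's actual likelihood, priors, and inference procedure are not described in this post and are not reproduced here. The sketch assumes a simple spike-and-slab setup on a per-gene summary statistic (effect estimate `beta_hat` with standard error `se`), with the prior probability of being DE tied to gene set membership through a logistic link. The function names and the fixed hyperparameters `tau0`, `tau1`, and `slab_var` are all hypothetical, chosen for illustration only.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean normal with variance `var`, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def posterior_de_prob(beta_hat, se, in_gene_set,
                      tau0=-2.0, tau1=1.0, slab_var=1.0):
    """Toy posterior probability that a gene is DE, given summary statistics.

    Assumptions (illustrative, not taken from the paper):
      - non-DE genes have true effect 0, so beta_hat ~ N(0, se^2)
      - DE genes have true effect ~ N(0, slab_var), so beta_hat ~ N(0, se^2 + slab_var)
      - prior P(DE) = sigmoid(tau0 + tau1 * in_gene_set), with fixed tau0, tau1
    """
    # Prior probability of DE depends on gene set membership (0 or 1).
    prior = 1.0 / (1.0 + math.exp(-(tau0 + tau1 * in_gene_set)))
    # Marginal likelihood of the observed effect under each hypothesis.
    lik_de = normal_pdf(beta_hat, se ** 2 + slab_var)
    lik_null = normal_pdf(beta_hat, se ** 2)
    # Bayes' rule.
    return prior * lik_de / (prior * lik_de + (1.0 - prior) * lik_null)


# Same summary statistic, different gene set membership.
p_in = posterior_de_prob(beta_hat=1.0, se=0.5, in_gene_set=1)
p_out = posterior_de_prob(beta_hat=1.0, se=0.5, in_gene_set=0)
print(p_in, p_out)
```

With a positive `tau1`, the same summary statistic gets a higher posterior DE probability when the gene belongs to the set, which is the borrowing of information the summary above describes. In a joint analysis the enrichment coefficient would be estimated from the data rather than fixed as it is here.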