思考问题的熊
(2022-03-20 16:35):
#paper Li, Yumei, Xinzhou Ge, Fanglue Peng, Wei Li, and Jingyi Jessica Li. “Exaggerated False Positives by Popular Differential Expression Methods When Analyzing Human Population Samples.” Genome Biology 23, no. 1 (March 15, 2022): 79. https://doi.org/10.1186/s13059-022-02648-4.
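A paper published in Genome Biology a few days ago makes a fairly rigorous case that, for RNA-seq differential expression analysis with large sample sizes, we should from now on drop DESeq2 and edgeR in favor of the plain, unglamorous Wilcoxon rank-sum test, even leaving speed out of the picture.
I have written up the details in a separate post; take a look if you are interested.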
IF: 10.100, Q1
Genome Biology,
2022-03-15.
DOI: 10.1186/s13059-022-02648-4
PMID: 35292087
PMCID: PMC8922736
Exaggerated false positives by popular differential expression methods when analyzing human population samples
Abstract:
When identifying differentially expressed genes between two conditions using human population RNA-seq samples, we found a phenomenon by permutation analysis: two popular bioinformatics methods, DESeq2 and edgeR, have unexpectedly high false discovery rates. Expanding the analysis to limma-voom, NOISeq, dearseq, and Wilcoxon rank-sum test, we found that FDR control is often failed except for the Wilcoxon rank-sum test. Particularly, the actual FDRs of DESeq2 and edgeR sometimes exceed 20% when the target FDR is 5%. Based on these results, for population-level RNA-seq studies with large sample sizes, we recommend the Wilcoxon rank-sum test.
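To make the idea in the abstract concrete, here is a rough sketch of the two ingredients: a per-gene Wilcoxon rank-sum test followed by Benjamini-Hochberg adjustment, and a label-permutation check in which every gene called after shuffling is by construction a false positive. This is not the authors' code. The function names, the normal approximation without continuity correction, and the toy data are all assumptions made for illustration; in practice one would more likely reach for R's `wilcox.test` or `scipy.stats.mannwhitneyu`.

```python
import random
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_pvalue(x, y):
    """Two-sided rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2  # Mann-Whitney U for group x
    tie_counts = {}
    for v in combined:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:  # all values identical: no evidence either way
        return 1.0
    z = (u - n1 * n2 / 2) / var_u ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_de_genes(expr, labels, target_fdr=0.05):
    """expr: one list of per-sample values per gene; labels: 0/1 per sample."""
    pvals = []
    for gene in expr:
        x = [v for v, g in zip(gene, labels) if g == 0]
        y = [v for v, g in zip(gene, labels) if g == 1]
        pvals.append(rank_sum_pvalue(x, y))
    adj = benjamini_hochberg(pvals)
    return [i for i, q in enumerate(adj) if q <= target_fdr]


def permuted_discoveries(expr, labels, n_perm=20, seed=0):
    """Count genes called after shuffling the condition labels."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_perm):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        counts.append(len(call_de_genes(expr, shuffled)))
    return counts


if __name__ == "__main__":
    # Toy data (hypothetical): gene 0 differs between groups, genes 1-2 do not.
    labels = [0] * 20 + [1] * 20
    expr = [
        list(range(40)),
        [i % 2 for i in range(40)],
        [(i * 7) % 20 for i in range(40)],
    ]
    print("called genes:", call_de_genes(expr, labels))
    print("calls per permutation:", permuted_discoveries(expr, labels, n_perm=5))
```

Running the same permutation check with a different caller swapped in for `call_de_genes` is the spirit of the comparison the paper reports: a method that still calls many genes once the labels are shuffled is not controlling FDR at the stated level.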