小擎子
(2023-04-30 23:14):
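#paper doi: 10.1038/s41592-021-01141-3 Nat Methods, 2021, Challenges in Benchmarking Metagenomic Profilers. The paper raises a problem that comes up in metagenomic research: when computing relative abundance, different bioinformatics tools report different quantities. Some tools report sequence abundance (DNA-to-DNA), while others report taxonomic abundance (DNA-to-marker). The difference between the two is whether each taxon's genome size is taken into account. Sequence abundance does not account for genome size (e.g., Kraken). The authors argue that results based on taxonomic abundance (i.e., accounting for genome size) are more interpretable, and they recommend interpreting metagenomic analysis results with care, especially conclusions drawn from sequence abundance.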
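To make the distinction concrete, here is a minimal sketch of how a sequence-abundance profile could be turned into a taxonomic-abundance profile by correcting for genome size. The function name, species labels, and genome sizes are all hypothetical and chosen for illustration only; this is not the paper's pipeline, and it ignores factors such as ploidy, marker copy number, and unclassified reads.

```python
def sequence_to_taxonomic_abundance(seq_abundance, genome_size_bp):
    """Convert relative sequence abundance to relative taxonomic abundance.

    Assumption for this sketch: the share of reads assigned to a taxon is
    proportional to (number of cells) x (genome length), so dividing by
    genome length and renormalizing recovers the share of cells.
    """
    # Weight each taxon's read share by the inverse of its genome length.
    weighted = {
        taxon: share / genome_size_bp[taxon]
        for taxon, share in seq_abundance.items()
    }
    total = sum(weighted.values())
    # Renormalize so the taxonomic abundances sum to 1.
    return {taxon: w / total for taxon, w in weighted.items()}


# Hypothetical two-species community: both species receive half of the reads,
# but species_B has a genome three times as long as species_A.
seq = {"species_A": 0.5, "species_B": 0.5}
sizes = {"species_A": 2_000_000, "species_B": 6_000_000}

tax = sequence_to_taxonomic_abundance(seq, sizes)
for taxon, share in tax.items():
    print(f"{taxon}: {share:.2f}")
```

In this toy case the two species look equally abundant by reads, yet species_A accounts for roughly three quarters of the cells once genome size is considered, which is the kind of discrepancy the paper warns about.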
Challenges in benchmarking metagenomic profilers
Abstract:
Accurate microbial identification and abundance estimation are crucial for metagenomics analysis. Various methods for classification of metagenomic data and estimation of taxonomic profiles, broadly referred to as metagenomic profilers, have been developed. Nevertheless, benchmarking of metagenomic profilers remains challenging because some tools are designed to report relative sequence abundance while others report relative taxonomic abundance. Here we show how misleading conclusions can be drawn by neglecting this distinction between relative abundance types when benchmarking metagenomic profilers. Moreover, we show compelling evidence that interchanging sequence abundance and taxonomic abundance will influence both per-sample summary statistics and cross-sample comparisons. We suggest that the microbiome research community pay attention to potentially misleading biological conclusions arising from this issue when benchmarking metagenomic profilers, by carefully considering the type of abundance data that were analyzed and interpreted and clearly stating the strategy used for metagenomic profiling.