Papers from the journal BMC Genomics.
4 shared papers found.
1.
钟鸣 (2022-10-31 17:38):
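#paper doi:10.1186/s12864-022-08678-3 BMC Genomics, 2022, From a large-scale genomic analysis of insertion sequences to insights into their regulatory roles in prokaryotes. Insertion sequences (ISs) are mobile foreign sequences that frequently insert into prokaryotic genomes. What effects do these insertions have? This paper explores the question through a large-scale comparative genomic analysis. The authors identified 612,700 IS insertions across 8,481 genomes. Besides classifying and describing these ISs and the genome categories, they focused on the insertion-site preferences of ISs and their functional impact on the genome. They found that ISs commonly insert on either side of genes whose functions relate to transcriptional regulation and transport activity, thereby affecting the host phenotype. That ISs affect host phenotype is nothing new; this study confirms it on a much broader scale and deepens our understanding of ISs. I look forward to seeing more insights and new findings in this field.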
IF: 3.500, Q2. BMC Genomics, 2022-Jun-20. DOI: 10.1186/s12864-022-08678-3 PMID: 35725380
Abstract:
BACKGROUND: Insertion sequences (ISs) are mobile repeat sequences and most of them can copy themselves to new host genome locations, leading to genome plasticity and gene regulation in prokaryotes. In this study, we present functional and evolutionary relationships between IS and neighboring genes in a large-scale comparative genomic analysis. RESULTS: IS families were located in all prokaryotic phyla, with preferential occurrence of IS3, IS4, IS481, and IS5 families in Alpha-, Beta-, and Gammaproteobacteria, Actinobacteria and Firmicutes as well as in eukaryote host-associated organisms and autotrophic opportunistic pathogens. We defined the concept of the IS-Gene couple (IG), which allowed to highlight the functional and regulatory impacts of an IS on the closest gene. Genes involved in transcriptional regulation and transport activities were found overrepresented in IG. In particular, major facilitator superfamily (MFS) transporters, ATP-binding proteins and transposases raised as favorite neighboring gene functions of IS hotspots. Then, evolutionary conserved IS-Gene sets across taxonomic lineages enabled the classification of IS-gene couples into phylum, class-to-genus, and species syntenic IS-Gene couples. The IS5, IS21, IS4, IS607, IS91, ISL3 and IS200 families displayed two to four times more ISs in the phylum and/or class-to-genus syntenic IGs compared to other IS families. This indicates that those families were probably inserted earlier than others and then subjected to horizontal transfer, transposition and deletion events over time. In phylum syntenic IG category, Betaproteobacteria, Crenarchaeota, Calditrichae, Planctomycetes, Acidithiobacillia and Cyanobacteria phyla act as IS reservoirs for other phyla, and neighboring gene functions are mostly related to transcriptional regulators. Comparison of IS occurrences with predicted regulatory motifs led to ~26.5% of motif-containing ISs with 2 motifs per IS in average. These results, concomitantly with short IS-Gene distances, suggest that those ISs would interfere with the expression of neighboring genes and thus form strong candidates for an adaptive pairing. CONCLUSIONS: All together, our large-scale study provide new insights into the IS genetic context and strongly suggest their regulatory roles.
2.
颜林林 (2022-07-24 05:55):
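#paper doi:10.1186/s12864-022-08762-8 BMC Genomics, 2022, Poly(a) selection introduces bias and undue noise in direct RNA-sequencing. In whole-transcriptome sequencing experiments, poly(A) selection is often applied after the initial RNA extraction step to enrich for mRNA. This paper performs direct RNA sequencing on the ONT platform, processing the same sample in parallel with and without poly(A) selection. The results indicate that omitting this step is appropriate: although doing so may slightly reduce library complexity, it more effectively avoids the other drawbacks of the selection step, such as requiring more input RNA, preferentially selecting mRNAs with longer poly(A) tails, and making differentially expressed gene calls less stable.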
IF: 3.500, Q2. BMC Genomics, 2022-Jul-22. DOI: 10.1186/s12864-022-08762-8 PMID: 35869428
Abstract:
BACKGROUND: Genome-wide RNA-sequencing technologies are increasingly critical to a wide variety of diagnostic and research applications. RNA-seq users often first enrich for mRNA, with the most popular enrichment method being poly(A) selection. In many applications it is well-known that poly(A) selection biases the view of the transcriptome by selecting for longer tailed mRNA species. RESULTS: Here, we show that poly(A) selection biases Oxford Nanopore direct RNA sequencing. As expected, poly(A) selection skews sequenced mRNAs toward longer poly(A) tail lengths. Interestingly, we identify a population of mRNAs (>10% of genes' mRNAs) that are inconsistently captured by poly(A) selection due to highly variable poly(A) tails, and demonstrate this phenomenon in our hands and in published data. Importantly, we show poly(A) selection is dispensable for Oxford Nanopore's direct RNA-seq technique, and demonstrate successful library construction without poly(A) selection, with decreased input, and without loss of quality. CONCLUSIONS: Our work expands the utility of direct RNA-seq by validating the use of total RNA as input, and demonstrates important technical artifacts from poly(A) selection that inconsistently skew mRNA expression and poly(A) tail length measurements.
3.
颜林林 (2022-07-06 00:02):
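#paper doi:10.1186/s12864-022-08717-z BMC Genomics, 2022, The effects of sequencing depth on the assembly of coding and noncoding transcripts in the human genome. It is well known that sequencing depth affects the results of data analysis. How large the effect is and how it arises, however, usually depend on the research goal and the subject under study; this needs case-by-case analysis and is worth studying in its own right. This paper systematically investigates how sequencing depth affects transcript assembly from transcriptome data. It draws on RNA-seq data from 150 human stem cell samples covering different cells and tissues and, in addition to short-read platforms, includes four PacBio long-read datasets. Two of the samples were also sequenced to as many as 200M NGS reads, so they could be subsampled at different proportions to simulate different sequencing yields. The analysis shows that coding and noncoding transcripts behave differently: the former saturate quickly as sequencing depth increases, while the latter almost never reach saturation in the data analyzed. This may be related to how difficult each type is to assemble. In addition, long-read information helps assemble transcripts that contain transposable elements. The more interesting part concerns single-cell RNA-seq (scRNA-seq): the low expression level of noncoding transcripts there is due to fewer cells expressing them, and within the cells that do express them, their expression levels are in fact similar to those of coding transcripts. This observation was made possible by long-read sequencing, so the paper concludes that long-read sequencing is better suited to scRNA-seq. Personally, though, I still somewhat suspect that these conclusions depend on the analysis and evaluation methods, and it may be worth reproducing the paper's analysis.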
IF: 3.500, Q2. BMC Genomics, 2022-Jul-04. DOI: 10.1186/s12864-022-08717-z PMID: 35787153
Abstract:
Investigating the functions and activities of genes requires proper annotation of the transcribed units. However, transcript assembly efforts have produced a surprisingly large variation in the number of transcripts, and especially so for noncoding transcripts. This heterogeneity in assembled transcript sets might be partially explained by sequencing depth. Here, we used real and simulated short-read sequencing data as well as long-read data to systematically investigate the impact of sequencing depths on the accuracy of assembled transcripts. We assembled and analyzed transcripts from 671 human short-read data sets and four long-read data sets. At the first level, there is a positive correlation between the number of reads and the number of recovered transcripts. However, the effect of the sequencing depth varied based on cell or tissue type, the type of read and the nature and expression levels of the transcripts. The detection of coding transcripts saturated rapidly with both short and long-reads, however, there was no sign of early saturation for noncoding transcripts at any sequencing depth. Increasing long-read sequencing depth specifically benefited transcripts containing transposable elements. Finally, we show how single-cell RNA-seq can be guided by transcripts assembled from bulk long-read samples, and demonstrate that noncoding transcripts are expressed at similar levels to coding transcripts but are expressed in fewer cells. This study highlights the impact of sequencing depth on transcript assembly.
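The subsampling idea behind this kind of depth analysis can be sketched in a few lines. This is only a toy illustration, not the paper's pipeline: it assumes each read has already been assigned to a transcript ID, and the function name `saturation_curve`, the transcript labels, and the read counts are all made up for the example. The reads are shuffled once and nested prefixes are taken, so the number of detected transcripts can only grow with depth.

```python
import random


def saturation_curve(read_to_transcript, fractions, seed=0):
    """Count distinct transcripts detected at each subsampling fraction.

    read_to_transcript: iterable with one transcript ID per read (hypothetical input).
    fractions: subsampling fractions between 0 and 1.
    """
    reads = list(read_to_transcript)
    # Shuffle once so that each smaller subsample is a subset of the larger ones.
    random.Random(seed).shuffle(reads)
    curve = []
    for frac in sorted(fractions):
        n_reads = int(len(reads) * frac)
        n_detected = len(set(reads[:n_reads]))
        curve.append((frac, n_detected))
    return curve


if __name__ == "__main__":
    # Made-up toy data: two abundant "coding" and two rare "noncoding" transcripts.
    toy_reads = ["coding_A"] * 50 + ["coding_B"] * 45 + ["lnc_X"] * 4 + ["lnc_Y"] * 1
    for frac, n_detected in saturation_curve(toy_reads, [0.1, 0.25, 0.5, 1.0]):
        print(f"{frac:.2f} of reads -> {n_detected} transcripts detected")
```

Rare transcripts tend to appear only at higher fractions, which is the intuition behind noncoding transcripts saturating later; a real analysis would rerun assembly at each depth rather than count pre-assigned reads.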
4.
颜林林 (2022-06-05 06:41):
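#paper doi:10.1186/s12864-022-08435-6 BMC Genomics, 2022, Frameshift and wild-type proteins are often highly similar because the genetic code and genomes were optimized for frameshift tolerance. In genomics and molecular evolution research, we usually assume that a frameshift mutation (an indel whose length is not a multiple of 3) causes complete loss of protein function, and discussion of the codon table's error tolerance is usually confined to the third base. Methodologically, this paper is a very classic, purely bioinformatic study: using public data and sequence analysis methods, it analyzes the properties of frameshift mutations and finds and verifies that frameshifted proteins retain some similarity to their wild-type counterparts in sequence, in amino acid physicochemical properties, and in other respects. This retained similarity differs significantly from what completely random mutations would produce, thereby demonstrating that the genome sequences of various species, as well as the codon table, have been "optimized" for frameshift tolerance. This offers a new angle and line of thinking on how the codon table evolved. Today, with a wealth of public biological sequence data available, the study makes full use of that data, starts from a small number of simple and reasonable assumptions, and applies basic bioinformatics techniques such as sequence analysis, together with the corresponding statistical tests, to answer fundamental biological questions. It is done carefully and solidly, and I personally like this kind of research very much.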
IF: 3.500, Q2. BMC Genomics, 2022-Jun-02. DOI: 10.1186/s12864-022-08435-6 PMID: 35655139
Abstract:
Frameshift mutations have been considered of significant importance for the molecular evolution of proteins and their coding genes, while frameshift protein sequences encoded in the alternative reading frames of coding genes have been considered to be meaningless. However, functional frameshifts have been found widely existing. It was puzzling how a frameshift protein kept its structure and functionality while substantial changes occurred in its primary amino-acid sequence. This study shows that the similarities among frameshifts and wild types are higher than random similarities and are determined at different levels. Frameshift substitutions are more conservative than random substitutions in the standard genetic code (SGC). The frameshift substitutions score of SGC ranks in the top 2.0-3.5% of alternative genetic codes, showing that SGC is nearly optimal for frameshift tolerance. In many genes and certain genomes, frameshift-resistant codons and codon pairs appear more frequently than expected, suggesting that frameshift tolerance is achieved through not only the optimality of the genetic code but, more importantly, the further optimization of a specific gene or genome through the usages of codons/codon pairs, which sheds light on the role of frameshift mutations in molecular and genomic evolution.
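To make the comparison concrete, here is a minimal sketch of translating one coding sequence in its original frame and in a frame shifted by one base, then comparing the two proteins. It is an illustration under simplifying assumptions, not the paper's method: similarity here is plain position-wise identity without alignment, whereas the abstract refers to substitution scores; the helper names `translate` and `identity` and the toy gene are made up.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate dna starting at offset `frame`; a trailing partial codon is dropped."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


def identity(seq_a, seq_b):
    """Fraction of identical residues over the shared length (no gaps, no alignment)."""
    n = min(len(seq_a), len(seq_b))
    if n == 0:
        return 0.0
    return sum(a == b for a, b in zip(seq_a, seq_b)) / n


if __name__ == "__main__":
    gene = "ATGGCCAAGCTGGTTGAAGCTCTGTAA"  # made-up toy sequence, not from the paper
    wild_type = translate(gene, frame=0)
    shifted = translate(gene, frame=1)  # reading frame shifted by one base
    print(wild_type, shifted, identity(wild_type, shifted))
```

A fuller version would score the pairs with a substitution matrix and compare them against shuffled or random sequences, which is closer to the statistical comparison the abstract describes.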