4 paper shares found.
1.
小年 (2024-07-31 15:47):
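#paper DOI: 10.1016/j.csbj.2024.03.030 A novel framework for human leukocyte antigen (HLA) genotyping using probe capture-based targeted next-generation sequencing and computational analysis. This paper presents an analysis pipeline for HLA genotyping that combines probe capture-based targeted next-generation sequencing with computational analysis. Instead of the commonly used IPD-IMGT/HLA database, the team used Human Pangenome Reference Consortium (HPRC) resources as the benchmark for HLA reference material, which enriches the available HLA references and suggests a new way around the problems of conventional references. On the algorithm side, the team compared seven tools: five open-source ones (OptiType, HLA-VBseq, HISAT-genotype, SpecHLA, and T1K) plus QzType and DRAGEN. No single tool typed every allele correctly, but an integrated analysis combining T1K, DRAGEN, and QzType reached 100% accuracy for the HLA genes studied. The study shows that probe capture-based targeted sequencing is effective for HLA typing, and that pairing it with an integrated multi-tool analysis improves typing accuracy.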
Abstract:
Human leukocyte antigen (HLA) genes play pivotal roles in numerous immunological applications. Given the immense number of polymorphisms, achieving accurate high-throughput HLA typing remains challenging. This study aimed to harness the human pan-genome reference consortium (HPRC) resources as a potential benchmark for HLA reference materials. We meticulously annotated specific four field-resolution alleles for 11 HLA genes from 44 high-quality HPRC personal genome assemblies. For sequencing, we crafted HLA-specific probes and conducted capture-based targeted sequencing of the genomic DNA of the HPRC cohort, ensuring focused and comprehensive coverage of the HLA region of interest. We used publicly available short-read whole-genome sequencing (WGS) data from identical samples to offer a comparative perspective. To decipher the vast amount of sequencing data, we employed seven distinct software tools: OptiType, HLA-VBseq, HISAT genotype, SpecHLA, T1K, QzType, and DRAGEN. Each tool offers unique capabilities and algorithms for HLA genotyping, allowing comprehensive analysis and validation of the results. We then compared these results with benchmarks derived from personal genome assemblies. Our findings present a comprehensive four-field-resolution HLA allele annotation for 44 HPRC samples. Significantly, our innovative targeted next-generation sequencing (NGS) approach for HLA genes showed superior accuracy compared with conventional short-read WGS. An integrated analysis involving QzType, T1K, and DRAGEN was developed, achieving 100% accuracy for all 11 HLA genes. In conclusion, our study highlighted the combination of targeted short-read sequencing and astute computational analysis as a robust approach for HLA genotyping. Furthermore, the HPRC cohort has emerged as a valuable assembly-based reference in this realm.
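The entry above reports that an integrated analysis of QzType, T1K, and DRAGEN reached 100% accuracy, but it does not say how the three call sets are merged. Purely as an illustration, a per-gene majority vote could be sketched as follows. The function name `consensus_calls`, the input layout, and the voting rule are all assumptions, not the paper's method, and the example calls are made up.

```python
from collections import Counter


def consensus_calls(calls_by_tool):
    """Majority vote per gene across tools (hypothetical rule, not the paper's).

    calls_by_tool maps tool name -> {gene: (allele1, allele2)}.
    Returns {gene: genotype or None}; None means no genotype was reported
    by more than half of the tools that called that gene.
    """
    genes = set()
    for calls in calls_by_tool.values():
        genes.update(calls)

    consensus = {}
    for gene in sorted(genes):
        # Sort each allele pair so (a, b) and (b, a) count as the same genotype.
        votes = Counter(
            tuple(sorted(calls[gene]))
            for calls in calls_by_tool.values()
            if gene in calls
        )
        genotype, count = votes.most_common(1)[0]
        total = sum(votes.values())
        consensus[gene] = genotype if count * 2 > total else None
    return consensus


# Made-up example calls for illustration; not data from the paper.
example = {
    "QzType": {"HLA-A": ("A*01:01", "A*02:01"), "HLA-B": ("B*07:02", "B*08:01")},
    "T1K": {"HLA-A": ("A*02:01", "A*01:01"), "HLA-B": ("B*07:02", "B*44:02")},
    "DRAGEN": {"HLA-A": ("A*01:01", "A*02:01"), "HLA-B": ("B*07:02", "B*08:01")},
}
result = consensus_calls(example)
print(result)
```

Returning `None` when no genotype wins a strict majority keeps disagreements visible instead of silently picking one tool's call.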
2.
颜林林 (2023-03-02 07:38):
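#paper doi:10.1016/j.csbj.2023.02.016 Computational and Structural Biotechnology Journal, 2023, DNAsmart: Multiple attribute ranking tool for DNA data storage systems. Using DNA as a storage medium has become an increasingly active research direction. Because reading (sequencing) and writing (synthesis) are affected by the properties of DNA itself and by various factors in the surrounding system, both steps introduce errors of several kinds. This study provides a web tool, DNAsmart, that interactively visualizes attributes of nucleic acid fragments such as GC content and Hamming distance, helping researchers explore how to use and balance these attributes when designing better encoding and decoding schemes for DNA storage.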
Abstract:
In an ever-growing need for data storage capacity, the Deoxyribonucleic Acid (DNA) molecule gains traction as a new storage medium with a larger capacity, higher density, and a longer lifespan over conventional storage media. To effectively use DNA for data storage, it is important to understand the different methods of encoding information in DNA and compare their effectiveness. This requires evaluating which decoded DNA sequences carry the most encoded information based on various attributes. However, navigating the field of coding theory requires years of experience and domain expertise. For instance, domain experts rely on various mathematical functions and attributes to score and evaluate their encodings. To enable such analytical tasks, we provide an interactive and visual analytical framework for multi-attribute ranking in DNA storage systems. Our framework follows a three-step view with user-settable parameters. It enables users to find the optimal en-/de-coding approaches by setting different weights and combining multiple attributes. We assess the validity of our work through a task-specific user study on domain experts by relying on three tasks. Results indicate that all participants completed their tasks successfully under two minutes, then rated the framework for design choices, perceived usefulness, and intuitiveness. In addition, two real-world use cases are shared and analyzed as direct applications of the proposed tool. DNAsmart enables the ranking of decoded sequences based on multiple attributes. In sum, this work unveils the evaluation of en-/de-coding approaches accessible and tractable through visualization and interactivity to solve comparison and ranking tasks.
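The two attributes named in the entry above, GC content and Hamming distance, are simple to compute, and a weighted combination of them gives a feel for multi-attribute ranking. The sketch below is a minimal illustration, not DNAsmart's implementation: the scoring formulas, the weight keys `gc` and `hamming`, and the example sequences are assumptions.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def rank_sequences(candidates, reference, weights):
    """Rank candidates by a weighted score, best first.

    Hypothetical scoring: a sequence scores higher when its GC content
    is close to 50% and when it is close to the reference sequence.
    """
    scored = []
    for seq in candidates:
        gc_score = 1.0 - abs(gc_content(seq) - 0.5) * 2  # 1.0 at 50% GC
        dist_score = 1.0 - hamming_distance(seq, reference) / len(reference)
        score = weights["gc"] * gc_score + weights["hamming"] * dist_score
        scored.append((score, seq))
    return [seq for score, seq in sorted(scored, reverse=True)]


# Made-up sequences for illustration.
reference = "ACGTACGT"
candidates = ["AAAAAAAA", "ACGTACGA", "ACGTACGT"]
ranking = rank_sequences(candidates, reference, {"gc": 0.5, "hamming": 0.5})
print(ranking)
```

Changing the weights shifts the ranking toward one attribute or the other, which is the kind of trade-off the tool lets users explore interactively.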
3.
白鸟 (2022-12-31 23:22):
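#paper https://doi.org/10.1016/j.csbj.2020.06.012 Computational and Structural Biotechnology Journal 2020. Single-cell ATAC sequencing analysis: From data preprocessing to hypothesis generation.
Focus: This is a review of single-cell ATAC-seq analysis. It walks fairly systematically through the methodology, from data preprocessing to generating scientific hypotheses, along with benchmarking, and offers practical guidance on analysis methods using appropriate software tools and databases.
Background: Most genetic variants associated with human complex traits lie in non-coding regions of the genome, so studies of the biological mechanisms linking genotype to phenotype mostly involve epigenetic regulation of gene expression. Genome-wide maps of open chromatin regions facilitate functional analysis of cis- and trans-regulatory elements through their association with trait-associated sequence variants. ATAC-seq (Assay for Transposase-Accessible Chromatin with high-throughput sequencing) is considered the most accessible and cost-effective strategy for genome-wide profiling of chromatin accessibility.
Gaps: Single-cell ATAC-seq (scATAC-seq) has been developed to study cell type-specific differences in chromatin accessibility in tissue samples containing heterogeneous cell populations. However, because scATAC-seq data are inherently noisy and sparse, it is hard to extract biological signal accurately and to devise effective biological hypotheses. New methods and software tools have been developed over the past few years to overcome these limitations, but there is still no consensus on a best-practice, standard pipeline for scATAC-seq data analysis.
Outline: (1) The scATAC-seq analysis workflow: preprocessing of sequencing reads -> filtering out low-quality cells and doublets -> building the cell-by-feature matrix -> batch correction and data integration across samples -> data transformation, including normalization -> dimensionality reduction, visualization, and clustering. These steps closely resemble scRNA-seq analysis but have their own particularities. (2) Downstream analysis for hypothesis generation: cell type annotation, chromatin accessibility dynamics, and analyses based on TF motifs, genes, enhancers, and disease-associated genetic variants. These help elucidate the network between cis-regulatory elements (e.g., promoters and enhancers) and trans-regulatory elements (e.g., transcription factors, TFs); scATAC-seq data can also be used to analyze gene activity and the accessibility of genetic variants. (3) Multimodal analysis: scATAC-seq can be combined with single-cell RNA sequencing (scRNA-seq) and other omics data for multi-omics studies. Such integrated multimodal analysis helps identify key regulators of disease progression, which are often potential therapeutic targets and diagnostic biomarkers.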
Abstract:
Most genetic variations associated with human complex traits are located in non-coding genomic regions. Therefore, understanding the genotype-to-phenotype axis requires a comprehensive catalog of functional non-coding genomic elements, most of which are involved in epigenetic regulation of gene expression. Genome-wide maps of open chromatin regions can facilitate functional analysis of cis- and trans-regulatory elements via their connections with trait-associated sequence variants. Currently, Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-seq) is considered the most accessible and cost-effective strategy for genome-wide profiling of chromatin accessibility. Single-cell ATAC-seq (scATAC-seq) technology has also been developed to study cell type-specific chromatin accessibility in tissue samples containing a heterogeneous cellular population. However, due to the intrinsic nature of scATAC-seq data, which are highly noisy and sparse, accurate extraction of biological signals and devising effective biological hypothesis are difficult. To overcome such limitations in scATAC-seq data analysis, new methods and software tools have been developed over the past few years. Nevertheless, there is no consensus for the best practice of scATAC-seq data analysis yet. In this review, we discuss scATAC-seq technology and data analysis methods, ranging from preprocessing to downstream analysis, along with an up-to-date list of published studies that involved the application of this method. We expect this review will provide a guideline for successful data generation and analysis methods using appropriate software tools and databases for the study of chromatin accessibility at single-cell resolution.
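Two of the workflow steps listed in the entry above, filtering low-quality cells and normalizing the cell-by-feature matrix, can be sketched on a toy binary matrix. This is a minimal sketch under stated assumptions: the entry does not name a normalization method, so TF-IDF is used here only because it is one transform commonly applied to scATAC-seq data, and the `min_features` threshold and the matrix values are made up.

```python
import math


def filter_cells(matrix, min_features):
    """Keep cells (rows) with at least min_features accessible features."""
    return [row for row in matrix if sum(1 for v in row if v > 0) >= min_features]


def tf_idf(matrix):
    """TF-IDF transform of a binary cell-by-feature matrix.

    TF  = value / number of accessible features in the cell
    IDF = log(1 + n_cells / number of cells where the feature is accessible)
    """
    if not matrix:
        return []
    n_cells = len(matrix)
    n_features = len(matrix[0])
    feature_counts = [sum(row[j] for row in matrix) for j in range(n_features)]
    out = []
    for row in matrix:
        cell_total = sum(row)
        out.append([
            (row[j] / cell_total) * math.log(1 + n_cells / feature_counts[j])
            if row[j] else 0.0
            for j in range(n_features)
        ])
    return out


# Toy matrix: rows are cells, columns are peaks; 1 means accessible.
cells = [
    [1, 1, 0, 1],
    [1, 0, 0, 0],  # only one accessible peak, dropped by the filter below
    [0, 1, 1, 1],
    [1, 1, 1, 0],
]
kept = filter_cells(cells, min_features=2)
normalized = tf_idf(kept)
print(normalized)
```

In this weighting, a peak open in every cell contributes less than a peak open in only some cells, which helps rarer, more informative peaks stand out before dimensionality reduction and clustering.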
4.
cellsarts (2022-02-28 15:24):
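#Paper Computational prediction of secreted proteins in gram-negative bacteria #link http://creativecommons.org/licenses/by-nc-nd/4.0/ https://doi.org/10.1016/j.csbj.2021.03.019 Overview: Gram-negative bacteria harness multiple protein secretion systems and secrete a large share of their proteins. Proteins can be exported to the periplasmic space, integrated into the membrane, transported into the extracellular environment, or translocated into the cytoplasm of cells they contact. The paper summarizes the known features of these proteins by type of secretion system and reviews the algorithms and tools for predicting them. It is based on an extensive reading of the literature, with no additional experimental validation. Beyond the familiar Sec and Tat pathways, it also covers several less commonly discussed pathways. Personal opinion: the conclusion that in Gram-negative bacteria the Sec and Tat pathways serve only to transport inner-membrane and periplasmic proteins seems premature; experimental data from a broader range of systems could overturn that claim.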
Abstract:
Gram-negative bacteria harness multiple protein secretion systems and secrete a large proportion of the proteome. Proteins can be exported to periplasmic space, integrated into membrane, transported into extracellular milieu, or translocated into cytoplasm of contacting cells. It is important for accurate, genome-wide annotation of the secreted proteins and their secretion pathways. In this review, we systematically classified the secreted proteins according to the types of secretion systems in Gram-negative bacteria, summarized the known features of these proteins, and reviewed the algorithms and tools for their prediction.
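As a toy illustration of sequence-feature-based prediction for one of the pathways mentioned above, the sketch below scans the N-terminal region of a protein for the twin-arginine consensus motif (S/T-R-R-x-F-L-K) associated with Tat signal peptides. It is a deliberately simplified heuristic, not one of the tools covered by the review: the function name, the 40-residue window, and the example sequences are assumptions, and real signal peptides often deviate from the strict consensus.

```python
import re

# Strict Tat consensus S/T-R-R-x-F-L-K written as a regular expression.
TAT_MOTIF = re.compile(r"[ST]RR.FLK")


def find_tat_motif(protein, n_term_window=40):
    """Return the 0-based start of the first motif match within the
    N-terminal window of the protein, or None if there is no match."""
    match = TAT_MOTIF.search(protein[:n_term_window].upper())
    return match.start() if match else None


# Made-up sequences for illustration only.
with_motif = "MKNNDLSRRDFLKAAGLTAVAGLSA"
without_motif = "MKLLTAIAVSLAGFASAQA"

print(find_tat_motif(with_motif))
print(find_tat_motif(without_motif))
```

A motif scan like this can only flag candidates; it says nothing about the rest of the signal peptide or about the other secretion systems.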