白鸟 (2026-02-28 11:15):
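#paper 10.1038/s41586-024-08443-4, Nature, 2025, Multiscale footprints reveal the organization of cis-regulatory elements. Challenge: interactions between transcription factors (TFs) and DNA are weak, transient, and hard to measure. Most single-cell ATAC-seq analyses look only at open-chromatin peaks, but an open peak does not necessarily mean that a TF is bound within it. Footprinting addresses this well: when a TF binds DNA, accessibility drops locally at the binding site. The authors make good use of this property and substantially improve the reliability and interpretability of the footprint signal through two innovations. Step one is the PRINT method, which corrects the Tn5 insertion bias present in single-cell ATAC data. The Tn5 transposase has a strong sequence preference; left uncorrected, this enzymatic preference can be mistaken for biological signal, producing false footprints or masking genuinely weak ones. This is especially damaging in single-cell data, where per-cell coverage is low and noise is high. The authors trained a convolutional neural network on Tn5 insertion data from deproteinized bacterial artificial chromosome (BAC) DNA. Because this BAC DNA carries almost no bound protein, Tn5 insertion is driven almost entirely by sequence preference, making it nearly "pure noise" data. PRINT then computes multiscale footprints from the corrected data. Step two is the seq2PRINT framework, which combines ATAC data with DNA sequence to infer TF and nucleosome binding.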
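To make the "bias-corrected footprint at multiple scales" idea concrete, here is a minimal sketch. It is not PRINT's actual implementation: the function names, the center-versus-flank score, and the flank width are all assumptions made for illustration. It assumes you already have per-base observed Tn5 insertion counts (`obs`) and a per-base predicted Tn5 bias (`bias`, e.g. from a sequence model), and it asks whether the center window has fewer insertions than the bias alone would predict.

```python
def footprint_score(obs, bias, center, radius):
    """Hypothetical depletion score for one window size.

    Compares the fraction of insertions falling in the center window
    [center - radius, center + radius] with the fraction expected from
    the predicted Tn5 bias alone. Flanks are `radius` bases on each side.
    Positive values mean fewer insertions than the bias predicts.
    """
    lo, hi = center - radius, center + radius + 1
    f_lo, f_hi = lo - radius, hi + radius
    if f_lo < 0 or f_hi > len(obs):
        raise ValueError("window exceeds array bounds")

    obs_center = sum(obs[lo:hi])
    obs_flank = sum(obs[f_lo:lo]) + sum(obs[hi:f_hi])
    bias_center = sum(bias[lo:hi])
    bias_flank = sum(bias[f_lo:lo]) + sum(bias[hi:f_hi])

    obs_total = obs_center + obs_flank
    bias_total = bias_center + bias_flank
    if obs_total == 0 or bias_total == 0:
        return 0.0  # no data in this window, so no evidence either way

    expected_fraction = bias_center / bias_total
    observed_fraction = obs_center / obs_total
    return expected_fraction - observed_fraction


def multiscale_footprints(obs, bias, center, radii):
    """Score the same position at several window sizes (assumed radii)."""
    return {r: footprint_score(obs, bias, center, r) for r in radii}


# Toy data: flat bias, with insertions depleted at positions 18..22.
obs = [10] * 41
for i in range(18, 23):
    obs[i] = 0
bias = [1.0] * 41
scores = multiscale_footprints(obs, bias, center=20, radii=[2, 5, 10])
```

Small radii respond to TF-sized protection and larger radii to nucleosome-sized protection, which is the intuition behind scoring each position at several scales. A dip that the bias track already predicts scores near zero, which is why an uncorrected analysis would report it as a false footprint.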
IF: 50.500 (Q1). Nature, 2025-02-20. DOI: 10.1038/s41586-024-08443-4. PMID: 39843737. PMCID: PMC11839466.
Multiscale footprints reveal the organization of cis-regulatory elements
Abstract:
Cis-regulatory elements (CREs) control gene expression and are dynamic in their structure and function, reflecting changes in the composition of diverse effector proteins over time. However, methods for measuring the organization of effector proteins at CREs across the genome are limited, hampering efforts to connect CRE structure to their function in cell fate and disease. Here we developed PRINT, a computational method that identifies footprints of DNA–protein interactions from bulk and single-cell chromatin accessibility data across multiple scales of protein size. Using these multiscale footprints, we created the seq2PRINT framework, which uses deep learning to allow precise inference of transcription factor and nucleosome binding and interprets regulatory logic at CREs. Applying seq2PRINT to single-cell chromatin accessibility data from human bone marrow, we observe sequential establishment and widening of CREs centred on pioneer factors across haematopoiesis. We further discover age-associated alterations in the structure of CREs in murine haematopoietic stem cells, including widespread reduction of nucleosome footprints and gain of de novo identified Ets composite motifs. Collectively, we establish a method for obtaining rich insights into DNA-binding protein dynamics from chromatin accessibility data, and reveal the architecture of regulatory elements across differentiation and ageing.