颜林林
(2022-07-02 00:24):
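#paper doi:10.1186/s12859-022-04798-5 BMC Bioinformatics, 2022, DeepPN: a deep parallel neural network based on convolutional neural network and graph convolutional network for predicting RNA-protein binding sites. Identifying where RNA-binding proteins (RBPs) bind to RNA is an important part of studying gene regulation. Traditionally, binding sites are screened and measured in high throughput with experimental methods such as immunoprecipitation, but these experiments have many limitations. Many computational tools have therefore been developed to predict RBP binding sites, most of them based on mathematical calculations over sequence and structure information. Because deep learning can automatically learn important and complex hidden features from data, it has gradually been applied to this problem as well. For its deep learning component, this paper uses ChebNet, a type of graph convolutional network (GCN). ChebNet is a spectral graph convolution method, and recent studies have reported good results with it in areas such as fMRI analysis and semantic image segmentation. The authors therefore built a parallel deep neural network named DeepPN from a CNN and ChebNet and tested it on 24 real datasets, where it is reported to outperform comparable methods. A possible explanation is that the method uses statistical frequencies to supplement the features, which may account for the better performance.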
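To make the ChebNet idea mentioned above more concrete, here is a minimal pure-Python sketch of a Chebyshev-polynomial graph filter applied to a one-hot encoded RNA sequence. It is not the DeepPN implementation. Several things are assumptions for illustration only: the graph is a simple chain linking adjacent nucleotides (the paper's graph construction is not described in this note), the largest Laplacian eigenvalue is approximated as 2, and each polynomial order gets a single scalar coefficient in `theta` instead of a learned weight matrix. The function names `one_hot`, `scaled_laplacian_path` and `chebyshev_filter` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def one_hot(seq: str) -> Matrix:
    """One-hot encode an RNA sequence: one row per nucleotide, columns A, C, G, U."""
    alphabet = "ACGU"
    return [[1.0 if base == a else 0.0 for a in alphabet] for base in seq.upper()]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two list-of-lists matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def scaled_laplacian_path(n: int) -> Matrix:
    """Scaled Laplacian of a chain graph with n nodes (assumed graph, not the paper's).

    With the normalized Laplacian L = I - D^{-1/2} A D^{-1/2} and the common
    approximation lambda_max = 2, the scaled Laplacian 2L/lambda_max - I
    reduces to -D^{-1/2} A D^{-1/2}.
    """
    adj = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in adj]
    inv_sqrt = [d ** -0.5 if d > 0 else 0.0 for d in deg]
    return [[-inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def chebyshev_filter(lap: Matrix, x: Matrix, theta: List[float]) -> Matrix:
    """Compute sum_k theta[k] * T_k(lap) @ x with the Chebyshev recurrence.

    T_0 x = x, T_1 x = lap @ x, T_k x = 2 * lap @ (T_{k-1} x) - T_{k-2} x.
    """
    if not theta:
        raise ValueError("theta needs at least one coefficient")
    t_prev = x
    out = [[theta[0] * v for v in row] for row in x]
    if len(theta) == 1:
        return out
    t_curr = matmul(lap, x)
    out = [[o + theta[1] * t for o, t in zip(ro, rt)] for ro, rt in zip(out, t_curr)]
    for coeff in theta[2:]:
        prod = matmul(lap, t_curr)
        t_next = [[2.0 * p - q for p, q in zip(rp, rq)] for rp, rq in zip(prod, t_prev)]
        out = [[o + coeff * t for o, t in zip(ro, rt)] for ro, rt in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out
```

For example, `chebyshev_filter(scaled_laplacian_path(4), one_hot("ACGU"), [0.5, 0.3, 0.2])` mixes each nucleotide's features with those of neighbors up to two steps away along the chain. In a trained network the coefficients would be learned, and the output would be combined with the CNN branch before a fully connected layer.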
DeepPN: a deep parallel neural network based on convolutional neural network and graph convolutional network for predicting RNA-protein binding sites
Abstract:
BACKGROUND: Addressing the laborious nature of traditional biological experiments by using an efficient computational approach to analyze RNA-binding proteins (RBPs) binding sites has always been a challenging task. RBPs play a vital role in post-transcriptional control. Identification of RBPs binding sites is a key step for the anatomy of the essential mechanism of gene regulation by controlling splicing, stability, localization and translation. Traditional methods for detecting RBPs binding sites are time-consuming and computationally-intensive. Recently, the computational method has been incorporated in researches of RBPs. Nevertheless, lots of them not only rely on the sequence data of RNA but also need additional data, for example the secondary structural data of RNA, to improve the performance of prediction, which needs the pre-work to prepare the learnable representation of structural data.
RESULTS: To reduce the dependency of those pre-work, in this paper, we introduce DeepPN, a deep parallel neural network that is constructed with a convolutional neural network (CNN) and graph convolutional network (GCN) for detecting RBPs binding sites. It includes a two-layer CNN and GCN in parallel to extract the hidden features, followed by a fully connected layer to make the prediction. DeepPN discriminates the RBP binding sites on learnable representation of RNA sequences, which only uses the sequence data without using other data, for example the secondary or tertiary structure data of RNA. DeepPN is evaluated on 24 datasets of RBPs binding sites with other state-of-the-art methods. The results show that the performance of DeepPN is comparable to the published methods.
CONCLUSION: The experimental results show that DeepPN can effectively capture potential hidden features in RBPs and use these features for effective prediction of binding sites.
Keywords: