Jiani Ma, Lin Zhang, Jin Chen
Mar 24, 2021
Journal
BMC Bioinformatics
Abstract
Background: Recent studies have confirmed that N7-methylguanosine (m7G) modification plays an important role in regulating various biological processes and is associated with multiple diseases. Wet-lab experiments are costly and time-consuming for identifying disease-associated m7G sites. To date, tens of thousands of m7G sites have been identified by high-throughput sequencing approaches, and the information is publicly available in bioinformatics databases, where it can be leveraged to predict potential disease-associated m7G sites computationally. Computational methods for m7G-disease association prediction are therefore urgently needed, but none is currently available.

Results: To fill this gap, we collected association information between m7G sites and diseases, genomic information on m7G sites, and phenotypic information on diseases from different databases to build an m7G-disease association dataset. To infer potential disease-associated m7G sites, we then proposed a heterogeneous network-based model, the m7G Sites and Diseases Associations Inference (m7GDisAI) model. m7GDisAI predicts potential disease-associated m7G sites by applying a matrix decomposition method to heterogeneous networks that integrate comprehensive similarity information of m7G sites and diseases. To evaluate prediction performance, 10 runs of tenfold cross-validation were first conducted, in which m7GDisAI achieved the highest AUC of 0.740 (± 0.0024). Global and local leave-one-out cross-validation (LOOCV) experiments were then carried out to evaluate the model's accuracy in global and local settings, respectively: an AUC of 0.769 was achieved in global LOOCV and 0.635 in local LOOCV. Finally, a case study was conducted to identify the most promising ovarian cancer-related m7G sites for further functional analysis. Gene Ontology (GO) enrichment analysis was performed to explore the complex associations between the host genes of m7G sites and GO terms. The results showed that the disease-associated m7G sites identified by m7GDisAI and their host genes are consistently related to the pathogenesis of ovarian cancer, which may provide clues to the pathogenesis of diseases.

Conclusion: The m7GDisAI web server can be accessed at http://180.208.58.66/m7GDisAI/ and provides a user-friendly interface for querying disease-associated m7G sites. The list of the top 20 m7G sites predicted to be associated with 177 diseases can be retrieved. Detailed information about specific m7G sites and diseases is also shown.
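
To make the general idea of predicting associations by matrix decomposition and scoring them with AUC more concrete, the following is a minimal sketch and not the m7GDisAI algorithm itself. It fits a plain low-rank factorization to a toy site-by-disease matrix with one entry hidden, and it does not integrate the similarity networks the abstract describes. The function names `factorize` and `auc`, the toy matrix, and all hyperparameters are assumptions made for illustration only.

```python
import numpy as np

def factorize(A, mask, rank=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit A ~= U @ V.T on entries where mask == 1, by gradient descent."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    for _ in range(epochs):
        E = mask * (A - U @ V.T)  # residual on observed entries only
        # Simultaneous update of both factors with L2 regularization.
        U, V = U + lr * (E @ V - reg * U), V + lr * (E.T @ U - reg * V)
    return U @ V.T

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size))

# Toy data (hypothetical): 6 sites x 4 diseases with two association blocks.
A = np.zeros((6, 4))
A[:3, :2] = 1.0
A[3:, 2:] = 1.0

# Hide one known association, as a single leave-one-out fold would.
mask = np.ones_like(A)
mask[0, 0] = 0.0

pred = factorize(A, mask)
print("score for held-out pair:", pred[0, 0])
print("AUC over all pairs:", auc(pred.ravel(), A.ravel().astype(int)))
```

In a leave-one-out setting this hide-fit-score step would be repeated once per known association, and the scores of the held-out pairs would then be ranked against unlabeled candidates to compute the AUC.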