Chromatin Interaction Neural Network (ChINN): A Machine Learning-Based Method for Predicting Chromatin Interactions from DNA Sequences. (Genome Biol, Aug 2021)

Fan Cao # 1, Yu Zhang # 2, Yichao Cai # 1, Sambhavi Animesh 1, Ying Zhang 1, Semih Can Akincilar 3, Yan Ping Loh 1, Xinya Li 4, Wee Joo Chng 1 5 6, Vinay Tergaonkar 3 7, Chee Keong Kwoh 2, Melissa J Fullwood 8 9 10

Affiliations

1. Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Dr, Singapore, 117599, Singapore.
2. School of Computer Science and Engineering, Nanyang Technological University, Block N4, 50 Nanyang Avenue, Singapore, 639798, Singapore.
3. Institute of Molecular and Cell Biology (IMCB), A*STAR (Agency for Science, Technology and Research), Singapore, 138673, Singapore.
4. School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore.
5. Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, 1E Kent Ridge Road, Singapore, 119228, Singapore.
6. Department of Haematology-Oncology, National University Cancer Institute, National University Health System, NUH Zone B, Medical Centre, Singapore, 119074, Singapore.
7. Department of Pathology, Yong Loo Lin School of Medicine, National University of Singapore (NUS), Singapore, 117597, Singapore.
8. Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Dr, Singapore, 117599, Singapore. mfullwood@ntu.edu.sg.
9. Institute of Molecular and Cell Biology (IMCB), A*STAR (Agency for Science, Technology and Research), Singapore, 138673, Singapore. mfullwood@ntu.edu.sg.
10. School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore. mfullwood@ntu.edu.sg.

#Contributed equally.

Abstract

Chromatin interactions play important roles in regulating gene expression. However, the availability of genome-wide chromatin interaction data is limited. We develop a computational method, chromatin interaction neural network (ChINN), to predict chromatin interactions between open chromatin regions using only DNA sequences. ChINN predicts CTCF-associated, RNA polymerase II-associated, and Hi-C chromatin interactions. ChINN shows good cross-sample performance and captures various sequence features for chromatin interaction prediction. We apply ChINN to 6 chronic lymphocytic leukemia (CLL) patient samples and a published cohort of 84 CLL open chromatin samples. Our results demonstrate extensive heterogeneity in chromatin interactions among CLL patient samples.
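
For readers unfamiliar with sequence-only models, the sketch below illustrates one common way a pair of open chromatin regions could be turned into numeric input for a neural network. The abstract does not describe ChINN's input encoding, so one-hot encoding is an assumption here, and the names `one_hot_encode` and `encode_anchor_pair` are hypothetical and not taken from the ChINN code.

```python
from typing import List, Tuple

# Column order of the one-hot matrix (an illustrative convention).
BASES = "ACGT"


def one_hot_encode(seq: str) -> List[List[int]]:
    """Return one row per base with a 1 in the matching A/C/G/T column.

    Any other symbol (for example N) becomes an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def encode_anchor_pair(
    left_seq: str, right_seq: str
) -> Tuple[List[List[int]], List[List[int]]]:
    """Encode the two regions of a candidate interaction separately.

    Hypothetical helper: a downstream model would take both matrices
    and output a score for whether the two regions interact.
    """
    return one_hot_encode(left_seq), one_hot_encode(right_seq)


if __name__ == "__main__":
    left, right = encode_anchor_pair("ACGTN", "ggca")
    print(len(left), len(right))
```

Encoding the two regions separately keeps the example agnostic about how a model would combine them; that design choice is likewise an assumption of this sketch.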

Keywords: 3D genome organization; Bioinformatics; ChIA-PET; Chromatin interactions; DNA sequence; Hi-C; Leukemia; Machine learning.

© 2021. The Author(s).