A Multi-task Domain-adapted Model to Predict Chemotherapy Response from Mutations in Recurrently Altered Cancer Genes (iScience, March 2025)

Aishwarya Jayagopal1,7 ∙ Robert J. Walsh2,7 ∙ Krishna Kumar Hariprasannan1 ∙ Ragunathan Mariappan1 ∙ Debabrata Mahapatra6 ∙ Patrick William Jaynes3 ∙ Diana Lim4 ∙ David Shao Peng Tan2,3,5 ∙ Tuan Zea Tan3 ∙ Jason J. Pitt3 ∙ Anand D. Jeyasekharan2,3 ∙ Vaibhav Rajan1,8

1. Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore
2. Department of Haematology-Oncology, National University Cancer Institute, NUHS Tower Block, Level 7, 1E Kent Ridge Road, Singapore 119228, Singapore
3. Cancer Science Institute of Singapore, National University of Singapore, Center for Translational Medicine, 14 Medical Drive, #12-01, Singapore 117599, Singapore
4. Department of Pathology, National University Health System, 1E Kent Ridge Road, Singapore 119228, Singapore
5. Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, 1E Kent Ridge Road, NUHS Tower Block, Level 10, Singapore 119228, Singapore
6. Department of Computer Science, School of Computing, National University of Singapore, Singapore 117417, Singapore
7. These authors contributed equally
8. Lead contact
Next-generation sequencing (NGS) is increasingly used in oncological practice; however, only a minority of patients benefit from targeted therapy. Developing drug response prediction (DRP) models is therefore important for the “untargetable” majority. Prior DRP models typically use whole-transcriptome and whole-exome sequencing data, which are typically unavailable in clinical settings. We aim to develop a DRP model for repurposing chemotherapy that requires only information from clinical-grade NGS (cNGS) panels of restricted gene sets. Data sparsity and limited patient drug response information make this challenging. We first show that existing DRP models perform equally well with whole-exome and cNGS (∼300 genes) data. We then describe Drug IDentifier (DruID), a DRP model for restricted gene sets that uses transfer learning, variant annotations, domain-invariant representation learning, and multi-task learning. DruID outperformed state-of-the-art DRP methods on pan-cancer data and showed robust response classification on two real-world clinical datasets, representing a step toward a clinically applicable DRP tool.