Computational methods

Hit Identification

Method type (check all that apply)

De novo design

High-throughput docking

Machine learning

Hybrid of the above

Multitask machine learning with high-throughput docking scoring assessment

Description of your approach (min 200 and max 800 words)

Decrypting orphan GPCR drug discovery via multitask learning

W. C. Huang, W. T. Lin, M. S. Hung, J. C. Lee and C. W. Tung

J Cheminform 2024 Vol. 16 Issue 1 Pages 10

Accession Number: 38263092 PMCID: PMC10804799 DOI: 10.1186/s13321-024-00806-3

Computational drug discovery for the G protein-coupled receptor (GPCR) superfamily is often limited by the availability of protein three-dimensional (3D) structures and of chemicals with experimentally measured bioactivities. Orphan GPCRs without known ligands further complicate the process. To enable drug discovery for human orphan GPCRs, multitask models were proposed for predicting the half maximal effective concentration (EC50) of chemical-GPCR pairs. Protein multiple sequence alignment features were used to encode the protein information, and physicochemical properties and fingerprints were used to encode the chemical information. The protein features enabled knowledge transfer from data-rich GPCRs to orphan receptors, with transferability assessed by the similarity of protein features. The final model was trained using both agonist and antagonist data from 200 GPCRs and achieved an excellent mean squared error (MSE) of 0.24 on the validation dataset. An independent test on an orphan dataset of 16 receptors, each associated with fewer than 8 bioactivities, gave a reasonably good MSE of 1.51, which improved further to 0.53 when transferability based on protein features was considered. The informative features were identified and mapped to the corresponding 3D structures to gain insight into the mechanism of GPCR-ligand interactions across the GPCR family. The proposed method provides a novel perspective on learning ligand bioactivity within the diverse human GPCR superfamily and can potentially accelerate the discovery of therapeutic agents for orphan GPCRs.
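As a rough illustration of the ideas in the abstract above, the following minimal sketch shows how a chemical-GPCR pair might be encoded as one feature vector, how an MSE could be computed, and how a data-rich receptor could be matched to an orphan receptor by protein-feature similarity. It is not the published implementation. The log-scale target (pEC50), the cosine similarity measure, and all function names and toy vectors are assumptions made for illustration only.

```python
import math


def pec50(ec50_molar):
    """Convert an EC50 in mol/L to pEC50 = -log10(EC50). Assumed target scale."""
    return -math.log10(ec50_molar)


def encode_pair(protein_features, chemical_features):
    """Concatenate protein and chemical descriptors into one input vector."""
    return list(protein_features) + list(chemical_features)


def mse(y_true, y_pred):
    """Mean squared error between measured and predicted activities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_receptor(orphan_features, known_receptors):
    """Return the name of the data-rich receptor closest to the orphan receptor."""
    return max(
        known_receptors,
        key=lambda name: cosine_similarity(orphan_features, known_receptors[name]),
    )


if __name__ == "__main__":
    # Hypothetical toy feature vectors; real features would come from
    # multiple sequence alignments and chemical fingerprints.
    known = {"receptor_A": [1.0, 0.0, 0.0], "receptor_B": [0.0, 1.0, 1.0]}
    orphan = [0.0, 1.0, 0.9]
    print(most_similar_receptor(orphan, known))
    print(encode_pair(orphan, [0.5, 1.0]))
    print(mse([pec50(1e-6), pec50(1e-7)], [6.5, 7.5]))
```

In a sketch like this, the concatenated vector would be the input to a single model shared across receptors, which is what lets information from data-rich GPCRs inform predictions for orphan receptors.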

What makes your approach stand out from the community? (<100 words)

Our approach uses multitask learning to extract common ligand recognition patterns from known ligand-target pairs and showed promising performance in predicting the half maximal effective concentration (EC50) of ligands. By integrating protein and chemical features, the prediction model offers a novel way to infer the bioactivities of ligand-orphan GPCR pairs. In addition to the multitask learning model optimized by feature selection, high-throughput MCHR1 protein-ligand docking will be integrated with multitask learning through multi-objective optimization, allowing the approach to leverage the strengths of each method.
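One simple way to combine the two signals by multi-objective optimization is Pareto ranking, sketched below. This is an assumption about how such a step could look, not a description of the actual pipeline. The field names `pred_pec50` and `dock_score`, the compound identifiers, and the toy values are hypothetical. The sketch assumes a higher predicted pEC50 is better and a lower (more negative) docking score is better.

```python
def dominates(a, b):
    """True if candidate a is at least as good as b on both objectives
    and strictly better on at least one.

    Objectives (assumed): maximize pred_pec50, minimize dock_score.
    """
    no_worse = a["pred_pec50"] >= b["pred_pec50"] and a["dock_score"] <= b["dock_score"]
    better = a["pred_pec50"] > b["pred_pec50"] or a["dock_score"] < b["dock_score"]
    return no_worse and better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


if __name__ == "__main__":
    # Hypothetical predictions and docking scores (kcal/mol).
    candidates = [
        {"name": "cpd_1", "pred_pec50": 7.2, "dock_score": -9.1},
        {"name": "cpd_2", "pred_pec50": 6.8, "dock_score": -9.8},
        {"name": "cpd_3", "pred_pec50": 6.5, "dock_score": -8.0},
        {"name": "cpd_4", "pred_pec50": 7.5, "dock_score": -7.5},
    ]
    for c in pareto_front(candidates):
        print(c["name"], c["pred_pec50"], c["dock_score"])
```

Pareto ranking avoids choosing arbitrary weights between the learned activity prediction and the docking score; compounds on the front represent different trade-offs between the two.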

Method Name

Multitask learning with high-throughput docking scoring assessment

Commercial software packages used

Microsoft packages may be used in reporting or publication.

Free software packages used

Programs were developed on Ubuntu 20.04.3 using Python 3.7.11. The study used several Python packages, including numpy, pandas, matplotlib, beautifulsoup4, scikit-learn, bitarray, rdkit-pypi, torch, and AutoGluon v0.5.2, for data manipulation, visualization, web scraping, machine learning, and deep learning. In addition, AutoDock Vina was used to dock ligands against MCHR1.
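For high-throughput docking, each ligand is typically passed to the AutoDock Vina command-line program with a receptor file and a search box. The sketch below only builds such a command from Python and does not run it. The file names, box center, and box size are hypothetical placeholders, and the actual docking settings used for MCHR1 are not stated here.

```python
def vina_command(receptor, ligand, center, size, out, exhaustiveness=8):
    """Build an AutoDock Vina command line as a list of arguments.

    center and size are (x, y, z) tuples in Angstroms describing the search box.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        "vina",
        "--receptor", receptor,
        "--ligand", ligand,
        "--center_x", str(cx),
        "--center_y", str(cy),
        "--center_z", str(cz),
        "--size_x", str(sx),
        "--size_y", str(sy),
        "--size_z", str(sz),
        "--exhaustiveness", str(exhaustiveness),
        "--out", out,
    ]


if __name__ == "__main__":
    # Hypothetical file names and box; replace with values for the real target.
    cmd = vina_command(
        receptor="mchr1_receptor.pdbqt",
        ligand="ligand_0001.pdbqt",
        center=(1.5, -2.0, 10.0),
        size=(20, 20, 20),
        out="ligand_0001_docked.pdbqt",
    )
    print(" ".join(cmd))
    # To execute, one could pass cmd to subprocess.run(cmd, check=True)
    # on a machine where the vina executable is installed.
```

Building the command as a list makes it straightforward to loop over a ligand library and to pass each command to `subprocess.run` without shell quoting issues.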

Relevant publications of previous uses by your group of this software/method

1. Huang WC, Lin WT, Hung MS, Lee JC, Tung CW (2024) Decrypting orphan GPCR drug discovery via multitask learning. J Cheminform 16:10. https://doi.org/10.1186/s13321-024-00806-3

2. Chiu YW, Tung CW, Wang CC (2024) Multitask learning for predicting pulmonary absorption of chemicals. Food Chem Toxicol 185:114453. https://doi.org/10.1016/j.fct.2024.114453

3. Wang SS, Lin P, Wang CC, Lin YC, Tung CW (2023) Machine learning for predicting chemical migration from food packaging materials to foods. Food Chem Toxicol 178:113942. https://doi.org/10.1016/j.fct.2023.113942

4. Tung CW, Lin YH, Wang SS (2019) Transfer learning for predicting human skin sensitizers. Arch Toxicol 93:931-940. https://doi.org/10.1007/s00204-019-02420-x

Challenge #5