In this CACHE challenge, we will prioritize integrating Active Learning (AL) into our workflows. Historically, the compute-intensive nature of our virtual high-throughput screening has restricted us, and our software, to smaller compound libraries, and previous CACHE challenges showed that relying on smaller libraries is a liability. AL will enable broader exploration of chemical space, which we will then refine using our more traditional, higher-resolution methods. Recent CACHE findings also highlight the impact of deep learning-based scoring methods on hit rates, prompting us to incorporate a consensus scoring approach into our hit selection criteria.
Designing selective MCHR1 antagonists is especially challenging owing to the similarity between MCHR1 and the hERG channel, whose inhibition leads to cardiotoxicity; several failed clinical trials have been reported (1–3). We will first review the failures and successes of different research teams (4) to help us design selective MCHR1 antagonists (5).
Next, we will source MCHR1 structures from AlphaFold and GPCRdb, focusing on multistate models; no crystal structures of MCHR1 are available to date. We will give preference to predicted structures that meet a confidence threshold of pLDDT > 70 (6).
Since our predicted structures will be in the apo form, we will optimize the selected structure(s) by running molecular dynamics (MD) simulations after docking a representative MCHR1 antagonist into the previously proposed binding site (5). In our trajectory analysis and subsequent protein model selection, we will verify that any original binding site residues with low confidence (pLDDT between 50 and 70) are conformationally sound.
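The confidence criteria above can be checked directly from a predicted structure file, since AlphaFold models store per-residue pLDDT in the B-factor column of the PDB file. The following is a minimal sketch, assuming that the acceptance threshold is applied to the mean pLDDT over C-alpha atoms (the text does not specify how the threshold is aggregated) and that binding site residues are supplied as (chain, residue number) pairs. The function names and the three-residue toy input are hypothetical and for illustration only.

```python
def ca_plddt(pdb_lines):
    """Map (chain, residue number) -> pLDDT, read from the B-factor column of CA atoms."""
    scores = {}
    for line in pdb_lines:
        # Fixed-width PDB columns: atom name 13-16, chain 22, resSeq 23-26, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[(line[21], int(line[22:26]))] = float(line[60:66])
    return scores


def assess_model(scores, site_residues, accept=70.0, low=50.0):
    """Summarize model confidence and flag binding-site residues that need MD scrutiny."""
    site = {res: scores[res] for res in site_residues if res in scores}
    mean_plddt = sum(scores.values()) / len(scores)
    return {
        "mean_plddt": mean_plddt,
        # Assumption: the pLDDT > 70 criterion is applied to the mean over CA atoms.
        "accepted": mean_plddt > accept,
        "flagged_50_70": sorted(res for res, s in site.items() if low <= s <= accept),
        "below_50": sorted(res for res, s in site.items() if s < low),
    }


def _toy_ca_line(serial, resseq, plddt):
    """Build a fixed-width PDB ATOM record for a CA atom (toy data only)."""
    return (f"ATOM  {serial:>5}  CA  ALA A{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")


# Fabricated three-residue example; the pLDDT values are not real MCHR1 data.
example_lines = [_toy_ca_line(1, 10, 90.0), _toy_ca_line(2, 11, 65.0), _toy_ca_line(3, 12, 85.0)]
example_scores = ca_plddt(example_lines)
example_report = assess_model(example_scores, site_residues=[("A", 10), ("A", 11)])
print(example_report)
```

In practice the lines would come from a downloaded model file, and residues listed under "flagged_50_70" would be the ones inspected in the MD trajectory analysis.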
We will retrospectively benchmark our structural models and scoring methods (see below) with the datasets of known MCHR1 antagonists provided by the CACHE team, ensuring structural and activity diversity using either a clustering approach or careful selection. We will dock this benchmark dataset with FITTED (7) using the MD-optimized structures and query per-residue interaction energies for trends. We will also evaluate various scoring methods, including our own Graph Neural Network (GNN)-based scoring (8), GNINA rescoring (Convolutional Neural Network (CNN)-based scoring) (9), and binding free energy using an MM-GBSA method (10). We will evaluate the results (AUROC, enrichment factors) to identify optimal settings, iterating if necessary. This analysis will allow us to investigate the effectiveness of consensus scoring and how well individual scoring methods differentiate actives from inactives. Insights from this study will guide our prospective compound selection.
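To make the benchmark metrics concrete, the sketch below computes AUROC, an enrichment factor, and a simple consensus ranking in plain Python. It is an illustration under stated assumptions, not our production pipeline: scores are treated as higher-is-better (energies such as MM-GBSA values would be negated first), the consensus is taken as the mean rank across methods (one of several possible consensus schemes), every method is assumed to have scored every compound, and all names and numbers are hypothetical toy values.

```python
def auroc(scores, labels):
    """Probability that a random active outscores a random inactive (ties count 0.5)."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if a > i else 0.5 if a == i else 0.0
               for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))


def enrichment_factor(scores, labels, fraction):
    """Hit rate in the top-scoring fraction divided by the hit rate of the whole set."""
    n = len(scores)
    n_top = max(1, round(fraction * n))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(y for _, y in ranked[:n_top])
    return (hits_top / n_top) / (sum(labels) / n)


def consensus_ranking(score_tables):
    """Rank compounds by mean rank over methods; score_tables = {method: {compound: score}}."""
    mean_rank = {}
    for table in score_tables.values():
        ordered = sorted(table, key=table.get, reverse=True)  # rank 1 = best score
        for rank, compound in enumerate(ordered, start=1):
            mean_rank[compound] = mean_rank.get(compound, 0.0) + rank / len(score_tables)
    return sorted(mean_rank, key=mean_rank.get)


# Toy benchmark: 10 compounds, 2 actives (label 1), fabricated scores.
toy_scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)
toy_ef20 = enrichment_factor(toy_scores, toy_labels, fraction=0.2)

# Toy consensus over two hypothetical scoring methods.
toy_tables = {
    "method_a": {"c1": 3.0, "c2": 2.0, "c3": 1.0},
    "method_b": {"c1": 1.0, "c2": 3.0, "c3": 2.0},
}
toy_consensus = consensus_ranking(toy_tables)
print(toy_auroc, toy_ef20, toy_consensus)
```

Rank averaging is used here because it needs no score normalization across methods with different units; other consensus schemes could be substituted once the benchmark shows which methods perform best.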
In the second step, we will use our in-house AL-driven approach to screen the Enamine REAL library. In each cycle, the AL workflow docks a selected batch of compounds, trains a machine learning model on the resulting scores, predicts scores for the remaining library, and selects the next batch, so that predictions improve with each cycle and high-potential candidates are identified with minimal computational effort. Throughout the AL process, we will ultimately dock only 0.01–0.5% of the selected virtual library (11, 12). Our goal is then to identify and select the top 100,000 compounds with the best predicted docking scores for higher-resolution ranking.
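The cycle described above can be sketched as a small pool-based loop. This is a toy illustration, not our in-house implementation: the surrogate model is a stand-in k-nearest-neighbour regressor on Tanimoto similarity between fingerprint bit sets, acquisition is purely greedy (top predicted scores), scores are treated as higher-is-better, and the `dock` callable is a hypothetical placeholder for a real docking run. All names, the pool of 200 random fingerprints, and the scoring function are assumptions made for the example.

```python
import random


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict(fingerprint, labeled, k=3):
    """Stand-in surrogate: mean docked score of the k most similar labeled compounds."""
    nearest = sorted(labeled, key=lambda item: tanimoto(fingerprint, item[0]), reverse=True)[:k]
    return sum(score for _, score in nearest) / len(nearest)


def active_learning(pool, dock, n_init, batch_size, n_cycles, seed=0):
    """Dock a random seed set, then greedily dock the top-predicted batch each cycle."""
    rng = random.Random(seed)
    docked = {cid: dock(pool[cid]) for cid in rng.sample(sorted(pool), n_init)}
    for _ in range(n_cycles):
        labeled = [(pool[cid], score) for cid, score in docked.items()]  # "retrain" surrogate
        remaining = [cid for cid in pool if cid not in docked]
        remaining.sort(key=lambda cid: predict(pool[cid], labeled), reverse=True)
        for cid in remaining[:batch_size]:  # dock only the selected batch
            docked[cid] = dock(pool[cid])
    return docked


# Toy pool: 200 hypothetical compounds with random 12-bit fingerprints out of 64 bits.
_rng = random.Random(1)
toy_pool = {i: frozenset(_rng.sample(range(64), 12)) for i in range(200)}
_ideal = frozenset(range(12))


def toy_dock(fingerprint):
    """Hypothetical docking oracle: overlap with an arbitrary 'ideal' fingerprint."""
    return float(len(fingerprint & _ideal))


toy_docked = active_learning(toy_pool, toy_dock, n_init=10, batch_size=5, n_cycles=4)
print(f"docked {len(toy_docked)} of {len(toy_pool)} compounds")
```

Here only 30 of 200 toy compounds are docked; at library scale the same loop structure is what keeps the docked fraction small, with the final surrogate used to rank the undocked remainder.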
After AL screening, we will generate one or more pharmacophore models based on the insights from the retrospective study, by selecting diverse MCHR1 antagonists (preferably without hERG alerts) (5). Pharmacophore-based ranking considers the 3D shape and pharmacophoric features (acceptor, donor, hydrophobic, and so on) of known antagonists, whereas the AL approach predicts only the global docking score based on fingerprint similarity. The generated pharmacophore model(s) will be used to screen the 100k AL-derived hits and to re-rank them by RMSD to the reference antagonists. If the number of promising hits is low, we will select the next 100k hits from the AL screening. We envision docking at most 25,000 compounds using FITTED against the MD-optimized structure.
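As a simple illustration of RMSD-based re-ranking, the sketch below scores each hit by its lowest RMSD to any reference antagonist and sorts the hits accordingly. It assumes that pharmacophoric feature points have already been matched pairwise and superposed in a common frame, which the pharmacophore software would normally handle; the function names, hit identifiers, and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of matched 3D feature points (no superposition)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("feature lists must be matched one-to-one")
    squared = sum((a - b) ** 2
                  for point_a, point_b in zip(coords_a, coords_b)
                  for a, b in zip(point_a, point_b))
    return math.sqrt(squared / len(coords_a))


def rerank_by_rmsd(hits, references):
    """Sort hits by their lowest RMSD to any reference; hits = {hit_id: feature coords}."""
    best = {hit_id: min(rmsd(coords, ref) for ref in references)
            for hit_id, coords in hits.items()}
    return sorted(best.items(), key=lambda item: item[1])


# Toy data: one reference pharmacophore (three feature points) and two hypothetical hits.
toy_reference = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
toy_hits = {
    "hit_a": [(0.0, 0.0, 1.0), (3.0, 0.0, 1.0), (0.0, 4.0, 1.0)],  # shifted 1.0 along z
    "hit_b": [(0.5, 0.0, 0.0), (3.5, 0.0, 0.0), (0.5, 4.0, 0.0)],  # shifted 0.5 along x
}
toy_ranking = rerank_by_rmsd(toy_hits, [toy_reference])
print(toy_ranking)
```

Taking the minimum over references lets a hit match any one of the diverse reference antagonists rather than an average of all of them.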
In the third and final step, we will compute scores for the docked poses using top-performing approaches (consensus) from the benchmarking study. During the visual analysis, we will carefully examine the binding poses, key interactions, and consensus scoring to determine the final list of 150 compounds.
References:
(1) Lim, G.; You, K. Y.; Lee, J. H.; Jeon, M. K.; Lee, B. H.; Ryu, J. Y.; Oh, K.-S. Identification and New Indication of Melanin-Concentrating Hormone Receptor 1 (MCHR1) Antagonist Derived from Machine Learning and Transcriptome-Based Drug Repositioning Approaches. Int. J. Mol. Sci. 2022, 23 (7), 3807. https://doi.org/10.3390/ijms23073807.
(2) Kowalski, T. J.; Sasikumar, T. Melanin-Concentrating Hormone Receptor-1 Antagonists as Antiobesity Therapeutics: Current Status. BioDrugs Clin. Immunother. Biopharm. Gene Ther. 2007, 21 (5), 311–321. https://doi.org/10.2165/00063030-200721050-00003.
(3) Johansson, A. Evolution of Physicochemical Properties of Melanin Concentrating Hormone Receptor 1 (MCHr1) Antagonists. Bioorg. Med. Chem. Lett. 2016, 26 (19), 4559–4564. https://doi.org/10.1016/j.bmcl.2016.08.072.
(4) Johansson, A.; Löfberg, C. Novel MCH1 Receptor Antagonists: A Patent Review. Expert Opin. Ther. Pat. 2015, 25 (2), 193–207. https://doi.org/10.1517/13543776.2014.993382.
(5) Igawa, H.; Takahashi, M.; Kakegawa, K.; Kina, A.; Ikoma, M.; Aida, J.; Yasuma, T.; Kawata, Y.; Ashina, S.; Yamamoto, S.; Kundu, M.; Khamrai, U.; Hirabayashi, H.; Nakayama, M.; Nagisa, Y.; Kasai, S.; Maekawa, T. Melanin-Concentrating Hormone Receptor 1 Antagonists Lacking an Aliphatic Amine: Synthesis and Structure-Activity Relationships of Novel 1-(Imidazo[1,2-a]Pyridin-6-Yl)Pyridin-2(1H)-One Derivatives. J. Med. Chem. 2016, 59 (3), 1116–1139. https://doi.org/10.1021/acs.jmedchem.5b01704.
(6) Mariani, V.; Biasini, M.; Barbato, A.; Schwede, T. lDDT: A Local Superposition-Free Score for Comparing Protein Structures and Models Using Distance Difference Tests. Bioinformatics 2013, 29 (21), 2722–2728. https://doi.org/10.1093/bioinformatics/btt473.
(7) Moitessier, N.; Pottel, J.; Therrien, E.; Englebienne, P.; Liu, Z.; Tomberg, A.; Corbeil, C. R. Medicinal Chemistry Projects Requiring Imaginative Structure-Based Drug Design Methods. Acc. Chem. Res. 2016, 49 (9), 1646–1657. https://doi.org/10.1021/acs.accounts.6b00185.
(8) Burai-Patrascu, M.; Nivedha, A. K.; Rostaing, O.; Chukka, P.; Moitessier, N.; Pottel, J. The First CACHE Challenge–Identifying Binders of the WD-Repeat Domain of Leucine-Rich Repeat Kinase 2. ChemRxiv 2022.
(9) McNutt, A. T. GNINA 1.0: Molecular Docking with Deep Learning. J. Cheminformatics 2021, 13, 43.
(10) Yang, M.; Bo, Z.; Xu, T.; Xu, B.; Wang, D.; Zheng, H. Uni-GBSA: An Open-Source and Web-Based Automatic Workflow to Perform MM/GB(PB)SA Calculations for Virtual Screening. Brief. Bioinform. 2023, 24 (4), bbad218. https://doi.org/10.1093/bib/bbad218.
(11) Yang, Y.; Yao, K.; Repasky, M. P.; Leswing, K.; Abel, R.; Shoichet, B. K.; Jerome, S. V. Efficient Exploration of Chemical Space with Docking and Deep Learning. J. Chem. Theory Comput. 2021, 17 (11), 7106–7119. https://doi.org/10.1021/acs.jctc.1c00810.
(12) Graff, D. E.; Shakhnovich, E. I.; Coley, C. W. Accelerating High-Throughput Virtual Screening through Molecular Pool-Based Active Learning. Chem. Sci. 2021, 12 (22), 7866–7881. https://doi.org/10.1039/d0sc06805e.