Challenge #1

Hit Identification

Method type (check all that apply)

High-throughput docking

Machine learning

Other (specify)

Lean-Docking

Description of your approach (min 200 and max 800 words)

To identify binders of the central cavity of the WD40 repeat (WDR) domain of LRRK2, we will follow a multi-strategy approach to increase the probability of finding potent hits. The current release of the Enamine-REAL database (~4B compounds) will be screened with each strategy, and the top-ranking compounds will be selected. Specifically, we propose four strategies and will select approximately 25 compounds from each.
(1) Hybrid LB-SBVS: The Enamine-REAL database will be screened for compounds similar in shape and electrostatic potential to compounds known to bind the central pocket of WD40 repeat domain-containing proteins such as WDR5 and EED. Highly potent examples include OICR-9429, DDO-2213, EED-A-395 and EED226. Compounds with similarity above a chosen cutoff will be docked to the central cavity of the LRRK2 WD40 repeat domain. All docked compounds will be ranked by docking score, and 25 compounds will be selected.
(2) Lean-docking: Here, we will use our recently developed approach for screening ultra-large compound libraries. Briefly, only a small subset of the Enamine-REAL database will be docked to the central cavity of the WDR domain. The docking scores of this subset will then be used to train regressors that predict docking scores for the full Enamine-REAL library. The top 100K compounds by predicted score will then be redocked, and the best 25 compounds will be selected.
(3) Covalent docking: In this strategy, we will use covalent docking to identify covalent binders of the cysteine residues (C2154, C2201, C2247, C2250, C2302 and C2418) lining the central cavity. First, the Enamine-REAL database will be filtered for compounds bearing cysteine-reactive warheads such as acrylamides, chloroacetamides, acrylonitriles and disulfides. The resulting cysteine-focused library will be covalently docked to the WDR domain of LRRK2, targeting one cysteine at a time. The docking scores from all runs will be merged, and the 25 best compounds will be selected.
(4) LB-SBVS: Here, a focused library of putative WDR binders will be created. We will first gather all known central-cavity binders of WD40 repeat domain-containing proteins (N ligands). To this set we will add a "random chemical background" of size 20*N, drawn at random from ChEMBL and excluding the N known ligands. Molecules will be encoded as counted atom-pair fingerprints, and an L2-regularized logistic regression classifier will be trained. We will use this classifier to predict possible binders of the WD40 domain. Compounds in this library will be prioritized by docking them to the central cavity of the LRRK2 WDR domain, and 25 compounds will be selected.
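
The lean-docking workflow of strategy (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published implementation: `toy_dock` is a hypothetical stand-in for a real docking program, compounds are described by made-up two-number feature vectors rather than fingerprints, and a k-nearest-neighbour regressor stands in for whichever regressors are actually trained. The subset size, shortlist size and number of selected compounds are scaled down from the values given above.

```python
import random


def toy_dock(features):
    """Hypothetical stand-in for a docking run; lower scores are better."""
    return -(2.0 * features[0] + features[1])


def knn_predict(query, train_X, train_y, k=3):
    """Predict a docking score as the mean score of the k nearest docked compounds."""
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(query, x)), y)
        for x, y in zip(train_X, train_y)
    )
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


def lean_docking(library, dock, subset_size, top_k, n_select, seed=0):
    """library maps compound ID -> feature vector; returns the n_select best IDs."""
    rng = random.Random(seed)
    ids = sorted(library)

    # 1. Dock only a small random subset of the library.
    subset = rng.sample(ids, subset_size)
    docked = {cid: dock(library[cid]) for cid in subset}

    # 2. Use the docked subset to predict scores for every undocked compound.
    train_X = [library[cid] for cid in subset]
    train_y = [docked[cid] for cid in subset]
    estimate = dict(docked)
    for cid in ids:
        if cid not in docked:
            estimate[cid] = knn_predict(library[cid], train_X, train_y)

    # 3. Redock the top_k compounds by estimated score (existing real scores are reused).
    shortlist = sorted(ids, key=estimate.get)[:top_k]
    final = {cid: docked[cid] if cid in docked else dock(library[cid])
             for cid in shortlist}

    # 4. Keep the best compounds according to their real docking scores.
    return sorted(final, key=final.get)[:n_select]


# Toy usage: 200 hypothetical compounds, 40 docked up front, 30 shortlisted.
rng = random.Random(1)
library = {f"cpd{i:03d}": (rng.random(), rng.random()) for i in range(200)}
selected = lean_docking(library, toy_dock, subset_size=40, top_k=30, n_select=5)
```

In this sketch at most 70 of the 200 toy compounds are ever docked. The final ranking uses real docking scores only, so the regressor influences which compounds are shortlisted but not how the shortlist is ordered.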

Method Name

ToKenVS

Commercial software packages used

OpenEye Software, Schrödinger Suite

Free software packages used

RDKit, FTMap, MayaChemTools, PyMOL, R

Relevant publications of previous uses by your group of this software/method

(1) Berenger, F., Kumar, A., Zhang, K. Y. J. & Yamanishi, Y. (2021). Lean-docking: Exploiting ligands’ predicted docking scores to accelerate molecular docking. J Chem Inf Model, 61(5), 2341–2352. (2) Jiang, X., Kumar, A., Motomura, Y., Liu, T., Zhou, Y., Moro, K., Zhang, K. Y. J. & Yang, Q. (2020). A series of compounds bearing a dipyrido-pyrimidine scaffold acting as novel human and insect pest chitinase inhibitors. J Med Chem, 63(3), 987–1001. (3) Berenger, F. & Yamanishi, Y. (2020). Ranking Molecules with Vanishing Kernels and a Single Parameter: Active Applicability Domain Included. J Chem Inf Model, 60(9), 4376–4387. (4) Kumar, A. & Zhang, K. Y. J. (2018). Shape similarity guided pose prediction: Lessons from D3R grand challenge 3. J Comput Aided Mol Des, 33(1), 47–59. (5) Matsuoka, M., Kumar, A., Muddassar, M., Matsuyama, A., Yoshida, M. & Zhang, K. Y. J. (2017). Discovery of fungal denitrification inhibitors by targeting copper nitrite reductase from fusarium oxysporum. J Chem Inf Model, 57(2), 203–213. (6) Jiang, X., Kumar, A., Liu, T., Zhang, K. Y. J. & Yang, Q. (2016). A novel scaffold for developing specific or broad-spectrum chitinase inhibitors. J Chem Inf Model, 56(12), 2413–2420. (7) Kumar, A., Ito, A., Hirohama, M., Yoshida, M. & Zhang, K. Y. (2016). Identification of new sumo activating enzyme 1 inhibitors using virtual screening and scaffold hopping. Bioorg Med Chem Lett, 26(4), 1218–23. (8) Kumar, A., Ito, A., Takemoto, M., Yoshida, M. & Zhang, K. Y. (2014). Identification of 1,2,5-oxadiazoles as a new class of senp2 inhibitors using structure based virtual screening. J Chem Inf Model, 54(3), 870–80. (9) Kumar, A., Ito, A., Hirohama, M., Yoshida, M. & Zhang, K. Y. (2014). Identification of sumoylation inhibitors targeting a predicted pocket in ubc9. J Chem Inf Model, 54(10), 2784–93.

Virtual screening of merged selections

Method type (check all that apply)

Deep learning

High-throughput docking

Other (specify)

Shape Similarity Search

Description of your approach (min 200 and max 800 words)

Because this stage is independent of the hit identification and optimization stages and requires predicting actives among the compounds selected and submitted by all participants, we will use it to test our deep learning-based binding affinity prediction method. The method will be used to rank-order the compounds submitted by all participants, with active compounds expected to rank higher than inactives. We will first predict the binding poses of all compounds in the central pocket of the LRRK2 WDR domain. For pose prediction, we will employ our previously published PoPSS methodology, which uses shape alignment with existing crystal ligands of the target or of homologous proteins. PoPSS identifies the ligand conformation with the highest shape similarity to the crystal ligands. This conformation is then placed into the target protein binding pocket and refined using side-chain repacking and Monte Carlo energy minimization. Co-crystal structures of ligands bound to the central pocket of WDR domains will be used for the shape alignment. Once the binding poses of all ligands are predicted, they will be scored using a deep learning-based binding affinity prediction model. The model will be a deep convolutional neural network with a single output neuron that predicts the biological activity. It will be trained and tested on a dataset of about 23,000 experimental protein-ligand complexes and their associated biological activities from the PDBbind database v2020. The dataset will be divided in an 80:10:10 ratio based on the distribution of biological activities and chemical structures. To use the 3D structures of the protein-ligand complexes as network input, a specific transformation and encoding will be applied. Training and validation of the network will be performed using TensorFlow and Keras.
The model with the best prediction performance will be used to predict the binding affinities of the compounds submitted by all participants in the CACHE challenge. The rank ordering of all compounds by predicted binding affinity will be submitted.
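
One simple way to obtain an 80:10:10 split that follows the activity distribution is to sort the complexes by activity, cut the sorted list into bins, and split each bin separately. The sketch below is illustrative only. It assumes the three parts are training, validation and test sets, and it stratifies on activity alone; balancing by chemical structure (for example by scaffold or cluster) is not shown. The function name `stratified_split`, the bin count and the toy activity values are hypothetical.

```python
import random


def stratified_split(activities, n_bins=10, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split complex IDs so each part covers the whole activity range.

    activities maps complex ID -> activity value (e.g. a pK value).
    Returns three lists of IDs: train, validation, test.
    """
    rng = random.Random(seed)
    # Sort by activity; the ID breaks ties so the result is reproducible.
    ranked = sorted(activities, key=lambda cid: (activities[cid], cid))
    bin_size = -(-len(ranked) // n_bins)  # ceiling division

    train, valid, test = [], [], []
    for start in range(0, len(ranked), bin_size):
        block = ranked[start:start + bin_size]
        rng.shuffle(block)  # random assignment within one activity bin
        n_train = round(fractions[0] * len(block))
        n_valid = round(fractions[1] * len(block))
        train += block[:n_train]
        valid += block[n_train:n_train + n_valid]
        test += block[n_train + n_valid:]  # remainder of the bin
    return train, valid, test


# Toy usage: 1000 hypothetical complexes with random activity values.
rng = random.Random(42)
toy_activities = {f"cplx{i:04d}": rng.uniform(2.0, 11.0) for i in range(1000)}
train_ids, valid_ids, test_ids = stratified_split(toy_activities)
```

Because every activity bin contributes to all three parts in the same proportion, the validation and test sets span the same range of activities as the training set.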

Method Name

PoPSS-Net

Commercial software packages used

OpenEye Software

Free software packages used

RDKit, DeepChem, MayaChemTools, TensorFlow, Keras, PyMOL, R

Relevant publications of previous uses by your group of this software/method

(1) Kumar, A. & Zhang, K. Y. J. (2019). Improving ligand 3d shape similarity-based pose prediction with a continuum solvent model. J Comput Aided Mol Des, 33(12), 1045–1055. (2) Kumar, A. & Zhang, K. Y. J. (2018). A cross docking pipeline for improving pose prediction and virtual screening performance. J Comput Aided Mol Des, 32, 163–173. (3) Kumar, A. & Zhang, K. Y. J. (2018). Shape similarity guided pose prediction: Lessons from D3R grand challenge 3. J Comput Aided Mol Des, 33(1), 47–59. (4) Kumar, A. & Zhang, K. Y. (2016). Application of shape similarity in pose selection and virtual screening in csardock2014 exercise. J Chem Inf Model, 56 (6), 965–73. (5) Kumar, A. & Zhang, K. Y. (2016). A pose prediction approach based on ligand 3d shape similarity. J Comput Aided Mol Des, 30(6), 457–69. (6) Kumar, A. & Zhang, K. Y. (2016). Prospective evaluation of shape similarity based pose prediction method in D3R grand challenge 2015. J Comput Aided Mol Des, 30(9), 685–693.

Hit Optimization Methods

Method type (check all that apply)

High-throughput docking

Physics-based

Other (specify)

2D and 3D similarity search, SAR

Description of your approach (min 200 and max 800 words)

The hit optimization strategy used in this stage will depend on the hits and on which hit-finding strategy they came from. In general, active hits from the first stage will be used as similarity search queries to identify structurally similar compounds by screening the Enamine-REAL database again. These compounds will be prioritized by molecular docking, binding free energy calculations and structure-activity relationships. In addition, compound properties will be predicted to prioritize compounds with better physicochemical and ADMET properties. Finally, another set of 100 compounds will be selected.
(1) Hit optimization strategy for covalent binders: Substructure and structural similarity searches will be used to identify compounds with the same warhead as the query compound but different functional groups on the non-covalent moiety. Additionally, compounds with the same non-covalent moiety but different warhead groups of similar reactivity will be identified. The compounds will be prioritized for second-round selection based on covalent docking and on quantum mechanical calculations performed to assess the covalent reactivity of the selected compounds.
(2) Hit optimization strategy for non-covalent binders: If the active hit is a non-covalent binder, conventional hit optimization approaches will be used. Active compounds from the first stage will be used as queries to retrieve compounds with similar structural and chemical properties. The Enamine-REAL database will again be used as the screening library. Substructure searches, 2D and 3D structural fingerprint matching, and shape and electrostatic potential matching will be carried out to retrieve similar compounds. Our priority will be to identify compounds with the same scaffold as the active hits. If too few compounds with the same scaffold are available, scaffold hopping methods will be used to identify related chemical scaffolds.
All compounds will be analyzed by docking them to the central pocket of the LRRK2 WDR domain. A smaller set of compounds will also be analyzed using molecular dynamics simulations to calculate binding free energies. If enough active hits from the hit-finding stage are available, structure-activity relationship analysis will also be performed to facilitate compound selection. Compounds with better docking scores, better binding free energies, better physicochemical and ADMET properties, and interactions similar to those formed by the active hits will be prioritized.
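
The 2D fingerprint matching step can be illustrated with a Tanimoto similarity search. This is a minimal sketch under stated assumptions: each fingerprint is represented as a set of "on" bit indices, the fingerprints and compound IDs are invented for the example, and the 0.6 cutoff is arbitrary rather than a value taken from the plan above. In practice the fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def similarity_search(queries, library, cutoff=0.6):
    """Find library compounds similar to at least one query hit.

    queries and library map compound ID -> set of on-bit indices;
    queries must not be empty.
    Returns (compound ID, best similarity, closest query ID) tuples,
    most similar first.
    """
    hits = []
    for cid, fp in library.items():
        if cid in queries:
            continue  # skip the query compounds themselves
        best_sim, best_query = max(
            (tanimoto(fp, qfp), qid) for qid, qfp in queries.items()
        )
        if best_sim >= cutoff:
            hits.append((cid, best_sim, best_query))
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Toy usage with hypothetical fingerprints.
active_hits = {"hit1": {1, 2, 3, 4}}
toy_library = {
    "a": {1, 2, 3, 4, 5},  # shares 4 of 5 bits with hit1
    "b": {1, 2, 9, 10},    # shares 2 of 6 bits with hit1
    "c": {1, 2, 3},        # shares 3 of 4 bits with hit1
}
analogues = similarity_search(active_hits, toy_library, cutoff=0.6)
```

Recording the closest query alongside each analogue keeps track of which active hit a candidate derives from, which is useful when the later docking and structure-activity analysis are organised per hit series.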

Method Name

ToKenVS

Commercial software packages used

OpenEye Software, Schrödinger Suite

Free software packages used

RDKit, GROMACS, MayaChemTools, PyMOL, R