The hit identification and drug discovery strategy consists of high-throughput docking to identify modulators of the NSP3 macrodomain of SARS-CoV-2.
A large library of commercially available compounds (Enamine) will be downloaded and prepared using the LigPrep workflow in Maestro (Schrödinger, Inc.). Briefly, the main tautomers of each compound will be generated, and only compounds with reasonable physicochemical properties will be retained (e.g., molecular weight, numbers of hydrogen-bond acceptors and donors, number of rings, log P/D, polar surface area (PSA), ligand flexibility, and number of undefined chiral centers). We will also apply our proprietary filters to eliminate reactive and chemically unstable compounds, compounds with undesirable functional groups, and Pan-Assay INterference compoundS (PAINS).
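The property-based filtering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Descriptors` record, the `LIMITS` cut-offs, and the `has_alert` flag are hypothetical stand-ins for descriptors and substructure filters computed elsewhere (the actual thresholds and proprietary filters are not given here), and the sketch does not call LigPrep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one compound (hypothetical record layout)."""
    compound_id: str
    mol_weight: float
    hb_acceptors: int
    hb_donors: int
    rings: int
    logp: float
    psa: float
    rotatable_bonds: int          # used here as a proxy for ligand flexibility
    undefined_stereocenters: int
    has_alert: bool               # True if a reactive/unstable/PAINS filter matched


# Illustrative (min, max) cut-offs only; the real values are an assumption.
LIMITS = {
    "mol_weight": (200.0, 500.0),
    "hb_acceptors": (0, 10),
    "hb_donors": (0, 5),
    "rings": (1, 5),
    "logp": (-1.0, 5.0),
    "psa": (20.0, 140.0),
    "rotatable_bonds": (0, 10),
    "undefined_stereocenters": (0, 1),
}


def passes_filters(d, limits=LIMITS):
    """Reject flagged compounds, then require every property to lie in its range."""
    if d.has_alert:
        return False
    return all(lo <= getattr(d, name) <= hi for name, (lo, hi) in limits.items())


def filter_library(records, limits=LIMITS):
    """Keep only the compounds that pass all filters, preserving input order."""
    return [d for d in records if passes_filters(d, limits)]
```

Keeping the cut-offs in a single dictionary makes it easy to tighten or relax individual properties without touching the filtering logic.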
Docking simulations will be performed on several PDB structures (the most representative subset of 7BF5, 6Z6I, 6W02, 6Z5T, 7TX5, 7KQP, 7TWX and 6WOJ). These structures contain small organic ligands that will help define pharmacophore constraints during the virtual screening stage; consequently, no blind docking will be performed in this challenge. Additional binding pockets may also be identified using SiteMap (Schrödinger, Inc.) and MolSoft ICM Pocket Finder.
The virtual screening stage will consist of a multistep molecular docking workflow using the Glide docking tool (Schrödinger, Inc.). First, all compounds will be docked using the fast HTVS scoring function. The top 500K compounds will then be docked using the SP function, and the top 50K compounds will subsequently be docked using the more accurate XP function. For each PDB structure considered, the top 20K poses from the SP function and the top 5K poses from the XP function will be merged before the refinement stage.
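The selection logic of this funnel, for a single PDB structure, can be sketched as follows. The sketch makes several assumptions: the `htvs`, `sp` and `xp` arguments are hypothetical callables standing in for the actual Glide runs (Glide itself is not invoked), each compound is reduced to one score per stage, lower scores are taken to be better, and the 50K compounds passed to XP are taken from the SP ranking.

```python
def top_n(scores, n):
    """Return the n best (compound_id, score) pairs; lower scores are better."""
    return sorted(scores.items(), key=lambda kv: kv[1])[:n]


def docking_funnel(compounds, htvs, sp, xp,
                   n_sp=500_000, n_xp=50_000,
                   keep_sp=20_000, keep_xp=5_000):
    """Run the HTVS -> SP -> XP funnel and merge the retained SP and XP results.

    htvs, sp and xp are hypothetical callables mapping a compound id to a score.
    Returns {compound_id: {"SP": score, "XP": score}} with only the keys for
    which the compound was retained.
    """
    # Stage 1: score everything with the fast function.
    htvs_scores = {c: htvs(c) for c in compounds}

    # Stage 2: rescore the best HTVS compounds with SP.
    sp_scores = {c: sp(c) for c, _ in top_n(htvs_scores, n_sp)}

    # Stage 3: rescore the best SP compounds with XP.
    xp_scores = {c: xp(c) for c, _ in top_n(sp_scores, n_xp)}

    # Merge the top SP and top XP results, keeping both scores when available.
    merged = {}
    for c, s in top_n(sp_scores, keep_sp):
        merged.setdefault(c, {})["SP"] = s
    for c, s in top_n(xp_scores, keep_xp):
        merged.setdefault(c, {})["XP"] = s
    return merged
```

Merging by compound identifier means a compound retained by both SP and XP appears once, with both scores attached, which simplifies the later rescoring and inspection steps.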
The extracted poses will then be re-evaluated with the independent SeeSAR tool (BioSolveIT), using the HYDE scoring function with key structural waters added. This structure-based software will mainly be used to discard putative false positives from the virtual screening (e.g., poses with geometry or energy warnings).
We will also explore an MM/ML approach, which approximates high-level ab initio binding energies with a fast machine learning model, to rescore and prioritize the final list of virtual hits for ordering.
Finally, experienced computational and medicinal chemists will visually inspect all retained poses, taking into account the recommendations from the methods above (Glide SP and XP scores, and HYDE and MM/ML scores that take structural waters into account), to select the most promising compounds for experimental testing.
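One simple way to organize the poses for this inspection is to order them by their mean rank across the scoring methods. The sketch below is an assumption rather than the procedure actually used: the function names are hypothetical, every score table is assumed to be oriented so that lower values are better, and a pose missing from one method is averaged over the remaining ones.

```python
def ranks(scores):
    """Map each pose id to its 1-based rank; the lowest score gets rank 1."""
    ordered = sorted(scores, key=scores.get)
    return {pose: i for i, pose in enumerate(ordered, start=1)}


def consensus_order(score_tables):
    """Order pose ids by mean rank over the methods that scored them.

    score_tables maps a method name to a {pose_id: score} dictionary.
    Ties in mean rank are broken alphabetically by pose id.
    """
    per_method = [ranks(table) for table in score_tables.values()]
    poses = set().union(*(r.keys() for r in per_method))
    mean_rank = {}
    for pose in poses:
        observed = [r[pose] for r in per_method if pose in r]
        mean_rank[pose] = sum(observed) / len(observed)
    return sorted(mean_rank, key=lambda p: (mean_rank[p], p))
```

Rank averaging avoids having to put scores with different units on a common scale, but it is only a sorting aid; the final selection remains with the chemists inspecting the poses.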