Small-molecule libraries will be obtained from the ZINC and Mcule purchasable-compound databases, and standard filters will be applied to remove duplicates. An in-house evolutionary chemical binding similarity (ECBS) method (PMID: 31504818) will then be used for primary virtual screening of the curated database. The ECBS method is designed to encode molecular features enriched in evolutionarily conserved chemical-target binding relationships and is based on the likelihood that chemical compounds bind to identical targets. Chemical pairs, their targets, and the associated evolutionary information were collected and used to build the ECBS model through classification similarity learning. Specifically, the ECBS model was trained to distinguish evolutionarily related chemical pairs (ERCPs) from ‘unrelated chemical pairs’, and its output value is a chemical similarity score that prioritizes ERCPs.
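The duplicate-removal step can be illustrated with a minimal sketch. It assumes that each vendor record already carries a standard InChIKey and that two records with the same InChIKey are the same compound; the actual filters are not specified here and may differ. The vendor IDs and the function name `merge_and_deduplicate` are hypothetical placeholders.

```python
from typing import Dict, Iterable, List, Tuple

# A record is (vendor_id, inchikey). The vendor IDs used below are
# made-up placeholders, not real ZINC or Mcule identifiers.
Record = Tuple[str, str]


def merge_and_deduplicate(*catalogs: Iterable[Record]) -> Dict[str, List[str]]:
    """Merge vendor catalogs, keeping one entry per InChIKey.

    Returns a mapping InChIKey -> vendor IDs sharing that key,
    in first-seen order.
    """
    merged: Dict[str, List[str]] = {}
    for catalog in catalogs:
        for vendor_id, inchikey in catalog:
            key = inchikey.strip().upper()  # normalize before comparing
            ids = merged.setdefault(key, [])
            if vendor_id not in ids:
                ids.append(vendor_id)
    return merged


# Toy input: aspirin appears in both catalogs and should be merged.
zinc = [
    ("ZINC-A", "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"),  # aspirin
    ("ZINC-B", "RYYVLZVUVIJVGH-UHFFFAOYSA-N"),  # caffeine
]
mcule = [
    ("MCULE-X", "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"),  # aspirin again
    ("MCULE-Y", "HEFNNWSXXWATRW-UHFFFAOYSA-N"),  # ibuprofen
]
library = merge_and_deduplicate(zinc, mcule)
```

Keying on the InChIKey rather than on the raw SMILES string avoids treating two differently written SMILES of the same structure as distinct compounds, at the cost of depending on how each vendor generated its keys.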
The target-specific ensemble ECBS (TS-ensECBS) model is trained specifically to recognize chemical pairs that bind to a given virtual screening (VS) target, and it is therefore the ECBS variant that will be used for VS. The TS-ensECBS model restricts ERCPs to those whose targets are evolutionarily linked to the VS target, and it integrates multiple ECBS models, each based on a different definition of evolutionary information about the VS target, so that various kinds of evolutionary information are reflected. The TS-ensECBS model assigns each chemical a similarity score between 0 and 1; the higher the score, the more likely the chemical is to bind to the VS target. The ECBS models are described in our previous work (PMID: 31504818).
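The idea of combining several models into one score in [0, 1] and ranking compounds by it can be sketched as follows. This is only an illustration: it assumes the per-model scores are combined by a plain arithmetic mean, which may not be how TS-ensECBS integrates its component models (see PMID: 31504818 for the actual method). The names `ensemble_score` and `top_k`, and the toy score tables, are hypothetical.

```python
from statistics import mean
from typing import Callable, Iterable, List, Sequence, Tuple

# A model maps a compound identifier to a similarity score in [0, 1].
ScoreFn = Callable[[str], float]


def ensemble_score(compound: str, models: Sequence[ScoreFn]) -> float:
    """Average the scores of all component models (assumed scheme)."""
    scores = [model(compound) for model in models]
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score {s} for {compound} is outside [0, 1]")
    return mean(scores)


def top_k(compounds: Iterable[str], models: Sequence[ScoreFn],
          k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring compounds, best first."""
    scored = [(c, ensemble_score(c, models)) for c in compounds]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy stand-ins for two component models (values are illustrative only).
model_a = {"cpd1": 0.75, "cpd2": 0.25, "cpd3": 0.5}
model_b = {"cpd1": 0.25, "cpd2": 0.25, "cpd3": 1.0}
models = [model_a.__getitem__, model_b.__getitem__]

shortlist = top_k(["cpd1", "cpd2", "cpd3"], models, k=2)
```

Because every component score lies in [0, 1], their mean does too, so the combined value can still be read as a similarity score on the same scale.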
The top-scoring compounds from the TS-ensECBS model will be docked using AutoDock Vina, AutoDock 4.2, and DOCK 6.2. The crystal structures with PDB IDs 6Z5T, 6W02, and 7BF5 will be used for the docking procedure. The docked complexes will then be subjected to molecular dynamics simulations using the AMBER software, and protein-ligand binding affinities will be calculated using RASPD+. Molecules selected on the basis of consensus scores from docking and RASPD+ will be clustered to find their most common substructures. To select the binding pose for the final-state binding free energy calculation, the docking poses of the compounds shortlisted by both methods will be scored with a scoring function that evaluates the protein-ligand interaction. The final chemicals for experimental validation will be chosen by considering pairwise chemical similarity scores, visual inspection, and binding free energy scores. To ensure that the top hits have acceptable ADME properties, QED scores, and synthesizability, we will filter them using the SwissADME tool.
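One simple way to form a consensus between docking and RASPD+ is to rank compounds separately under each method and average the ranks. The sketch below assumes this rank-averaging scheme and assumes that, for both methods, a more negative value means stronger predicted binding; the consensus scheme actually used is not specified above. The function names and the toy scores are hypothetical.

```python
from typing import Dict, List, Tuple


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank compounds from 1 (most negative score) upward.

    Ties are broken alphabetically by compound name so the result
    is deterministic.
    """
    ordered = sorted(scores, key=lambda name: (scores[name], name))
    return {name: i + 1 for i, name in enumerate(ordered)}


def consensus_rank(*score_tables: Dict[str, float]) -> List[Tuple[str, float]]:
    """Average per-method ranks over compounds scored by every method."""
    common = set.intersection(*(set(t) for t in score_tables))
    per_method = [ranks({c: t[c] for c in common}) for t in score_tables]
    average = {
        c: sum(r[c] for r in per_method) / len(per_method) for c in common
    }
    # Best (lowest) average rank first; name as a deterministic tie-break.
    return sorted(average.items(), key=lambda pair: (pair[1], pair[0]))


# Toy scores in kcal/mol (illustrative values, not results).
docking = {"cpd1": -9.1, "cpd2": -7.4, "cpd3": -8.2}
raspd = {"cpd1": -8.0, "cpd2": -8.5, "cpd3": -6.9}

consensus = consensus_rank(docking, raspd)
```

Averaging ranks rather than raw scores sidesteps the fact that different programs report energies on different scales, although it discards information about how large the score differences are.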