In-stock 3D molecules from the ZINC20 database or the Mcule Purchasable collection will be deduplicated and then subjected to common filters, and conformers will be generated for the retained molecules.
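The curation step can be sketched as follows. This is a minimal illustration only, and several things in it are assumptions rather than part of the described protocol: each record is assumed to carry a precomputed structure identifier (here an InChIKey) and precomputed properties, the `Molecule` class and function names are hypothetical, and the rule-of-five style thresholds stand in for the unspecified "common filters". Conformer generation is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """Hypothetical record; all fields are assumed to be precomputed."""
    inchikey: str        # structure identifier used to detect duplicates
    source: str          # e.g. "ZINC20" or "Mcule"
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def deduplicate(molecules):
    """Keep the first occurrence of each structure identifier."""
    seen = set()
    unique = []
    for mol in molecules:
        if mol.inchikey not in seen:
            seen.add(mol.inchikey)
            unique.append(mol)
    return unique


def passes_filters(mol, max_weight=500.0, max_logp=5.0,
                   max_donors=5, max_acceptors=10):
    """Placeholder property filter; the thresholds are assumptions."""
    return (mol.mol_weight <= max_weight
            and mol.logp <= max_logp
            and mol.h_donors <= max_donors
            and mol.h_acceptors <= max_acceptors)


def curate(molecules):
    """Remove duplicates first, then apply the property filter."""
    return [mol for mol in deduplicate(molecules) if passes_filters(mol)]
```

Deduplicating before filtering mirrors the order stated above and avoids evaluating the same structure twice when it appears in both vendor collections.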
We will apply our proprietary evolutionary chemical binding similarity (ECBS) method (PMID: 31504818) for primary virtual screening of the curated database. ECBS evaluates chemical similarity in terms of the likelihood that compounds bind to related targets. Evolutionarily related proteins can share conserved sequences in their ligand binding sites. Therefore, a chemical that binds to one target may also bind to targets that are evolutionarily related to it.
The ECBS model is based on classification similarity-learning: a binary classifier distinguishes evolutionarily related chemical pairs (ERCPs) from unrelated pairs. A chemical pair is considered evolutionarily related when the targets of the two chemicals are the same or share an evolutionary annotation. Among the various ECBS models, the TS-ensECBS model will be used for virtual screening (VS) because it will be trained specifically to detect chemical pairs that bind to a predetermined VS target. The TS-ensECBS model defines ERCPs only from targets that are evolutionarily linked to the VS target, and it integrates multiple ECBS models built on different definitions of evolutionary information about the VS target, so that it reflects a variety of evolutionary information. The model assigns each chemical a similarity score between 0 and 1, with a higher score indicating a higher likelihood of binding to the VS target. The guiding concepts and the full model construction process for the ECBS models are described in our prior work (PMID: 31504818).
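The ERCP definition can be illustrated with a short labeling routine that produces the positive and negative pairs a binary classifier would be trained on. This is a sketch of the labeling rule only, not the ECBS implementation: the function names, the input dictionaries, and the example identifiers are hypothetical, and an "evolutionary annotation" is represented here simply as a set of labels per target.

```python
from itertools import combinations


def is_ercp(targets_a, targets_b, annotations):
    """Return True if two chemicals form an evolutionarily related pair.

    targets_a, targets_b: sets of target identifiers bound by each chemical.
    annotations: dict mapping a target to its set of evolutionary annotations
                 (hypothetical representation, e.g. family labels).
    """
    # Case 1: the two chemicals bind the same target.
    if targets_a & targets_b:
        return True
    # Case 2: their targets share at least one evolutionary annotation.
    ann_a = set().union(*(annotations.get(t, set()) for t in targets_a))
    ann_b = set().union(*(annotations.get(t, set()) for t in targets_b))
    return bool(ann_a & ann_b)


def label_pairs(chem_targets, annotations):
    """Label every chemical pair: 1 for an ERCP, 0 for an unrelated pair."""
    labels = {}
    for a, b in combinations(sorted(chem_targets), 2):
        related = is_ercp(chem_targets[a], chem_targets[b], annotations)
        labels[(a, b)] = int(related)
    return labels


# Illustrative data only; identifiers are made up.
chem_targets = {"c1": {"T1"}, "c2": {"T1"}, "c3": {"T2"}, "c4": {"T3"}}
annotations = {"T1": {"famA"}, "T2": {"famA"}, "T3": {"famB"}}
labels = label_pairs(chem_targets, annotations)
```

In a target-specific setting such as TS-ensECBS, the same rule would be restricted to targets evolutionarily linked to the VS target, and different choices of annotation would yield the different models that are integrated.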
The compounds with the highest TS-ensECBS scores will be docked with AutoDock Vina and AutoDock. The RNA-binding chains from the crystal structures with PDB IDs 7CXM, 5RLH, 5RLZ, 5RML, and 5RMM will be used for the docking procedure. Molecules will be selected based on consensus scores from both docking methods, then clustered, and the most common substructure among them will be identified. To choose a binding pose for end-state binding free energy calculations, the docking poses of the shortlisted compounds from both methods will be evaluated with scoring functions that assess protein-ligand interactions. The final chemicals for experimental validation will be chosen by considering pairwise chemical similarity scores, visual inspection, and binding free energy scores.
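One simple way to form a consensus from two docking programs is to average the per-program ranks, sketched below. The text does not specify how the consensus score is computed, so mean rank is an assumption made for illustration, as are the function names and the example scores. Docking scores are treated as binding energy estimates, where a more negative value is better.

```python
def rank_by_score(scores):
    """Map each compound to its rank (1 = best, i.e. most negative score)."""
    # Ties are broken by compound name so the result is deterministic.
    ordered = sorted(scores, key=lambda name: (scores[name], name))
    return {name: position + 1 for position, name in enumerate(ordered)}


def consensus_ranking(vina_scores, autodock_scores):
    """Order compounds by mean rank across the two docking methods.

    Only compounds scored by both methods are ranked.
    """
    shared = set(vina_scores) & set(autodock_scores)
    vina_rank = rank_by_score({c: vina_scores[c] for c in shared})
    autodock_rank = rank_by_score({c: autodock_scores[c] for c in shared})
    mean_rank = {c: (vina_rank[c] + autodock_rank[c]) / 2 for c in shared}
    return sorted(shared, key=lambda c: (mean_rank[c], c))


# Illustrative scores only (kcal/mol); not real docking results.
vina = {"cmpd_a": -9.0, "cmpd_b": -7.5, "cmpd_c": -8.2}
autodock = {"cmpd_a": -8.0, "cmpd_b": -8.5, "cmpd_c": -6.0}
ranking = consensus_ranking(vina, autodock)
```

Working with ranks rather than raw scores avoids having to put the two programs' scoring functions on a common scale.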