CACHE

CRITICAL ASSESSMENT OF COMPUTATIONAL HIT-FINDING EXPERIMENTS

Challenge #3

Hit Identification
Method type (check all that apply)
Free energy perturbation
High-throughput docking
Machine learning
Physics-based
Description of your approach (min 200 and max 800 words)

The small-molecule libraries will be obtained from the ZINC and Mcule purchasable databases, and common filters will be applied to remove duplicates. An in-house Evolutionary Chemical Binding Similarity (ECBS) method (PMID: 31504818) will then be used for the primary virtual screening of the curated database. The ECBS method is designed to encode molecular features enriched in evolutionarily conserved chemical-target binding relationships and is based on the likelihood that two chemical compounds bind to identical targets. The collected chemical pairs, target information, and evolutionary information were used to build the ECBS model through classification similarity learning. In particular, the ECBS model was developed to distinguish ERCPs (Evolutionarily Related Chemical Pairs) from ‘unrelated chemical pairs’, and its output values represent a chemical similarity score that prioritizes ERCPs.
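The pair-labeling idea behind the classification step can be illustrated with a minimal sketch. The compound names, target names, and family assignments below are hypothetical toy data, and treating "shares a target family" as the definition of an ERCP is an assumption made for illustration; the published ECBS model uses several levels of evolutionary information and a trained classifier, neither of which is reproduced here.

```python
from itertools import combinations

# Hypothetical toy data (not from the submission): the targets each compound
# binds, and one evolutionary family label per target.
BINDS = {
    "cpd_A": {"T1"},
    "cpd_B": {"T1", "T3"},
    "cpd_C": {"T2"},
    "cpd_D": {"T4"},
}
TARGET_FAMILY = {"T1": "fam_X", "T2": "fam_X", "T3": "fam_Y", "T4": "fam_Z"}


def is_ercp(cpd1, cpd2, binds=BINDS, family=TARGET_FAMILY):
    """Return True if the two compounds bind identical or same-family targets.

    Assumption: family overlap stands in for 'evolutionarily related'.
    Identical targets share a family, so they are covered as well.
    """
    fams1 = {family[t] for t in binds[cpd1]}
    fams2 = {family[t] for t in binds[cpd2]}
    return bool(fams1 & fams2)


def label_pairs(binds=BINDS, family=TARGET_FAMILY):
    """Label every compound pair: 1 = ERCP (positive), 0 = unrelated pair."""
    return {
        (a, b): int(is_ercp(a, b, binds, family))
        for a, b in combinations(sorted(binds), 2)
    }
```

Labels of this kind would serve as the positive and negative classes for a pairwise classifier; the features and learning algorithm are outside the scope of this sketch.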

The TS-ensECBS model (Target-Specific ensemble model) is specifically trained to recognize chemical pairs that bind to a given virtual screening (VS) target and is therefore the one used for VS among the different ECBS models. The TS-ensECBS model defines ERCPs only for targets that are evolutionarily linked to the VS target, and it integrates multiple ECBS models, each built on a different definition of evolutionary information about the VS target, so that various kinds of evolutionary information are reflected. The TS-ensECBS model assigns each chemical a similarity score between 0 and 1; the higher the score, the more likely the chemical is to bind to the VS target. The ECBS models are described in our previous work (PMID: 31504818).
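How several per-model scores might be combined into a single value between 0 and 1 and used to rank a library can be sketched as follows. The simple arithmetic mean, the function names, and the level names used as dictionary keys are assumptions for illustration; the actual integration scheme used by TS-ensECBS is described in the cited publication and may differ.

```python
from statistics import mean


def ensemble_score(model_scores):
    """Average per-model similarity scores (each in [0, 1]) into one score.

    Assumption: an unweighted mean; the real ensemble rule may differ.
    """
    for name, score in model_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} is outside [0, 1]: {score}")
    return mean(model_scores.values())


def rank_library(library_scores, top_n):
    """Return the top_n compound IDs, highest ensemble score first."""
    ranked = sorted(
        library_scores,
        key=lambda cpd: ensemble_score(library_scores[cpd]),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical scores from models built on different evolutionary levels.
library = {
    "cpd_1": {"motif": 0.9, "domain": 0.8, "family": 0.7, "superfamily": 0.6},
    "cpd_2": {"motif": 0.2, "domain": 0.3, "family": 0.1, "superfamily": 0.2},
    "cpd_3": {"motif": 0.5, "domain": 0.6, "family": 0.5, "superfamily": 0.4},
}
top_two = rank_library(library, top_n=2)
```

Only the top-ranked compounds from such a ranking would be passed on to the more expensive structure-based steps.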

The top-scoring compounds from the TS-ensECBS model will be carried forward to docking-based virtual screening with AutoDock Vina, AutoDock 4.2, and DOCK 6.2. The crystal structures with PDB IDs 6Z5T, 6W02, and 7BF5 will be used for the docking procedure. The docked complexes will subsequently be subjected to molecular dynamics simulations using the AMBER software, and protein-ligand binding affinities will be calculated using RASPD+. Molecules selected on the basis of consensus scores from both docking and RASPD+ will be clustered to find the most common substructures among them. To select the binding pose for the calculation of the final-state binding free energy, the docking poses of the compounds shortlisted by both methods will be scored with a scoring function that evaluates the protein-ligand interactions. The final chemicals for experimental validation will be chosen by considering pairwise chemical similarity scores, visual inspection, and binding free energy scores. To ensure that the top hits have acceptable ADME properties, QED scores, and synthesizability, we will apply filters using the SwissADME tool.
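One simple way to form a consensus across scoring methods is to average each compound's rank in every method. The sketch below assumes that rule, and it also assumes that lower (more negative) scores are better for every method, as is the case for docking energies in kcal/mol. The compound IDs and score values are made up, and the submission does not state which consensus scheme is actually used.

```python
def ranks(scores):
    """Map each compound to its rank (1 = best), assuming lower is better.

    Ties are broken alphabetically by compound ID so results are reproducible.
    """
    ordered = sorted(scores, key=lambda cpd: (scores[cpd], cpd))
    return {cpd: position + 1 for position, cpd in enumerate(ordered)}


def consensus_rank(score_tables):
    """Order compounds by mean rank across all score tables.

    Only compounds scored by every method are kept.
    """
    common = set.intersection(*(set(table) for table in score_tables))
    per_method = [
        ranks({cpd: table[cpd] for cpd in common}) for table in score_tables
    ]
    mean_rank = {
        cpd: sum(r[cpd] for r in per_method) / len(per_method) for cpd in common
    }
    return sorted(mean_rank, key=lambda cpd: (mean_rank[cpd], cpd))


# Hypothetical scores in kcal/mol (more negative = stronger predicted binding).
docking_scores = {"a": -9.1, "b": -7.5, "c": -8.2}
affinity_scores = {"a": -8.0, "b": -8.5, "c": -6.0}
shortlist = consensus_rank([docking_scores, affinity_scores])
```

Rank averaging avoids having to put scores from different programs on a common scale, at the cost of discarding the size of the score differences.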

What makes your approach stand out from the community? (<100 words)

Popular QSAR-based ligand similarity methods employ only compounds that bind to a single target protein and ignore data on multiple target-chemical interactions, whereas ECBS makes use of such data. In ECBS, different levels of evolutionary information about targets (such as motif, domain, family, and superfamily) were encoded into molecular binding similarity. Integrating the ECBS method for initial screening with widely used drug discovery approaches for secondary screening can facilitate the identification of high-affinity Nsp3 hits.

Method Name
Evolutionary chemical binding similarity (ECBS) method
Free software packages used

RDKit, AutoDock Vina, AutoDock, DOCK, RASPD+, AMBER, SwissADME

Relevant publications of previous uses by your group of this software/method

Keunwan Park, Young-Joon Ko, Prasannavenkatesh Durai, Cheol-Ho Pan, Machine learning-based chemical binding similarity using evolutionary relationships of target genes, Nucleic Acids Research, Volume 47, Issue 20, 18 November 2019, Page e128, https://doi.org/10.1093/nar/gkz743

Hit Optimization Methods
Method type (check all that apply)
De novo design
Deep learning
Free energy perturbation
High-throughput docking
Machine learning
Physics-based
Description of your approach (min 200 and max 800 words)

The TS-ensECBS model will be updated with the new experimental data to further optimize it. Molecules from the ZINC and Mcule purchasable databases will be screened using the most recent TS-ensECBS model, and the top-ranked molecules will undergo the same downstream process as in the previous hit identification stage. Another strategy uses the REINVENT and DeepFrag tools to create new virtual molecules with multiple desired properties based on structural information from the hit molecules. REINVENT combines generative models, reinforcement learning, and improved scoring capabilities, allowing molecules to be collected with different parameters as rewards. Subsequently, the DeepFrag tool will be used to generate novel fragments to add to the existing moieties of the identified hits. Finally, the molecules will be docked and their interactions with binding-site residues visually inspected, with the aim of increasing binding affinity.

The ChEMBL dataset will be used primarily to train the generative model. The generative model will then be subjected to transfer learning using the identified hit molecules and used for sampling. DockStream will be integrated into REINVENT to obtain compounds with high molecular docking scores, improved QED scores, and good synthesizability. The docking score weight will be increased to emphasize the suitability of direct receptor-ligand interactions. Because the core structures of the hit and output molecules are identical, the relative binding free energies of ligand pairs can be calculated.
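The effect of raising the docking weight in a multi-component reward can be shown with a small, self-contained sketch. It assumes a weighted geometric mean over components already scaled to (0, 1] and a reverse sigmoid to rescale docking scores. The function names, component names, midpoint, and weights are all hypothetical, and the code does not use or reproduce the actual REINVENT or DockStream APIs.

```python
import math


def docking_to_unit(score, midpoint=-8.0, steepness=1.0):
    """Map a docking score (more negative = better) to (0, 1).

    Reverse sigmoid: scores below the midpoint map above 0.5.
    The midpoint and steepness values are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp(steepness * (score - midpoint)))


def weighted_geometric_mean(components, weights):
    """Aggregate component scores in (0, 1] with per-component weights."""
    total_weight = sum(weights[name] for name in components)
    log_sum = sum(
        # Clamp to a small positive value so log() is always defined.
        weights[name] * math.log(max(components[name], 1e-9))
        for name in components
    )
    return math.exp(log_sum / total_weight)


# Hypothetical molecule: strong docking score, moderate QED and synthesizability.
components = {"docking": 0.9, "qed": 0.4, "synthesizability": 0.4}
equal = weighted_geometric_mean(
    components, {"docking": 1.0, "qed": 1.0, "synthesizability": 1.0}
)
docking_heavy = weighted_geometric_mean(
    components, {"docking": 3.0, "qed": 1.0, "synthesizability": 1.0}
)
```

With the larger docking weight, a molecule whose strongest component is docking receives a higher overall reward, which is the intended effect of emphasizing direct receptor-ligand interactions.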

What makes your approach stand out from the community? (<100 words)

The TS-ensECBS model is updated with new SAR data from the hit identification results to improve prediction accuracy. By considering both active and inactive compounds from the previous identification process, TS-ensECBS is combined with conventional methods to provide purchasable candidates. The generative modeling approaches, in contrast, focus on modifying the core structure of previously discovered hits to generate synthesizable compounds with the properties required to bind to Nsp3.

Method Name
target-specific ensemble model (TS-ensECBS)
Commercial software packages used

BIOVIA Discovery Studio Client

Free software packages used

GROMACS, pmx, RDKit, AutoDock Vina, AutoDock, gmx_MMPBSA, AMBER, BAT, DeepFrag, SwissADME

Relevant publications of previous uses by your group of this software/method

Keunwan Park, Young-Joon Ko, Prasannavenkatesh Durai, Cheol-Ho Pan, Machine learning-based chemical binding similarity using evolutionary relationships of target genes, Nucleic Acids Research, Volume 47, Issue 20, 18 November 2019, Page e128, https://doi.org/10.1093/nar/gkz743

This website is licensed under CC-BY 4.0
