CACHE

CRITICAL ASSESSMENT OF COMPUTATIONAL HIT-FINDING EXPERIMENTS

Challenge #2

Hit Identification
Method type (check all that apply)
Deep learning
Free energy perturbation
High-throughput docking
Physics-based
Hybrid of the above
Multi-stage Machine Learning and physics-based hit-finding pipeline
Description of your approach (min 200 and max 800 words)

Our proposed pipeline consists of four steps. As a preliminary step, because four similar PDB structures of the target are available for this CACHE challenge, we will run unrestrained MD simulations for all four structures and compare the resulting Boltzmann distributions. If no major differences are found, we will limit the subsequent steps to PDB 5RLZ.
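
One way such a comparison could be made quantitative is sketched below: a single collective variable (for example, a pocket distance) is histogrammed for two trajectories on shared bins, and the Jensen-Shannon divergence between the two histograms is computed. The choice of metric, the collective variable, the bin count, and the function name `js_divergence` are assumptions for illustration and are not part of the described protocol.

```python
import numpy as np

def js_divergence(samples_a, samples_b, bins=50):
    """Jensen-Shannon divergence (in nats) between two 1-D sample sets."""
    samples_a = np.asarray(samples_a, dtype=float)
    samples_b = np.asarray(samples_b, dtype=float)
    # Use one shared range so both histograms have identical bin edges.
    lo = min(samples_a.min(), samples_b.min())
    hi = max(samples_a.max(), samples_b.max())
    p, _ = np.histogram(samples_a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(samples_b, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(u, v):
        # Kullback-Leibler divergence, skipping bins where u is empty.
        mask = u > 0
        return float(np.sum(u[mask] * np.log(u[mask] / v[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

The value is 0 for identical histograms and ln 2 for histograms with no overlap, so a small value across all pairs of structures would support restricting later steps to one structure.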

To begin molecule ranking, we will first pass the on-demand ZINC20 database of small molecules through a custom docking pipeline built around a machine-learning-enhanced consensus module. This module combines binding affinity and pose predictions from three traditional docking tools to rank the small molecules by their probability of binding. On test datasets such as DUD-E and LIT-PCBA, this method outperforms the individual docking tools as measured by the BEDROC metric.
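
For readers unfamiliar with the BEDROC metric mentioned above, the following is a minimal sketch of how it can be computed from a ranked list of active/inactive labels, following the standard definition via the robust initial enhancement (RIE). The function name `bedroc` and the default `alpha=20.0` are assumptions for illustration; the text does not state which alpha was used in the benchmarks.

```python
import math

def bedroc(is_active_ranked, alpha=20.0):
    """BEDROC for a list of booleans ordered from best to worst score."""
    n_total = len(is_active_ranked)
    n_actives = sum(1 for a in is_active_ranked if a)
    if n_actives == 0 or n_actives == n_total:
        raise ValueError("need at least one active and one inactive")
    ratio = n_actives / n_total

    # Exponentially weighted sum over the (1-based) ranks of the actives.
    sum_exp = sum(
        math.exp(-alpha * rank / n_total)
        for rank, active in enumerate(is_active_ranked, start=1)
        if active
    )
    # Expected per-active weight when actives are uniformly distributed.
    random_weight = (1.0 / n_total) * (1.0 - math.exp(-alpha)) / (
        math.exp(alpha / n_total) - 1.0
    )
    rie = sum_exp / (n_actives * random_weight)

    # Rescale RIE to [0, 1] using its best-case and worst-case values.
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)
```

A ranking that places every active first scores 1, and one that places every active last scores 0; larger alpha values put more weight on early recognition.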

Second, binding energies for the top 5000 docking poses will be calculated using the GFN2-xTB semiempirical quantum chemistry (QM) method. In this method, the geometry of the protein-ligand complex is optimized, and then the single-point energies of the bound and unbound conformations are compared. This method provides a good tradeoff between computational efficiency and accuracy and has been shown to be well-suited for exploring the conformational space of molecular systems.
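
The energy comparison can be sketched as a simple difference of single-point energies. The example assumes the common supermolecular form, E_bind = E(complex) - [E(protein) + E(ligand)], with total energies given in Hartree; the text does not spell out the exact expression used, and the function name is hypothetical.

```python
# Conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509

def binding_energy_kcal(e_complex, e_protein, e_ligand):
    """Binding energy in kcal/mol from three total energies in Hartree.

    A negative value means the bound state is lower in energy than the
    separated protein and ligand.
    """
    return (e_complex - (e_protein + e_ligand)) * HARTREE_TO_KCAL_PER_MOL
```

Ranking the 5000 poses then amounts to sorting them by this value in ascending order.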

Third, we will evaluate the 300-500 most promising candidates from the QM step (the exact number depending on computational resources) using Umbrella Sampling combined with Hamiltonian Replica Exchange. We will use 40 umbrella windows with a total of 45 ns of simulation per window and then employ WHAM (the weighted histogram analysis method) to calculate the free energy profiles. We plan to use OpenForceField parameterizations for the ligands and Amber ff19SB for the protein, and to run the simulations with GROMACS/OpenMM. These MD simulations will also provide a measure of ligand stability in the binding site, quantified as the root-mean-square fluctuation (RMSF) of the ligand.
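
The WHAM step can be illustrated with a minimal histogram-based implementation of the self-consistent WHAM equations for a one-dimensional reaction coordinate. This is a sketch under stated assumptions, not the analysis code of the pipeline: the function name `wham`, the array layout, and the convergence settings are hypothetical, and a production analysis would typically rely on an established tool.

```python
import numpy as np

def wham(counts, bias, beta=1.0, tol=1e-10, max_iter=10000):
    """Histogram-based WHAM for umbrella sampling.

    counts: (K, B) array, samples of window k that fall in bin b
    bias:   (K, B) array, bias energy of window k at the center of bin b
    beta:   1 / (k_B T), in units matching `bias`
    Returns the unbiased free energy per bin, shifted so its minimum is 0.
    """
    counts = np.asarray(counts, dtype=float)
    boltz = np.exp(-beta * np.asarray(bias, dtype=float))
    n_samples = counts.sum(axis=1)   # samples per window
    total = counts.sum(axis=0)       # pooled histogram over all windows
    f = np.zeros(counts.shape[0])    # per-window free energy offsets
    for _ in range(max_iter):
        # Unbiased probability estimate given the current offsets.
        denom = (n_samples[:, None] * np.exp(beta * f)[:, None] * boltz).sum(axis=0)
        p = total / denom
        p /= p.sum()
        # Updated offsets from the current probability estimate.
        f_new = -np.log((boltz * p).sum(axis=1)) / beta
        f_new -= f_new[0]            # fix the arbitrary additive constant
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    with np.errstate(divide="ignore"):
        free_energy = -np.log(p) / beta   # empty bins map to +inf
    return free_energy - free_energy.min()
```

In use, `counts` would hold one histogram per umbrella window (40 in the protocol above) and `bias` the harmonic restraint energy of each window evaluated at the bin centers.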

Finally, the docking scores, Umbrella Sampling free energies, MD-derived RMSF scores, and GFN2-xTB single-point energies will be aggregated as summed z-scores to assign a final score to each of the top ~300 molecules. The 100 best-ranked molecules will be selected for experimental validation.
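
The aggregation step can be sketched as follows. The example assumes that every metric is oriented so that lower values are better and that all metrics carry equal weight; neither the sign conventions nor any weighting is specified in the text, and the function and metric names are hypothetical.

```python
from statistics import mean, pstdev

def summed_z_scores(metrics):
    """Sum per-metric z-scores for each molecule.

    metrics: dict mapping a metric name to a list of values, one per
             molecule, all oriented so that lower is better (assumption).
    """
    n = len(next(iter(metrics.values())))
    totals = [0.0] * n
    for name, values in metrics.items():
        if len(values) != n:
            raise ValueError(f"metric {name!r} has the wrong length")
        mu = mean(values)
        sigma = pstdev(values)
        if sigma == 0:
            continue  # a constant metric cannot discriminate molecules
        for i, value in enumerate(values):
            totals[i] += (value - mu) / sigma
    return totals

def rank_molecules(metrics, top_k):
    """Indices of the top_k molecules, lowest summed z-score first."""
    totals = summed_z_scores(metrics)
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    return order[:top_k]
```

With the ~300 scored molecules as input, `rank_molecules(metrics, 100)` would return the indices of the candidates to forward for experimental validation.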

What makes your approach stand out from the community? (<100 words)

Our method combines various industry best practices to produce an efficient and comprehensive hit-finding pipeline. Combining machine learning with physics-based methods provides a favorable trade-off between accuracy and efficiency in determining the most likely drug candidates. Comparing a variety of metrics allows for a more comprehensive analysis of each ligand and helps us assess more confidently which compounds will yield positive experimental results.

 

Method Name
CMOD Design
Commercial software packages used

Gaussian

Free software packages used

OpenMM, OpenForceField, GROMACS, MDAnalysis, AmberTools, AutoDock Vina, LeDock, PLANTS, internally developed machine learning models (MILCDock)

Relevant publications of previous uses by your group of this software/method

This method has not been published by our group, but it builds on separate developments and established best practices that we plan to combine for the CACHE challenge. Related publications include:

1. Learning Small Molecule Energies and Interatomic Forces with an Equivariant Transformer on the ANI-1x Dataset;
Bryce Hedelius‡, Fabian B. Fuchs, and Dennis Della Corte; ELLIS Machine Learning for Molecule Discovery Workshop (2021).

2. Engineering and application of a biosensor with focused ligand specificity;
Dennis Della Corte, Hugo L. van Beek, Falk Syberg, Marcus Schallmey, Felix Tobola, Kai U. Cormann, ... Connor J. Morris‡, ... (8 other authors); Nat. Commun. 11 (1), 4851 (2020).

3. Context-dependent stabilizing interactions among solvent-exposed residues along the surface of a trimeric helix bundle;
Kimberlee L. Stern†, Mason S. Smith†, Wendy M. Billings‡, Taylor J. Loftus‡, Benjamin M. Conover‡, Dennis Della Corte, and Joshua L. Price; Biochemistry 59 (17), 1672-1679 (2020).

4. Evaluation of Deep Neural Network ProSPr for Accurate Protein Distance Predictions on CASP14 Targets;
Jacob Stern†, Bryce Hedelius‡, Olivia Fisher‡, Wendy M. Billings‡, and Dennis Della Corte; Int. J. Mol. Sci. 22, 12835 (2021).

5. The whole is greater than its parts: ensembling improves protein contact prediction;
Wendy M. Billings‡, Connor J. Morris‡, and Dennis Della Corte; Sci. Rep. 11 (1), 8039 (2021).

6. Using molecular docking and molecular dynamics to investigate protein-ligand interactions;
Connor J. Morris‡ and Dennis Della Corte; Mod. Phys. Lett. B 35 (8), 2130002 (2021).

7. Integrated NMR, Fluorescence and MD Benchmark Study of Protein Mechanics and Hydrodynamics;
Christina Möckel, Jakub Kubiak, Oliver Schillinger, Ralf Kuehnemuth, Dennis Della Corte, Gunnar F. Schröder, ... (4 other authors); J. Phys. Chem. B 123 (7), 1453-1480 (2018).

This website is licensed under CC-BY 4.0