CACHE

CRITICAL ASSESSMENT OF COMPUTATIONAL HIT-FINDING EXPERIMENTS


Challenge #3 – COMPUTATIONAL METHODS

This page lists all computational methods used for hit identification in CACHE Challenge #3. Some participants chose not to release their publications at this time in order to remain anonymous.

Description
Method name
Commercial software
Free software

Our approach called DockAI is a new technology that combines docking with a state-of-the-art active learning methodology to significantly improve the efficiency and effectiveness of virtual screening and hit identification.

DockAI

Torch, TorchServe, Open Babel, RDKit, AmberTools, GROMACS, Sander

Our approach combines the expertise of the Kozakov Lab at Stony Brook and the Tropsha Lab at UNC. Our workflow uses several complementary modules to identify high-affinity hits for a given protein target with a known 3D structure. Identification of binding site hot-spot 4(8) information together with conventional structure-based virtual screening methods are enabling components of our hit selection approach.

Frag2Hits

Glide by Schrödinger


FTMap server (https://ftmap.bu.edu/), RDKit

We propose a generative Artificial Intelligence workflow whereby we combine a Generative Adversarial Network (GAN) and Reinforcement Learning (RL) for simultaneous hit identification and optimization.

MERLIND: Multi-Expert Reinforcement Learning in Drug Discovery

OpenEye


 


AutoDock Vina, AMBER, OpenMM, PyTorch

We adopt a hybrid of two computational strategies for the hit identification of SARS-CoV-2 Nsp3 Mac1 inhibitors.

1. Hit identification via the large-scale virtual screening of Enamine databases

1.1 Initial virtual screening and structural filtering processes

The combination of large-scale virtual screening and a variety of deep molecular generation methods for hit identification and optimization

Glide (Schrödinger Release)


GROMACS (for binding free energy computation), RDKit, Multiwfn (for electronic wavefunction analysis), PyTorch (as deep learning framework)

Our proposed hit-identification workflow extends pipelines developed by our team during CACHE Challenge 2. The core methodology consists of high-throughput docking followed by binding affinity estimation using Molecular Mechanics Poisson-Boltzmann Surface Area (MMPBSA) on multiple poses drawn from a molecular dynamics (MD) run of the protein-ligand complex.

Reaction informed fragment fuser (RiFF)

None


AutoDock Vina, AutoDock4, rDock, GROMACS, DOCK v6 & v3 (Kuntz Group UCSF), PyAutoFEP
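
The RiFF entry above estimates binding affinity with MMPBSA on multiple poses drawn from an MD run. The averaging step might look roughly like the sketch below. It assumes that per-frame energies (in kcal/mol) for the complex, receptor, and ligand have already been produced by an external tool; the function name and all numbers are hypothetical, and entropy corrections are ignored.

```python
from math import sqrt
from statistics import mean, stdev


def mmpbsa_binding_energy(complex_e, receptor_e, ligand_e):
    """Average per-frame binding energies, dG = G_complex - G_receptor - G_ligand.

    Each argument is a list with one energy per MD frame (hypothetical inputs).
    Returns the mean binding energy and its standard error over frames.
    """
    if not (len(complex_e) == len(receptor_e) == len(ligand_e)):
        raise ValueError("need one energy per frame for each species")
    per_frame = [c - r - l for c, r, l in zip(complex_e, receptor_e, ligand_e)]
    avg = mean(per_frame)
    # The standard error is only defined for two or more frames.
    sem = stdev(per_frame) / sqrt(len(per_frame)) if len(per_frame) > 1 else 0.0
    return avg, sem


# Made-up energies for three frames, for illustration only.
complex_e = [-5230.0, -5228.0, -5232.0]
receptor_e = [-5100.0, -5099.0, -5101.0]
ligand_e = [-100.0, -101.0, -99.0]
dg, sem = mmpbsa_binding_energy(complex_e, receptor_e, ligand_e)
print(f"dG = {dg:.1f} +/- {sem:.2f} kcal/mol")
```

Reporting the standard error alongside the mean gives a rough sense of how much the estimate depends on which poses were sampled.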

We will use structure-based ultra-large virtual screenings using VirtualFlow. Step 1: Protein preparation Protein structures will be prepared with Maestro from Schrödinger (protonation state assignment, assignment of missing atoms/side chains, hydrogen atoms, ...). MD simulations of the target protein will be carried out using Amber 18. Conformations will be clustered, and representative structures of the clusters will be used for the virtual screens.

VirtualFlow/Ultra-Large Virtual Screens

Maestro (protein preparation)


VirtualFlow, AutoDock Vina, QuickVina, Smina, Plants, GWOVina

CODASS3
This proposal is a substantially enhanced version of our previously successful CACHE2 proposal. It includes improvements to every stage of our COmbined Docking And Similarity Search 2.0 (CODASS2) workflow that was applied to that challenge, as well as introducing additional tools and features to boost both its throughput (and thus the size of its screening library) and the reliability of its predictions. In summary, these improvements are:

 

COmbined Docking and Similarity Search 3.0 (CODASS3)

None


Autodock, Vina-GPU+, GWOVina, RF-Score-VS v2, SCORCH2, Osiris DataWarrior, PDB2PQR, OpenBabel, RDKit, Autodock-SS, NAMD, MOPAC2016 (free to academics), Filter-it

We developed multi-scale and multi-task neural networks to learn binding structures and binding affinities between compounds and proteins, based on our previous work [1-3]. The model takes geometric graph representations of compounds and proteins as input. The compound is processed by a physics-driven graph neural network that integrates geometry and momentum information into the topological structure.

Deep Multi-scale Learning for Drug-Protein Interaction Prediction

NA


Python, Torch, RDKit, Biopython, P2Rank

The hit identification and drug discovery strategy consists of high-throughput docking to identify modulators of the NSP3 helicase of SARS-CoV-2.

Hybrid: High-throughput docking coupled with reevaluation of top hits & docked poses

Schrödinger Drug Discovery suite, BIOVIA Pipeline Pilot, BioSolveIT, MolSoft ICM


OpenMM, RDKit

Our approach follows multiple stages that gradually funnel massive ligand libraries into hits, leads, and optimized leads. The stages combine earlier data-driven methods with later principle- and physics-driven methods, as detailed below.

 

DeepAffinity
DeepAffinity, RDKit, AutoDock-Vina, NAMD

The Enamine REAL Database (5.5 billion compounds) will be used as the target database for a deep learning-accelerated virtual screening campaign against the ADPr site of the SARS-CoV-2 Nsp3 macrodomain (Mac1). First, we will remove molecules with a computed Tanimoto index of more than 0.6 to any available Mac1 ligand, in order to prioritize completely novel scaffolds.

Deep Docking

Maestro, Glide, ICM


Deep Docking, AutoDock-GPU
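
The novelty filter described in the Deep Docking entry above (discard any molecule whose Tanimoto index to a known Mac1 ligand exceeds 0.6) can be sketched as follows. This is a minimal illustration, not the participants' implementation: fingerprints are represented as plain Python sets of on-bit indices, whereas in practice they would come from a cheminformatics toolkit, and the molecule names and bit sets are invented.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def novelty_filter(library, known_ligands, threshold=0.6):
    """Keep library entries whose similarity to every known ligand is <= threshold."""
    kept = {}
    for name, fp in library.items():
        if all(tanimoto(fp, ref) <= threshold for ref in known_ligands):
            kept[name] = fp
    return kept


# Hypothetical fingerprints: one known ligand and two library molecules.
known = [{1, 2, 3, 4, 5}]
library = {
    "close_analog": {1, 2, 3, 4, 6},  # 4 shared bits / 6 total = 0.67, removed
    "novel": {1, 7, 8, 9},            # 1 shared bit / 8 total = 0.125, kept
}
kept = novelty_filter(library, known)
print(sorted(kept))
```

For a library of billions of compounds the same comparison would have to be batched or vectorized; the loop above only shows the selection rule.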

In our workflow we will employ 3D-pharmacophore screening to combine the information from co-crystallized fragment structures with the information from molecular dynamics simulations.

Dynamic 3D Pharmacophores

InteLigand - LigandScout


CCG - MOE


Schrödinger - Desmond


CCDC - GOLD


 


PyRod, OpenMMDL, RDKit, KNIME, Python

To identify hit molecules for the macrodomain of SARS-CoV-2 Nsp3, we will use the V-Dock approach developed by our group. Instead of docking every ligand directly, V-Dock docks a subset of the library and trains deep learning models on those results to predict protein-ligand docking scores from SMILES strings. We have already shown that protein-ligand docking scores can be accurately predicted from SMILES representations.

SNU-Dock

Glide


RDKit, AutoDock-GPU, Open Babel

We will use our expertise in AI/ML, cheminformatics, structure-based drug design (SBDD), and medicinal chemistry to generate hits for the NSP3 Macrodomain (Mac1). Using our in-house drug discovery and cheminformatics platform (published in the scientific literature; proprietary code), we will identify a suitable subset of compounds from the Enamine REAL Database using various filters that follow medicinal chemistry standards and the CACHE white paper guidelines.

Deep Learning Approach

in-house


GROMACS (if MD is needed); Python and deep learning packages: TensorFlow, Scikit, Pandas, and NumPy

The small molecule libraries will be obtained from the ZINC and Mcule purchasable databases and common filters will be applied to remove duplicates. Additionally, an in-house Evolutionary chemical binding similarity (ECBS) method (PMID: 31504818) will be used for the primary virtual screening of the curated database.

Evolutionary chemical binding similarity (ECBS) method
RDKit, AutoDock Vina, AutoDock, DOCK, RASPD+, AMBER, SwissADME

V-SYNTHES, a modular synthon-based approach, was published in Nature 601, 452–459 (2022). It first identifies the best scaffold–synthon combinations as seeds suitable for further growth, and then iteratively elaborates these seeds to select complete molecules with the best docking scores.

V-SYNTHES

ICM-Pro is provided by MolSoft.


RDKit, KNIME
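
The seed-then-grow idea in the V-SYNTHES entry above can be illustrated with a toy two-position search. This is only a sketch of the general strategy, not the published algorithm: the scoring function is a made-up additive stand-in for docking (lower is better), and the synthon names and the `top_k` cutoff are hypothetical.

```python
def seed_and_grow_search(synthons_r1, synthons_r2, score, top_k=2):
    """Two-stage synthon search: pick seed synthons at R1, then grow them at R2.

    `score(r1, r2)` is a stand-in for a docking score (lower is better);
    r2=None means the second position is left capped.
    """
    # Stage 1: rank partially built molecules (R1 filled, R2 capped).
    seeds = sorted(synthons_r1, key=lambda r1: score(r1, None))[:top_k]
    # Stage 2: fully enumerate only the products that grow from the seeds.
    products = [(r1, r2) for r1 in seeds for r2 in synthons_r2]
    products.sort(key=lambda p: score(*p))
    return seeds, products


# Toy per-synthon score contributions (invented numbers).
R1 = {"a": -3.0, "b": -1.0, "c": -2.5, "d": -0.5}
R2 = {"x": -1.0, "y": -2.0, None: 0.0}


def toy_score(r1, r2):
    return R1[r1] + R2[r2]


seeds, ranked = seed_and_grow_search(list(R1), ["x", "y"], toy_score)
print(seeds, ranked[0])
```

In this toy case only 4 of the 8 possible products are scored in full, which is the point of the approach: the cost grows with the number of seeds kept rather than with the size of the full combinatorial space.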

Using our Binary Star platform, we will employ a comprehensive computational protocol to enable the discovery and optimization of novel lead compounds for the ADPr site of SARS-CoV-2 Nsp3 macrodomain (Mac1). We will run a target analysis workflow (target validation and identification step) for the experimentally determined high-resolution structures of SARS-CoV-2 Nsp3 Mac1, to choose a suitable structure for the virtual screening campaign.

A comprehensive computational protocol that includes: De novo design, high-throughput docking, MD simulations, and FEP calculations.

Schrödinger 


AMBER.


NAMD, VMD, CHARMM-GUI

We propose to apply a massive library screening workflow that exhaustively screens the 4.5-billion-compound Enamine REAL database using a deep-learning-based Drug-Target Interaction (DTI) prediction engine to identify molecules likely to bind to SARS-CoV-2 Nsp3.

Massive library screening using structurally-augmented Drug-Target Interaction (DTI) prediction models
RDKit, Vina, PyTorch

I have developed a genetic algorithm (GA) that can search Enamine REAL Space and will use it to find molecules with good docking scores to the target.

Synthon-GA

Glide


RDKit, Synthon-GA, Molbloom
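
To make the genetic-algorithm idea in the Synthon-GA entry above concrete, here is a toy GA over molecules encoded as pairs of synthon indices. It is a generic sketch under stated assumptions, not the author's Synthon-GA code: the encoding, the operators (truncation selection, one-point crossover, random mutation, elitism), and the quadratic `toy_score` standing in for a docking score are all hypothetical.

```python
import random


def synthon_ga(n_a, n_b, score, pop_size=20, generations=30, mut_rate=0.2, seed=0):
    """Toy GA over molecules encoded as (synthon_a_index, synthon_b_index).

    `score` stands in for a docking score (lower is better).
    Returns the best individual and the best score seen in each generation.
    """
    rng = random.Random(seed)
    pop = [(rng.randrange(n_a), rng.randrange(n_b)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score)
        history.append(score(pop[0]))
        parents = pop[: pop_size // 2]      # truncation selection
        children = [pop[0]]                 # elitism: keep the current best
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = (p1[0], p2[1])          # crossover: one synthon from each parent
            if rng.random() < mut_rate:     # mutation: replace one synthon at random
                if rng.random() < 0.5:
                    child = (rng.randrange(n_a), child[1])
                else:
                    child = (child[0], rng.randrange(n_b))
            children.append(child)
        pop = children
    best = min(pop, key=score)
    return best, history


def toy_score(mol):
    """Hypothetical score with its optimum at synthon pair (37, 12)."""
    a, b = mol
    return (a - 37) ** 2 + (b - 12) ** 2


best, history = synthon_ga(100, 100, toy_score)
print(best, history[0], history[-1])
```

Because the best individual is always carried over, the best score per generation can never get worse; a real run would replace `toy_score` with an actual docking call and would cache scores to avoid re-docking duplicates.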

Our proposed pipeline consists of three steps. As a preliminary step, we will define a binding site around the ADPr site of PDB 7KQB.

CMOD Design

Gaussian


OpenMM, OpenForceField, GROMACS, MDAnalysis, AmberTools, AutoDock Vina, LeDock, PLANTS, internally developed machine learning models (MILCDock)

Foldit is a crowd-sourced molecular biology game. The CACHE challenge will run as a series of regular puzzles in the Foldit platform and, if prior drug design puzzle experience is any indication, will see participation levels equivalent to other Foldit puzzles.

Drugit
Foldit/Rosetta/RDKit/ZINC API/BCL

We will identify the most conserved residues of the Nsp3 Mac1 adenine binding cavity and the proximal ribose site, where there are co-crystallized fragments (54 and 9 PDB submissions, respectively) and lead-like small molecules (i.e. Gahbauer et al., bioRxiv. 2022). This will be done by performing multiple sequence alignment (MSA) with the Kalign algorithm on approximately 200,000 SARS-CoV-2 Nsp3 sequences from the NCBI.

Tiered screening incorporating molecular shape, pharmacophore features, docking, FEP and clustering
Molecular Operating Environment (MOE) by the Chemical Computing Group
Kalign, in-house MoPBS pharmacophore generation software, in-house VS streamlining software DataPype, GROMACS
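
The tiered-screening entry above starts by finding the most conserved residues from a multiple sequence alignment. Once an alignment exists (for example, as produced by Kalign), per-column conservation could be computed along the lines of the sketch below. The conservation measure (fraction of sequences sharing the most common residue), the 0.9 cutoff, and the four toy sequences are assumptions made for illustration.

```python
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Fraction of sequences that share the most common residue in each column.

    Gaps never count as the consensus residue but stay in the denominator.
    """
    if not alignment or len({len(s) for s in alignment}) != 1:
        raise ValueError("alignment must contain equal-length sequences")
    scores = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores


def conserved_positions(alignment, cutoff=0.9):
    """Zero-based column indices whose conservation is at least `cutoff`."""
    return [i for i, s in enumerate(column_conservation(alignment)) if s >= cutoff]


# Toy aligned sequences (invented, not real Nsp3 data).
msa = ["GDAK", "GDSK", "GEAK", "G-AK"]
scores = column_conservation(msa)
print(scores, conserved_positions(msa))
```

Column indices from the alignment would still need to be mapped back to residue numbers in the chosen structure before they can be used to define the binding site.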

We will build on an analysis pipeline we have developed that is capable of searching billion-scale small-molecule libraries for binding candidates to a target pocket. In the first phase of the pipeline, we will perform a fast (and approximate) affinity prediction using a strategy based on graph neural networks (GNNs). We have developed GNNs that compute representations of both ligand and protein pocket based on a diverse collection of surface properties.

PocketPackerPicker

N/A


AutoDock Vina, PLIP, MARTINI

Our proposal is to link a set of fragments co-crystallized with nsp3, using either a deep generative model or a knowledge-based linker database, to afford drug-like molecules spanning at least two subpockets of the target. In a first step, 186 co-crystallized fragments (Schuller et al., Sci Adv.

POEM (Pocket oriented elaboration of molecules)

SYBYL x2.1.1, Certara USA Inc., Princeton, U.S.A.


Szybki, Filter: OpenEye Scientific Software, Santa Fe, U.S.A.

IChem, DeLinker, RDKit, POEM, PLANTS

We will deploy a proprietary deep learning-based framework to rapidly screen multi-billion small molecule libraries. The performance of the proposed framework is tested on several curated as well as publicly-available unbiased benchmarking datasets. To demonstrate the actual application of the framework, we have screened 1.37 billion molecules to discover new inhibitors of the epigenetic protein BRD9 bromodomain.

PrDIN

Maestro (protein preparation)

Glide for docking/hits prioritization (but we'll make a decision later whether to use it or AutoDock4/AutoDock Vina or SMINA)


Python, TensorFlow, Keras, RDKit, AutoDock4/AutoDock Vina/SMINA, GROMACS (if required), PyMOL

This website is licensed under CC-BY 4.0
