Challenge #5 – COMPUTATIONAL METHODS
Here is a list of all computational methods used for hit identification in CACHE Challenge #5. Click on the Description for more details. Some participants preferred not to release their publications to stay anonymous at this time.
AlphaFold with MSA subsampling and possibly alternative models will be used to generate an ensemble of protein conformations. Selected conformations will be simulated with molecular dynamics to relax the system. Simulations will be analyzed to identify druggable pockets with some consistency to published data (e.g.
None.
We will use structure-based ultra-large virtual screenings using VirtualFlow 2.0 [Gorgulla 2023]. The procedure will consist of four steps.
Maestro (protein preparation)
In the first stage of virtual ligand screening, we will apply a structure-based, GPCR-specific QSAR model trained on ligand-bound GPCR complexes.
Tetra-d
ICM-Pro
Our approach follows multiple stages that gradually funnel massive ligand libraries into hits, leads, and optimized leads. The earlier stages use data-driven methods and the later stages use principle/physics-driven methods, as detailed below.
We will employ a structure-guided drug discovery approach based on a unique molecular generative model (SAGE) recently developed in our lab. This generative model is specifically tuned to produce Enamine REAL Space ligands targeting the 3D structure of an input protein binding pocket. Compared to traditional virtual screening, this approach offers unparalleled speed, enabling us to rapidly sample from the entire Enamine library space (30 billion compounds) in a structure-guided way.
Schrodinger's Prime, Glide, Maestro
STEP 1: Protein Structure Prediction
The first part of the solution is predicting the protein structure using flow-matching methods to generate an ensemble of protein structure predictions. Molecular dynamics simulations will be performed on the predicted structure to ensure stability. Binding sites will be identified using our in-house binding site prediction algorithm, which is based on a few geometric deep-learning methods.
STEP 2: Training/Fine-tuning different AI/ML models
Boltchem
GPCRs are the largest class of drug targets, yet the identification of novel inhibitors is hampered by their complex mechanism of action. In this project, we aim to identify novel MCHR1 antagonists by building on the great success of deep learning (DL) models in drug design.
1. Primary screening with an evolutionary chemical binding similarity model
The approach will be a combination of machine learning, molecular docking, and molecular dynamics simulation.
Stage 1
We will perform molecular dynamics simulations with the AlphaFold homology model of MCHR1. We will then use the resulting ensemble of conformations for ensemble docking.
Stage 2
ICM-Pro is provided by MolSoft.
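Docking against a set of receptor conformations, as in Stage 1 above, is commonly summarized by keeping each ligand's best score over all conformations. The sketch below shows only that bookkeeping step. It assumes docking scores are already available as a table in which lower is better; the ligand IDs, conformation labels, and score values are made-up placeholders, not results from this challenge.

```python
from typing import Dict, List, Tuple

# Hypothetical docking scores (lower = better).
# Outer key: ligand ID; inner key: receptor conformation ID.
Scores = Dict[str, Dict[str, float]]


def best_over_ensemble(scores: Scores) -> List[Tuple[str, str, float]]:
    """Return (ligand, best conformation, best score) rows, best ligand first."""
    ranked = []
    for ligand, per_conformation in scores.items():
        # Keep the conformation in which this ligand scores best (lowest).
        conformation, score = min(per_conformation.items(), key=lambda kv: kv[1])
        ranked.append((ligand, conformation, score))
    ranked.sort(key=lambda row: row[2])
    return ranked


# Illustrative values only.
example_scores: Scores = {
    "lig_A": {"md_frame_1": -7.2, "md_frame_2": -8.9, "md_frame_3": -6.5},
    "lig_B": {"md_frame_1": -9.4, "md_frame_2": -7.0, "md_frame_3": -8.1},
    "lig_C": {"md_frame_1": -5.1, "md_frame_2": -5.8, "md_frame_3": -6.0},
}

ranking = best_over_ensemble(example_scores)
for ligand, conformation, score in ranking:
    print(f"{ligand}: best score {score:.1f} in {conformation}")
```

Taking the minimum is only one possible aggregation; averaging over conformations or weighting by conformation population are alternatives, and the entry above does not say which one the participants use.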
Our goal through this competition is to validate whether our enhanced hit-finding workflow demonstrates the anticipated efficacy compared to our existing workflow (CACHE challenge #4). This workflow integrates various in silico drug development techniques, from target protein structure modeling to ultra-high-throughput virtual screening and de novo design using generative models, while maintaining simplicity.
We are convinced that deep/machine learning applications to the drug discovery problem are only as good as the data they are trained on. Our emphasis is therefore primarily on a robust and stratified data strategy that avoids overfitting on narrow data regimes and instead promotes the generalization capabilities of our framework.
None.
Our approach combines the expertise of the Kozakov Lab at Stony Brook and the Tropsha Lab at UNC. For this specific challenge, where a substantial number of molecules active against the MCHR1 target are already known, we will use both structure-based (ML-accelerated docking) and ligand-based (QSAR) methods developed in our laboratories and published in the open literature; we will not use any commercial software.
None
This project will begin with a ligand-based approach by clustering the known ligands to identify groups of compounds with potentially similar interaction profiles. This will be done by generating Morgan fingerprints and calculating their Tanimoto distances, and by clustering the compounds in LigandScout by 3D pharmacophore similarity. The clusters will then be analyzed using LigandScout to generate ligand-based 3D pharmacophores.
InteLigand - LigandScout
OpenEye - ROCS
CCG - MOE
CCDC - GOLD
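The fingerprint-based clustering step in the ligand-based entry above can be illustrated with a small, dependency-free sketch. It assumes each compound's Morgan fingerprint has already been computed elsewhere (for example with a cheminformatics toolkit) and is represented as the set of its "on" bit indices. The compound IDs, bit sets, distance cutoff, and the choice of a Taylor-Butina-style greedy clustering are all assumptions made for illustration; the entry does not state which clustering algorithm is used.

```python
from typing import Dict, FrozenSet, List

Fingerprint = FrozenSet[int]  # indices of the "on" bits


def tanimoto_distance(a: Fingerprint, b: Fingerprint) -> float:
    """1 - (shared on-bits) / (on-bits in either); empty vs empty counts as 0."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def butina_style_clusters(fps: Dict[str, Fingerprint], cutoff: float) -> List[List[str]]:
    """Greedy clustering: the compound with the most neighbours seeds a cluster."""
    names = list(fps)
    neighbours = {
        n: [m for m in names if m != n and tanimoto_distance(fps[n], fps[m]) <= cutoff]
        for n in names
    }
    # Visit compounds from most to fewest neighbours (stable for ties).
    order = sorted(names, key=lambda n: len(neighbours[n]), reverse=True)
    assigned = set()
    clusters = []
    for centroid in order:
        if centroid in assigned:
            continue
        members = [centroid] + [m for m in neighbours[centroid] if m not in assigned]
        assigned.update(members)
        clusters.append(members)
    return clusters


# Made-up bit sets standing in for real Morgan fingerprints.
fps = {
    "cpd_1": frozenset({1, 2, 3, 4}),
    "cpd_2": frozenset({1, 2, 3, 5}),
    "cpd_3": frozenset({1, 2, 3, 4, 5}),
    "cpd_4": frozenset({10, 11, 12}),
    "cpd_5": frozenset({10, 11, 13}),
}

clusters = butina_style_clusters(fps, cutoff=0.6)
for i, members in enumerate(clusters, start=1):
    print(f"cluster {i}: {', '.join(members)}")
```

The first compound of each cluster acts as its centroid, which makes it a natural representative when choosing which ligands to feed into pharmacophore generation.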
1. Protein structure determination
In this CACHE challenge, we will prioritize implementing Active Learning (AL) into our workflows. Historically, we, and our software, have been limited to smaller compound libraries by the compute-intensive framework of our virtual high-throughput screening. In previous CACHE challenges, we have also identified a liability in ultimately considering smaller libraries.
Forecaster Suite, in-house packages (active learning and pharmacophore modelling)
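The idea behind active learning for virtual screening, as mentioned in the entry above, is to run the expensive scoring step on only a small part of the library and let a cheap surrogate model decide which compounds to score next. The following is a minimal sketch under stated assumptions, not the participants' implementation: each compound is reduced to a single numeric descriptor, `toy_docking_score` is a hypothetical stand-in for a real docking run (lower is better), the surrogate is a plain k-nearest-neighbour average, and acquisition is greedy.

```python
import random
from typing import Callable, Dict, List


def knn_predict(x: float, labelled: Dict[float, float], k: int = 3) -> float:
    """Surrogate model: mean score of the k labelled compounds nearest to x."""
    nearest = sorted(labelled, key=lambda d: abs(d - x))[:k]
    return sum(labelled[d] for d in nearest) / len(nearest)


def active_learning_screen(
    library: List[float],
    oracle: Callable[[float], float],
    seed_size: int,
    batch_size: int,
    rounds: int,
    rng: random.Random,
) -> Dict[float, float]:
    """Return {descriptor: oracle score} for every compound that was scored."""
    # Round 0: score a random seed set with the expensive oracle.
    labelled = {x: oracle(x) for x in rng.sample(library, seed_size)}
    for _ in range(rounds):
        pool = [x for x in library if x not in labelled]
        if not pool:
            break
        # Greedy acquisition: score the compounds the surrogate likes best.
        pool.sort(key=lambda x: knn_predict(x, labelled))
        for x in pool[:batch_size]:
            labelled[x] = oracle(x)
    return labelled


def toy_docking_score(x: float) -> float:
    """Hypothetical oracle with its best (lowest) score at x = 150."""
    return abs(x - 150.0) / 10.0 - 10.0


library = [float(i) for i in range(200)]  # one made-up descriptor per compound
results = active_learning_screen(
    library, toy_docking_score, seed_size=10, batch_size=10, rounds=5,
    rng=random.Random(0),
)
best = min(results, key=results.get)
print(f"scored {len(results)} of {len(library)} compounds")
print(f"best compound so far: descriptor {best:.0f}, score {results[best]:.2f}")
```

Greedy acquisition exploits the surrogate and can get stuck near the best seed compound; practical workflows often mix in uncertainty-based or random picks to keep exploring.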
We will employ a comprehensive computational protocol to enable the discovery and optimization of novel lead compounds for melanin-concentrating hormone receptor 1 (MCHR1). Since there is no existing crystal structure, we will execute a workflow for target analysis, specifically focusing on target validation and identification. This step involves analyzing structures generated through homology modeling, utilizing tools like AlphaFold or Schrödinger software.
Schrödinger, AMBER
We have built a custom ligand-based model to predict the IC50 of ligands. It will be used to screen a large number of ligands in the Enamine database (1 billion small molecules). The model leverages transfer learning, using latent representations from previously published models trained on large datasets.
No
We will use a computer-implemented method for screening ligand candidates for a target protein. The method relies on an in-house, integrated ensemble machine learning (ML) model that predicts binding affinity with very high speed and precision.
Desmond by Schrödinger
Decrypting orphan GPCR drug discovery via multitask learning
W. C. Huang, W. T. Lin, M. S. Hung, J. C. Lee and C. W. Tung
J Cheminform 2024 Vol. 16 Issue 1 Pages 10
Accession Number: 38263092 PMCID: PMC10804799 DOI: 10.1186/s13321-024-00806-3
Microsoft packages may be used in reporting or publication.
We would like to start from a traditional ligand-based strategy, such as a QSAR model, and combine it with structure-based ranking. A QSAR model will be developed based on historical data of known hits and used to screen Enamine REAL. The hit molecules will be docked, and poses will be refined with ML-accelerated QM (ML force fields) calculations and traditional MD simulations. If time permits, we will also attempt binding free energy (BFE) calculations.
N/A
A combined ligand-, structure-, and interaction-based approach will be applied to identify novel ligands for MCHR1. In this combination, the structure-based component enables reaching out far beyond the training set, while available structure-activity relationship (SAR) data help keep those false-positive-prone techniques in check.
None
Our approach is a ligand-based method where the most active compound will be used as a reference to find potential MCH antagonists. Our method partitions the reference structure into multiple fragments and compares each fragment against a library of the building blocks used to create the Enamine REAL library. The best building blocks are selected using the 3D hydrophobic profile derived from QM-based descriptors.
exaScreen (https://pharmacelera.com/exascreen/)
Given the availability of the two sets of known MCHR1 binders, we will employ our previously developed machine learning (ML)-driven, ligand-based virtual screening tool named PyRMD. At its core, this tool implements the Random Matrix Discriminant (RMD) ML algorithm, which has been demonstrated to stand out for its denoising capabilities.
none
In approaching the challenge of predicting novel MCHR1 antagonists, given the absence of available crystal structures, a multi-faceted strategy is essential. The first hurdle is to establish an accurate model of MCHR1, which can be achieved through either homology modeling or utilizing the structure from AlphaFold. To ensure robustness, both methods would be employed, and the resulting structures compared to gauge their accuracy relative to other G protein-coupled receptors (GPCRs).
Schrodinger Maestro