Computational methods

Hit Identification

Method type (check all that apply)

High-throughput docking

Physics-based

Hybrid of the above

Combination of cheminformatics, pharmacophore modeling, molecular dynamics, high-throughput docking

Description of your approach (min 200 and max 800 words)

We will use our expertise in cheminformatics, molecular dynamics (MD), structure-based drug design (SBDD), and medicinal chemistry to generate hits for the tyrosine kinase binding (TKB) domain of the Cbl Proto-Oncogene B (CBL-B).

In a first step, we will use the set of patented TKB CBL-B inhibitors provided by CACHE along with the crystal structure of human CBL-B (PDB ID 8GCY) [1] to benchmark our state-of-the-art docking program (published in scientific literature, proprietary code).[2] These compounds block the phosphorylation of Y363 and act as intramolecular glue, keeping CBL-B in a closed/inactive conformational state.[3]

Although our benchmarking study will act primarily as a sanity check, it will also inform the development of one (or more) pharmacophore model(s). These model(s) will incorporate both ligand data (i.e., the bioactive conformation of the co-crystallized inhibitor in 8GCY) and structural data (interactions with Y363, binding site volume, exclusion spheres, etc.). The pharmacophore model(s) will be developed with our proprietary drug discovery software (published in scientific literature, proprietary code).[4]

In a second step, we will use our drug discovery platform to curate a library of commercially available compounds (Enamine REAL database) for subsequent screening against our pharmacophore model(s). The curation will be informed by a detailed structure-activity relationship (SAR) analysis of the set of patented inhibitors provided by the CACHE committee, key molecular features (scaffolds, functional groups), and physicochemical descriptors (e.g., number of hydrogen bond donors/acceptors, logP).

Then, using our drug discovery platform and the curated library, we will identify molecules that are dissimilar to the patented compounds (Tanimoto coefficients, computed on ECFP4 fingerprints, below a pre-defined threshold). To further ensure chemical and structural diversity, the key scaffolds identified by the SAR analysis will be replaced by bioisosteres (sourced from the literature).
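The dissimilarity filter can be sketched as follows. This is a minimal illustration only: it assumes fingerprints are already available as sets of on-bit indices (in practice, ECFP4-type fingerprints would be generated by a cheminformatics toolkit), and the 0.4 cut-off, the function names, and the toy bit sets are hypothetical rather than values from our workflow.

```python
from typing import Dict, Iterable, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def dissimilar_to_all(
    library: Dict[str, Set[int]],
    references: Iterable[Set[int]],
    threshold: float = 0.4,  # hypothetical cut-off, not a value from the text
) -> List[str]:
    """Keep entries whose highest Tanimoto to any reference is below threshold."""
    refs = list(references)
    return [
        name
        for name, fp in library.items()
        if max((tanimoto(fp, ref) for ref in refs), default=0.0) < threshold
    ]


# Toy data: the bit indices are made up for illustration.
patented = [{1, 2, 3, 4, 5}, {2, 3, 4, 8}]
library = {
    "cmpd_A": {1, 2, 3, 4, 6},  # close analogue of the first reference
    "cmpd_B": {2, 10, 11, 12},  # shares a single bit with the references
}
novel = dissimilar_to_all(library, patented)
```

Comparing each candidate against its nearest patented compound (the maximum similarity) rather than the average is the stricter choice, since a single close analogue is enough to disqualify a molecule.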

This library will be further curated by applying synthetic feasibility constraints (compounds in stock or of easy/medium synthetic difficulty), common medicinal chemistry guidelines (Lipinski's Rule of Five, Lilly medicinal chemistry rules, removal of PAINS compounds), as well as the principles outlined in the CACHE white paper.
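As an illustration of the property-based part of this curation, the sketch below counts Lipinski Rule of Five violations from precomputed descriptors. The `Descriptors` container, its field names, and the `max_violations` parameter are assumptions made for the example; the Lilly rules and PAINS substructure filters are not covered here.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one compound (hypothetical container)."""
    mol_weight: float   # g/mol
    logp: float         # calculated octanol/water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors


def ro5_violations(d: Descriptors) -> int:
    """Count violations of Lipinski's Rule of Five."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_ro5(d: Descriptors, max_violations: int = 0) -> bool:
    """True if the compound has at most `max_violations` violations."""
    return ro5_violations(d) <= max_violations


# Toy values, invented for illustration.
drug_like = Descriptors(mol_weight=350.4, logp=2.1, h_donors=2, h_acceptors=5)
too_large = Descriptors(mol_weight=720.9, logp=6.3, h_donors=4, h_acceptors=12)
```

Setting `max_violations=1` gives the more permissive variant of the rule that is often used in practice.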

We will screen the ligand library against the pharmacophore model(s). Only compounds whose root mean square deviation (RMSD) relative to the model(s) falls below a set threshold (e.g., < 1.0 Å) will be kept. The compounds that pass this pharmacophore filter will then be virtually screened with our docking program against the published crystal structure of CBL-B. We envision docking at most 100,000 compounds. If more than 100,000 compounds pass the pharmacophore filter, we will cluster them and dock a diverse subset.
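The RMSD criterion can be sketched as below. The example assumes that each ligand's pharmacophore features have already been matched one-to-one to the model's feature points and superposed onto them (no alignment is performed here); the coordinates, names, and the 1.0 Å default are illustrative only.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """RMSD between two equally sized, index-matched sets of 3D points."""
    if len(a) != len(b) or not a:
        raise ValueError("point sets must be non-empty and of equal size")
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def pharmacophore_filter(
    model: Sequence[Point],
    ligands: Dict[str, Sequence[Point]],
    threshold: float = 1.0,  # in angstroms; example value from the text
) -> List[str]:
    """Return names of ligands whose feature RMSD to the model is below threshold."""
    return [name for name, pts in ligands.items() if rmsd(model, pts) < threshold]


# Toy coordinates, invented for illustration.
model = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
ligands = {
    "good_fit": [(0.5, 0.0, 0.0), (3.0, 0.5, 0.0), (0.0, 4.0, 0.5)],
    "poor_fit": [(2.0, 0.0, 0.0), (3.0, 2.0, 0.0), (0.0, 4.0, 2.0)],
}
kept = pharmacophore_filter(model, ligands)
```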

Once the docking is completed, we will use our protein-ligand analyzer algorithm (proprietary code, unpublished) to highlight compounds predicted to engage Y363 through various interaction types (π-π stacking, cation-π, weak hydrogen bonds, hydrophobic contacts, …). We will then extract these compounds for further analysis. To sample a diverse list from the extracted compounds, we will cluster them (by either similarity or shape) and apply statistical methods developed in-house to reduce the number of false positives. We will select up to 200 diverse molecules from this step for follow-up calculations.
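One simple way to draw a diverse subset is greedy MaxMin picking on Tanimoto distance, sketched below. It is used here purely as a stand-in for the clustering step and is not the in-house method; fingerprints are assumed to be sets of on-bit indices, and all names and toy data are hypothetical.

```python
from typing import Dict, List, Set


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    """1 - Tanimoto similarity for fingerprints given as sets of on-bits."""
    shared = len(a & b)
    union = len(a) + len(b) - shared
    return 1.0 - (shared / union if union else 0.0)


def maxmin_pick(fps: Dict[str, Set[int]], n_pick: int, seed: str) -> List[str]:
    """Greedily pick compounds, each time taking the one farthest from the picked set."""
    picked = [seed]
    # Distance from every remaining candidate to its nearest picked compound.
    min_dist = {
        name: tanimoto_distance(fp, fps[seed])
        for name, fp in fps.items()
        if name != seed
    }
    while min_dist and len(picked) < n_pick:
        nxt = max(min_dist, key=min_dist.get)
        picked.append(nxt)
        del min_dist[nxt]
        for name in min_dist:
            min_dist[name] = min(
                min_dist[name], tanimoto_distance(fps[name], fps[nxt])
            )
    return picked


# Toy fingerprints, invented for illustration.
fps = {
    "a": {1, 2, 3},
    "b": {1, 2, 4},   # similar to "a"
    "c": {7, 8, 9},   # shares nothing with "a"
    "d": {3, 7, 8},   # partially overlaps "a" and "c"
}
subset = maxmin_pick(fps, n_pick=2, seed="a")
```

In the workflow above, `n_pick` would correspond to the cap of 200 molecules, and the seed could be, for example, the top-scoring docked compound.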

These compounds will be subjected to short (~30-50 ns) MD simulations using GROMACS [5] and our supercomputing capabilities. This approach has been shown to improve docking predictions and reduce the selection of false positives.[6] From every simulation, we will obtain binding free energies through the MM/PBSA protocol.[7] Moreover, we will perform a per-residue energy decomposition and identify the compounds that interact most strongly with Y363 (π-stacking, cation-π, etc.). These two key pieces of data will inform our selection of the final list of compounds.
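The two quantities used for the final selection can be sketched as follows. In an end-point approach such as MM/PBSA, the binding free energy is estimated per snapshot as G(complex) - G(receptor) - G(ligand) and averaged over the trajectory. The example assumes the per-frame energies and per-residue contributions have already been produced by the MM/PBSA tool; the numbers, the "TYR363" label, and the function names are invented for illustration.

```python
import math
from statistics import mean, stdev
from typing import Dict, List, Sequence, Tuple


def binding_free_energy(
    frames: Sequence[Tuple[float, float, float]]
) -> Tuple[float, float]:
    """Average dG_bind and its standard error from per-frame
    (G_complex, G_receptor, G_ligand) values in kcal/mol."""
    deltas = [g_cpx - g_rec - g_lig for g_cpx, g_rec, g_lig in frames]
    avg = mean(deltas)
    sem = stdev(deltas) / math.sqrt(len(deltas)) if len(deltas) > 1 else 0.0
    return avg, sem


def rank_by_residue(
    decomposition: Dict[str, Dict[str, float]], residue: str
) -> List[str]:
    """Rank compounds by the contribution of one residue (most negative first).
    Compounds without an entry for the residue are treated as contributing 0."""
    return sorted(decomposition, key=lambda c: decomposition[c].get(residue, 0.0))


# Toy per-frame energies (kcal/mol), invented for illustration.
frames = [
    (-1050.0, -1000.0, -20.0),
    (-1054.0, -1001.0, -21.0),
    (-1055.0, -1000.0, -21.0),
]
dg, dg_sem = binding_free_energy(frames)

# Toy per-residue decomposition; residue labels are hypothetical.
decomposition = {
    "cmpd_A": {"TYR363": -1.1, "LEU100": -0.8},
    "cmpd_B": {"TYR363": -3.2, "LEU100": -0.5},
}
ranking = rank_by_residue(decomposition, "TYR363")
```

Reporting the standard error alongside the mean helps judge whether differences between compounds exceed the frame-to-frame noise of such short simulations.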

Closing note: in line with the SGC/CACHE principles, we will document our research progress and make it publicly available for all to follow and reference; we are taking a research-centered approach to this opportunity. We hope that sharing our findings will help guide future efforts in SBDD.

References:

[1] https://www.rcsb.org/structure/8GCY

[2] Moitessier, N. et al., Acc. Chem. Res., 2016, 49(9), 1646-1657.

[3] https://www.nurixtx.com/wp-content/uploads/2022/09/Nurix-CBL-B-DOT-Talk_JK.pdf

[4] Therrien, E. et al., J. Chem. Inf. Model., 2012, 52(1), 210-224.

[5] Abraham, M.J. et al., SoftwareX, 2015, 1, 19-25.

[6] Guterres, H. and Im, W., J. Chem. Inf. Model., 2020, 60(4), 2189-2198.

[7] Valdes-Tresanco, M.S. et al., J. Chem. Theory Comput., 2021, 17(10), 6281-6291.

What makes your approach stand out from the community? (<100 words)

We are well positioned to reach the objective of this challenge. Our team of medicinal and organic chemists, structural biologists, and computational scientists has a breadth of experience and knowledge that has been applied successfully to similar drug discovery projects. We foster a collaborative approach that provides valuable insight to such projects. Moreover, we have supercomputing capabilities that allow us to tackle problems from different perspectives (e.g., microsecond molecular dynamics, high-throughput docking, large-scale cheminformatics on libraries of 100M+ compounds).

Method Name

Hybrid

Commercial software packages used

proprietary drug discovery software

Free software packages used

GROMACS

Relevant publications of previous uses by your group of this software/method

Moitessier, N. et al., Acc. Chem. Res., 2016, 49(9), 1646-1657.

Therrien, E. et al., J. Chem. Inf. Model., 2012, 52(1), 210-224.

Labarre, A. et al., J. Chem. Inf. Model., 2022, 62(4), 1061-1077.

Burai-Patrascu, M. et al., ChemRxiv, 2022, https://doi.org/10.26434/chemrxiv-2022-ncqsj-v2

Nivedha, A.K. et al., ChemRxiv, 2023, https://doi.org/10.26434/chemrxiv-2023-5g38r

Hit Optimization Methods

Method type (check all that apply)

High-throughput docking

Physics-based

Hybrid of the above

Combination of cheminformatics, pharmacophore modeling, molecular dynamics, high-throughput docking

Description of your approach (min 200 and max 800 words)

For the hit SAR stage, the workflow will change as follows:

- We will search for analogues of the hits in the Enamine REAL database using the 2D analogue search module available in our platform. We will then apply the steps outlined above to the new, focused library.

Challenge #4