We will employ a structure-guided drug discovery approach based on a molecular generative model (SAGE) recently developed in our lab. This model is specifically tuned to produce Enamine REAL Space ligands targeting the 3D structure of an input protein binding pocket. Compared to traditional virtual screening, this approach is substantially faster, enabling us to sample from the entire Enamine library space (30 billion compounds) in a structure-guided way. This efficiency makes it practical to generate ligands against multiple receptor conformations, rather than against the single docking grid typically used in virtual screening.
Our approach consists of three steps: (1) structural modeling of the inactive receptor state; (2) sampling of Enamine ligands for each structure with our generative model; and (3) scoring and ranking of the generated ligands using known actives (ComBind scoring).
We will generate multiple structural models of the MCHR1 receptor using two distinct approaches: AlphaFold 2 and homology modeling. For homology modeling, we will use Schrödinger’s Prime software, with inactive-state structures of the GPCRs that have the highest sequence similarity to MCHR1 as templates. Additionally, we will run restrained molecular dynamics (MD) simulations (Amber) to further expand our set of conformational models. The conformations obtained from these methods will then be clustered using the PENSA (Python Ensemble Analysis) library to yield approximately 10 distinct receptor conformations, all in the inactive state.
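To illustrate how a pooled set of models can be reduced to about 10 representatives, the sketch below uses greedy farthest-point (k-center) selection on per-conformation feature vectors. This is a stand-in for illustration only: it is not the PENSA clustering workflow, and the function name `farthest_point_selection` and the feature representation (for example, binding-pocket distances or torsions) are assumptions.

```python
import math


def farthest_point_selection(features, k):
    """Greedily pick up to k mutually distant conformations.

    features: one numeric feature vector per conformation (hypothetical
    input, e.g. pocket distances extracted from each model).
    Returns the indices of the selected conformations.
    """
    if not features or k <= 0:
        return []
    chosen = [0]  # arbitrary seed: the first conformation
    # Distance from each conformation to its nearest selected representative.
    nearest = [math.dist(f, features[0]) for f in features]
    while len(chosen) < min(k, len(features)):
        nxt = max(range(len(features)), key=nearest.__getitem__)
        if nearest[nxt] == 0.0:
            break  # every remaining conformation duplicates a selected one
        chosen.append(nxt)
        nearest = [
            min(d, math.dist(f, features[nxt]))
            for d, f in zip(nearest, features)
        ]
    return chosen
```

Calling `farthest_point_selection(features, 10)` on the pooled AlphaFold 2, homology, and MD-derived models would return indices of up to 10 structurally spread conformations; a centroid-based clustering would instead favor the most populated states.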
Next, we will use our generative model to generate 10,000 Enamine ligands for each input receptor structure (approximately 100,000 molecules in total across the roughly 10 receptor conformations). The model uses geometric deep learning, incorporating the atomic coordinates of the receptor pocket to directly position and join Enamine building block fragments. The model takes a receptor pocket as input; our selection encompasses the standard GPCR orthosteric site within the receptor core. We will ensure that a variety of ligand sizes are sampled, with thresholds guided by known active ligands. We will remove any molecules that share a scaffold with known active ligands.
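The size thresholds and scaffold removal described above can be sketched as a simple post-generation filter. This is a minimal illustration under stated assumptions: the `Ligand` record, the `filter_generated` function, and the `margin` parameter are hypothetical, and the scaffold strings are assumed to be computed upstream (for example, as canonical Bemis-Murcko scaffold SMILES from a cheminformatics toolkit).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ligand:
    ligand_id: str
    heavy_atoms: int  # heavy-atom count, used here as the size measure
    scaffold: str     # canonical scaffold string, computed upstream


def filter_generated(generated, actives, margin=2):
    """Keep generated ligands sized like the actives but with new scaffolds.

    margin (assumed value) widens the heavy-atom window observed among
    the known actives.
    """
    lo = min(a.heavy_atoms for a in actives) - margin
    hi = max(a.heavy_atoms for a in actives) + margin
    known_scaffolds = {a.scaffold for a in actives}
    return [
        g for g in generated
        if lo <= g.heavy_atoms <= hi and g.scaffold not in known_scaffolds
    ]
```

Comparing scaffolds as exact canonical strings is a strict criterion; a looser definition (such as generic frameworks) would remove more molecules and push the set further from known chemotypes.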
Finally, we will use a unique scoring approach, ComBind, to rank our generated ligands. ComBind uses a list of other ligands that are known to bind the same target (“helper ligands”); these ligands are docked in addition to the ligands of interest (“query ligands”). The scoring function rewards similar interactions (e.g. hydrogen bonds and salt bridges) between the helper and query ligands. This interaction similarity score is combined with the standard docking score (Schrödinger’s Glide) to produce a final ranking of the compounds. We will cluster the top 2000 molecules by chemical similarity (3D ECFP fingerprint) and greedily select the top 100 representative ligands from these clusters.
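One possible reading of the final greedy selection step is sketched below: ligands are visited in rank order, and the best remaining ligand from each cluster is taken in turn until the quota is filled. The function name `pick_representatives` and its inputs are hypothetical; the cluster labels are assumed to come from the fingerprint-based clustering, and the ranking from the combined ComBind score.

```python
def pick_representatives(ranked, cluster_of, n_picks):
    """Greedily choose n_picks ligands, spreading picks across clusters.

    ranked: ligand ids ordered best score first.
    cluster_of: mapping from ligand id to cluster label.
    Each pass takes the best remaining ligand of every cluster; further
    passes run only if there are fewer clusters than n_picks.
    """
    picks = []
    remaining = list(ranked)
    while remaining and len(picks) < n_picks:
        seen_clusters = set()
        leftover = []
        for lig in remaining:
            cluster = cluster_of[lig]
            if cluster not in seen_clusters and len(picks) < n_picks:
                seen_clusters.add(cluster)
                picks.append(lig)
            else:
                leftover.append(lig)
        remaining = leftover
    return picks
```

With the top 2000 ranked ligands and `n_picks=100`, this yields one top-scoring representative per cluster when there are at least 100 clusters, and otherwise falls back to second-best members so that the quota is still met.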