Challenge #6

Hit Identification
Method type (check all that apply)
Deep learning
Machine learning
Physics-based
Description of your approach (min 200 and max 800 words)

Our approach combines deep learning (DL) with physics-based scoring to improve the accuracy and efficiency of molecular docking. We build on DiffDock, which frames docking as a generative modeling problem: rather than regressing a single pose, it learns a distribution over ligand poses. DiffDock is a diffusion generative model (DGM) whose diffusion is defined not over all atomic coordinates in ambient 3D space but over the lower-dimensional submanifold of poses reachable by translating, rotating, and twisting the ligand. Restricting training and sampling to this submanifold reduces the degrees of freedom the model has to learn.

In this formulation, each ligand degree of freedom corresponds to a mathematical group: translations to the 3D translation group T(3), rigid rotations to the 3D rotation group SO(3), and torsion-angle changes to one copy of the 2D rotation group SO(2) per rotatable bond. A pose update is therefore an element of the product of these groups. Because only these degrees of freedom change, the bond lengths, bond angles, and ring conformations of the input ligand are left undisturbed.
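As an illustration of these three kinds of moves, the following is a minimal sketch in plain Python. It is not DiffDock's implementation: the function names, the four-atom toy ligand, and the order in which the moves are applied are all assumptions made for this example. It applies one torsion change, one rigid rotation, and one translation, then checks that bond lengths are unchanged.

```python
import math


def rotate_about_axis(point, origin, axis, angle):
    """Rotate `point` by `angle` about the line through `origin` along `axis`
    (Rodrigues' rotation formula)."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]
    v = [p - o for p, o in zip(point, origin)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    rotated = [
        v[i] * cos_a + k_cross_v[i] * sin_a + k[i] * k_dot_v * (1.0 - cos_a)
        for i in range(3)
    ]
    return [r + o for r, o in zip(rotated, origin)]


def centroid(coords):
    n = len(coords)
    return [sum(c[i] for c in coords) / n for i in range(3)]


def apply_pose_update(coords, translation, rot_axis, rot_angle, torsions):
    """Apply torsion (SO(2)), rigid rotation (SO(3)), then translation (T(3)).

    `torsions` is a list of (atom_a, atom_b, moving_atoms, angle) tuples:
    the atoms in `moving_atoms` rotate about the bond a-b. (Hypothetical
    format chosen for this sketch.)
    """
    new = [list(c) for c in coords]
    for a, b, moving_atoms, angle in torsions:
        origin = list(new[b])
        axis = [new[b][i] - new[a][i] for i in range(3)]
        for idx in moving_atoms:
            new[idx] = rotate_about_axis(new[idx], origin, axis, angle)
    center = centroid(new)
    new = [rotate_about_axis(c, center, rot_axis, rot_angle) for c in new]
    return [[c[i] + translation[i] for i in range(3)] for c in new]


# Toy four-atom chain (coordinates in angstroms, made up for illustration).
ligand = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.4, 0.0], [3.5, 1.4, 0.3]]
bonds = [(0, 1), (1, 2), (2, 3)]

updated = apply_pose_update(
    ligand,
    translation=[1.0, -2.0, 0.5],
    rot_axis=[0.0, 0.0, 1.0],
    rot_angle=math.pi / 3,
    torsions=[(1, 2, [3], math.pi / 2)],  # twist atom 3 about bond 1-2
)

for i, j in bonds:
    before = math.dist(ligand[i], ligand[j])
    after = math.dist(updated[i], updated[j])
    print(f"bond {i}-{j}: {before:.4f} -> {after:.4f}")
```

The bond lengths printed before and after agree, while the distance between the two end atoms changes because of the torsion. That is the sense in which these moves alter the pose without disturbing local geometry.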

The DiffDock framework comprises two models: the Score Model and the Confidence Model. The Score Model takes the current ligand pose and the protein structure as input and outputs an update for each degree of freedom: SE(3)-equivariant vectors for the translation and the rotation, and one SE(3)-invariant scalar for each rotatable bond. The architecture uses SE(3)-equivariant convolutional networks, so the outputs transform consistently when the complex is rotated or translated, while supporting multiscale representations and efficient computation.
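The difference between an equivariant vector and an invariant scalar can be checked numerically. The sketch below uses a hand-written toy function, not a neural network and not DiffDock code: `toy_outputs`, the coordinates, and the rigid motion are all invented for illustration. The vector output rotates with the complex, while the scalar output stays the same.

```python
import math


def rotate(p, angle_z, angle_x):
    """Rotate point p about the z axis, then about the x axis."""
    x, y, z = p
    cz, sz = math.cos(angle_z), math.sin(angle_z)
    x, y = cz * x - sz * y, sz * x + cz * y
    cx, sx = math.cos(angle_x), math.sin(angle_x)
    y, z = cx * y - sx * z, sx * y + cx * z
    return (x, y, z)


def rigid_motion(p, angle_z=0.7, angle_x=-1.1, shift=(3.0, -1.0, 2.0)):
    """An arbitrary rototranslation applied to one point."""
    q = rotate(p, angle_z, angle_x)
    return tuple(q[i] + shift[i] for i in range(3))


def toy_outputs(ligand, protein):
    """Return (vector, scalar) built only from relative positions and distances.

    vector: distance-weighted pull of protein atoms on the ligand centroid.
    scalar: sum of the same distance-based weights.
    """
    n = len(ligand)
    center = tuple(sum(a[i] for a in ligand) / n for i in range(3))
    vec = [0.0, 0.0, 0.0]
    scalar = 0.0
    for atom in protein:
        d = math.dist(atom, center)
        w = math.exp(-d)
        for i in range(3):
            vec[i] += w * (atom[i] - center[i])
        scalar += w
    return tuple(vec), scalar


ligand = [(0.0, 0.0, 0.0), (1.4, 0.2, -0.3), (2.1, 1.3, 0.4)]
protein = [(4.0, 1.0, 0.0), (-2.0, 3.0, 1.5), (0.5, -3.5, 2.0)]

vec, scalar = toy_outputs(ligand, protein)
moved_vec, moved_scalar = toy_outputs(
    [rigid_motion(a) for a in ligand],
    [rigid_motion(a) for a in protein],
)

# Equivariance: moving the inputs rotates the vector output.
expected_vec = rotate(vec, 0.7, -1.1)
print(all(abs(a - b) < 1e-9 for a, b in zip(moved_vec, expected_vec)))
# Invariance: the scalar output does not change.
print(abs(moved_scalar - scalar) < 1e-9)
```

Equivariant network layers provide this behavior by construction for learned functions, which is why the translation and rotation outputs can be vectors and the per-bond torsion outputs can be scalars.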

The Confidence Model, in contrast, takes a sampled ligand pose together with the protein and outputs a single scalar that estimates how plausible and stable the pose is. This scalar is SE(3)-invariant with respect to joint rototranslations of the ligand and protein, and it is used to rank the sampled poses and refine the final docking predictions.
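A minimal sketch of how such a scalar can be used to select poses is shown below. The pose identifiers and confidence values are made up, and `rank_poses` is a hypothetical helper rather than part of DiffDock or Compass.

```python
def rank_poses(poses, top_k=1):
    """Return the `top_k` poses with the highest confidence, best first."""
    ranked = sorted(poses, key=lambda pose: pose["confidence"], reverse=True)
    return ranked[:top_k]


# Hypothetical sampled poses with made-up confidence scores.
sampled = [
    {"id": "pose_0", "confidence": -1.2},
    {"id": "pose_1", "confidence": 0.8},
    {"id": "pose_2", "confidence": 0.1},
]

best = rank_poses(sampled, top_k=2)
print([pose["id"] for pose in best])
```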

To validate the ligand conformations predicted by DiffDock, we use Compass [1], a tool we developed to assess the physicochemical and bioactivity features of docked molecules. Compass integrates modules such as PoseCheck and AA-Score to analyze steric clashes, strain energy, binding affinity, and the interaction fingerprint of each complex. These assessments complement the DiffDock confidence score by checking the interactions and energetics of the protein-ligand complex.

PoseCheck evaluates strain energy and steric clashes, both of which indicate whether a binding pose is physically plausible. AA-Score, an empirical scoring function, complements this evaluation by quantifying interactions at the level of individual amino acids, including hydrogen bonds, electrostatic and van der Waals forces, hydrophobic contacts, and π-π stacking.
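To make the idea of a steric clash concrete, here is a small sketch that counts ligand-protein atom pairs closer than the sum of their van der Waals radii minus a tolerance. It does not call PoseCheck: the function name, the 0.5 Å tolerance, and the toy coordinates are assumptions for illustration, and the radii are standard Bondi values.

```python
import math

# Bondi van der Waals radii in angstroms for a few common elements.
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def count_clashes(ligand_atoms, protein_atoms, tolerance=0.5):
    """Count ligand-protein atom pairs that overlap sterically.

    Each atom is an (element, (x, y, z)) tuple. A pair clashes when its
    distance is below r_vdw(a) + r_vdw(b) - tolerance. The tolerance value
    is an assumption made for this sketch.
    """
    clashes = 0
    for lig_elem, lig_xyz in ligand_atoms:
        for prot_elem, prot_xyz in protein_atoms:
            cutoff = VDW_RADII[lig_elem] + VDW_RADII[prot_elem] - tolerance
            if math.dist(lig_xyz, prot_xyz) < cutoff:
                clashes += 1
    return clashes


# Toy coordinates (angstroms): only the ligand carbon sits too close
# to the protein nitrogen.
ligand_atoms = [("C", (0.0, 0.0, 0.0)), ("O", (5.0, 0.0, 0.0))]
protein_atoms = [("N", (2.0, 0.0, 0.0)), ("C", (9.0, 0.0, 0.0))]

n_clashes = count_clashes(ligand_atoms, protein_atoms)
print(n_clashes)
```

A pose with many such overlaps is unlikely to be physically realistic, whatever its docking score.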

Overall, our approach pairs generative pose prediction with physics-based validation to produce docking predictions that are both accurate and physically plausible, with the aim of supporting the discovery and development of therapeutics.

 

What makes your approach stand out from the community? (<100 words)

Our approach couples DiffDock, a diffusion generative model (DGM) for molecular docking, with physics-based validation. DiffDock learns a distribution over ligand poses on the submanifold defined by translations, rotations, and torsions, which improves both efficiency and accuracy, and its SE(3)-equivariant convolutional networks handle spatial transformations robustly. We then validate the predicted poses with Compass, using PoseCheck and AA-Score to assess physicochemical properties and bioactivity, so that the selected poses are physically plausible as well as highly ranked.

Method Name
Compass
Commercial software packages used

None

Free software packages used

DiffDock, PoseCheck, AA-Score, Open Babel, RDKit, py3Dmol, biopandas, ESM-2 (fair-esm), ProLIF, datamol, PyTorch, e3nn, NVIDIA CUDA, ProDy, pybel, pytorch-lightning, torch-geometric

Relevant publications of previous uses by your group of this software/method

[1] Ahmet Sarigun, Vedran Franke, Altuna Akalin. Compass: A Comprehensive Tool for Accurate and Efficient Molecular Docking in Inference and Fine-Tuning. arXiv preprint arXiv:2406.06841, 2024.