Computational methods

Hit Identification

Method type (check all that apply)

Deep learning

High-throughput docking

Machine learning

Description of your approach (min 200 and max 800 words)

We will build on an analysis pipeline we have developed that can search billion-scale small-molecule libraries for candidate binders to a target pocket. In the first phase, we will perform a fast, approximate affinity prediction using graph neural networks (GNNs). We have developed GNNs that compute representations of both the ligand and the protein pocket from a diverse collection of surface properties. A multi-layer perceptron combines these representations in a docking-free integration step, producing a rapid binding-affinity prediction for each of the billions of ligands against the target pocket. We will fine-tune this integration model on ADP-bound pockets from PDBbind. This phase will yield a down-sampled library enriched for molecules with high predicted binding affinity.
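
The docking-free integration step can be illustrated with a minimal sketch. It assumes the ligand and pocket embeddings have already been computed by the GNNs, that a higher score means stronger predicted affinity, and that a single hidden layer is enough for illustration. The names mlp_score, top_k and screen, and all weights and dimensions, are hypothetical and not part of the actual pipeline.

```python
import heapq
from typing import Iterable, List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def mlp_score(ligand_vec: Vector, pocket_vec: Vector,
              w1: Matrix, b1: Vector, w2: Vector, b2: float) -> float:
    """One-hidden-layer MLP on the concatenated embeddings (illustrative weights)."""
    x = list(ligand_vec) + list(pocket_vec)  # docking-free: no pose is needed
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)  # ReLU units
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def top_k(scored: Iterable[Tuple[str, float]], k: int) -> List[Tuple[str, float]]:
    """Return the k best-scoring (id, score) pairs from a stream, best first."""
    return heapq.nlargest(k, scored, key=lambda item: item[1])


def screen(ligands: Iterable[Tuple[str, Vector]], pocket_vec: Vector,
           w1: Matrix, b1: Vector, w2: Vector, b2: float,
           k: int) -> List[Tuple[str, float]]:
    """Score each ligand lazily so the full library never sits in memory."""
    scored = ((lig_id, mlp_score(vec, pocket_vec, w1, b1, w2, b2))
              for lig_id, vec in ligands)
    return top_k(scored, k)
```

Because the pocket embedding is computed once and each ligand needs only a forward pass through a small network, the per-ligand cost stays low; streaming the scores through a bounded selection keeps memory roughly proportional to the size of the down-sampled library instead of the full one.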

The ligands in this down-sampled library (~1M candidates) will be docked to the target pocket using AutoDock Vina, and the candidate poses will be rescored by a neural network, a variant of the integration network above that incorporates pose-informed interaction features. The top 1,000 candidates from this phase will be subjected to coarse-grained molecular dynamics (CGMD) with the MARTINI force field, and the top 100 candidates will be selected for submission.
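
The staged narrowing described above can be sketched as a generic funnel. This is an assumption-laden illustration, not the pipeline's implementation: Stage and run_funnel are hypothetical names, and each stage's scorer is a caller-supplied placeholder standing in for the GNN prediction, the docking-plus-rescoring step, or the CGMD analysis. Scores are assumed to follow a higher-is-better convention, so a docking energy in kcal/mol (where more negative is better) would be negated before ranking.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Stage:
    name: str                       # label only, e.g. "dock+rescore"
    score: Callable[[str], float]   # placeholder scorer; higher is better
    keep: int                       # number of candidates passed onward


def run_funnel(ligands: Sequence[str], stages: Sequence[Stage]) -> List[str]:
    """Apply each stage in order, keeping only its top-ranked candidates."""
    survivors = list(ligands)
    for stage in stages:
        ranked = sorted(survivors, key=stage.score, reverse=True)
        survivors = ranked[:stage.keep]
    return survivors


# Usage with toy scorers; the real stages would keep ~1M, 1,000 and 100.
toy_ligands = ["lig%d" % i for i in range(10)]
toy_stages = [
    Stage("fast", lambda s: float(s[3:]), keep=5),
    Stage("dock+rescore", lambda s: -float(s[3:]), keep=2),
]
finalists = run_funnel(toy_ligands, toy_stages)
```

Ordering the stages from cheapest to most expensive means each costlier method only sees the candidates that survived the previous filter.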

What makes your approach stand out from the community? (<100 words)

The distinguishing feature of our approach is a graph neural network trained to learn representations of the ligand and the protein pocket, from which a fast deep network predicts affinity without docking. In our tests, the model performs well at predicting protein-ligand interactions. A second distinguishing feature is the multi-stage pipeline: fast approximate affinity prediction, followed by slower docking-based prediction, then CGMD.

Method Name

PocketPackerPicker

Commercial software packages used

N/A

Free software packages used

AutoDock Vina

PLIP

MARTINI

Relevant publications of previous uses by your group of this software/method

(not listed, to preserve anonymity for review)

Challenge #3