Computational methods

Hit Identification

Method type (check all that apply)

Deep learning

High-throughput docking

Machine learning

Physics-based

Hybrid of the above

Combination of the above.

Description of your approach (min 200 and max 800 words)

An ensemble of protein conformations will be generated using AlphaFold with MSA subsampling and, possibly, alternative models. Selected conformations will be relaxed with molecular dynamics simulations. The simulations will be analyzed to identify druggable pockets that are consistent with published data (e.g., known inhibitors dock well). A limited selection of protein conformers and binding sites from this analysis will serve as the targets of a high-throughput docking protocol. To screen the Enamine library effectively within the given time frame, a “deep docking” approach will be used, in which a surrogate model of docking scores is iteratively trained to select compounds for docking. Docking will be performed using GNINA with its default parameters, which include convolutional neural network scoring functions. The final hits will be filtered for solubility, predicted using a deep learning (DL) model. Compounds will be clustered by fingerprint Tanimoto similarity, and the top 3 of each cluster (according to GNINA’s CNN_VS score) will be selected for evaluation. If some selected compounds are significantly more expensive than others, they may be skipped in favor of purchasing more potential hits. Curation and selection of the final hit list will be determined entirely by docking scores, computational filters, and cost considerations, with no human expert selection.
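The final selection step (cluster by Tanimoto similarity, then keep the top 3 per cluster by CNN_VS) can be illustrated with a minimal sketch. Several things here are assumptions and not part of the submitted protocol: fingerprints are represented as sets of on-bit indices, clustering uses a simple greedy leader scheme (the description does not name a clustering algorithm), the similarity cutoff of 0.5 is arbitrary, higher CNN_VS is treated as better, and the compound IDs and scores are made up for illustration.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: a fingerprint is the set of its "on" bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared on-bits divided by all distinct on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_hits(
    compounds: List[Tuple[str, Fingerprint, float]],
    threshold: float = 0.5,  # assumed similarity cutoff, not from the protocol
    per_cluster: int = 3,    # "top 3 of each cluster" from the description
) -> List[str]:
    """Greedy leader clustering, then keep the best-scoring members per cluster."""
    # Sort by score, best first (assumes a higher CNN_VS is better), so each
    # cluster leader is that cluster's best-scoring compound.
    ranked = sorted(compounds, key=lambda c: c[2], reverse=True)
    clusters: List[Tuple[Fingerprint, List[str]]] = []  # (leader fp, member ids)
    for cid, fp, _score in ranked:
        for leader_fp, members in clusters:
            if tanimoto(fp, leader_fp) >= threshold:
                members.append(cid)
                break
        else:
            # No existing leader is similar enough: start a new cluster.
            clusters.append((fp, [cid]))
    # Members were appended in descending score order, so slicing gives the top N.
    return [cid for _fp, members in clusters for cid in members[:per_cluster]]


# Made-up example: four close analogs (A series) and one unrelated compound (B1).
library = [
    ("cpd_A1", frozenset({1, 2, 3, 4}), 9.1),
    ("cpd_A2", frozenset({1, 2, 3, 5}), 8.7),
    ("cpd_A3", frozenset({1, 2, 3, 6}), 8.2),
    ("cpd_A4", frozenset({1, 2, 3, 7}), 7.9),
    ("cpd_B1", frozenset({10, 11, 12}), 8.5),
]
hits = select_hits(library)
```

In this toy library the fourth analog, cpd_A4, is dropped because its cluster already contributes three better-scoring members, while cpd_B1 is kept as the sole member of its own cluster. In practice the fingerprints and similarity would come from a cheminformatics toolkit, and the cost-based skipping described above would be applied to the resulting list.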

What makes your approach stand out from the community? (<100 words)

In CACHE 1 and 2, we used similar methods with good results. By continuing to apply the same method, with the same people, we provide continuity of evaluation across the different CACHE targets. Can we continue the trend even for a target that is not structurally enabled?

Method Name

GNINA FTW

Commercial software packages used

None.

Free software packages used

GNINA, AMBER (partially free), OpenFold/LocalColabFold

Relevant publications of previous uses by your group of this software/method

GNINA 1.0: Molecular docking with deep learning (Primary application citation)

A McNutt, P Francoeur, R Aggarwal, T Masuda, R Meli, M Ragoza, J Sunseri, DR Koes. J. Cheminformatics, 2021


Challenge #5