Computational methods

Hit Identification

Method type (check all that apply)

De novo design

Deep learning

Machine learning

Physics-based

Hybrid of the above

We employ both generative deep learning/reinforcement learning approaches and physics-based docking and scoring to nominate hits from the Enamine library.

Description of your approach (min 200 and max 800 words)

Our approach combines the expertise of the Kozakov Lab at Stony Brook and the Tropsha Lab at UNC. Our workflow uses several complementary modules to identify high-affinity hits for a protein target with a known 3D structure. Binding-site hot-spot information, combined with conventional structure-based virtual screening methods, is the enabling component of our hit selection approach.
We use FTMap, a computational mapping algorithm that identifies the regions on the surface of the target protein that contribute most to the ligand binding free energy, together with soaked-fragment data from Fragalysis (when available). FTMap exhaustively samples positions of small organic molecule probes and scores them using a physical energy function. Binding-site regions that bind multiple probes (computational or experimental) are considered hot spots that favor the corresponding chemical functional groups.
In addition to the hypothesis-free FTMap, we use a second approach to hot-spot identification, LigTBM, which was inspired by methods that search for binding-site similarity across the structural proteome. The basic idea is to match the physicochemical environment of the target protein to micro-pockets containing FTMap-style small organic molecule probes extracted from PDB structures with bound ligands. This matching procedure also yields possible fragment placements within the target protein, so the data are presented in the same form as FTMap data, which facilitates comparative analysis and identification of consensus hot spots.
The hot-spot information is used to create a pharmacophore model for the next stage: a pharmacophore-based virtual screen of the entire Enamine REAL library (~40B with tautomers) that selects a subset of target-specific compounds based on their fit to our pharmacophore hypothesis.
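The consensus hot-spot idea described above (regions that bind multiple probes, whether computational or experimental) can be illustrated with a minimal sketch. This is not the FTMap or LigTBM algorithm: the greedy distance-based clustering, the `Probe` record, the function name, and the `cutoff` and `min_probe_types` values are all assumptions made for illustration only.

```python
from __future__ import annotations

from math import dist
from typing import NamedTuple


class Probe(NamedTuple):
    """Hypothetical record for one placed probe or soaked fragment."""
    probe_type: str                   # e.g. "ethanol", "acetone", or a fragment ID
    source: str                       # e.g. "FTMap", "LigTBM", "Fragalysis"
    xyz: tuple[float, float, float]   # probe centre in angstroms


def consensus_hot_spots(probes, cutoff=4.0, min_probe_types=3):
    """Group nearby probes and keep groups hit by several distinct probe types.

    Greedy clustering: each probe joins the first cluster whose seed probe
    lies within `cutoff`; otherwise it seeds a new cluster. Both thresholds
    are illustrative defaults, not values taken from the methods above.
    """
    clusters = []
    for probe in probes:
        for members in clusters:
            if dist(probe.xyz, members[0].xyz) <= cutoff:
                members.append(probe)
                break
        else:
            clusters.append([probe])

    spots = []
    for members in clusters:
        types = {m.probe_type for m in members}
        if len(types) < min_probe_types:
            continue  # too few distinct probe types to count as a hot spot
        n = len(members)
        center = tuple(sum(m.xyz[i] for m in members) / n for i in range(3))
        spots.append({
            "center": center,
            "probe_types": sorted(types),
            "sources": sorted({m.source for m in members}),
        })
    # Sites that bind more distinct probe types are ranked first.
    spots.sort(key=lambda s: len(s["probe_types"]), reverse=True)
    return spots
```

Because each record carries its source, computational probes and experimental fragments can be pooled in a single list before clustering, and the `sources` field of each site shows which methods agree on it.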
In addition, we use the pharmacophore models to bias our deep reinforcement learning method, ReLeaSE, to generate novel target-specific hit compounds. The combined set of virtual screening hits and de novo generated compounds (usually ~1M) is then docked into the binding site using Glide (Schrödinger). The top-scoring docking hits are then further prioritized using the hot-spot information. These consensus hits are nominated for experimental testing.
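The final prioritization step (top-scoring docking hits re-ranked using hot-spot information) can be sketched as below. This is a minimal illustration, not the actual selection protocol: the `HotSpot` and `DockedHit` records, the coverage measure, and the `top_n` and `min_coverage` parameters are hypothetical, and docking scores are assumed to follow the convention that more negative is better.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class HotSpot:
    center: tuple[float, float, float]  # angstroms
    radius: float                       # angstroms


@dataclass(frozen=True)
class DockedHit:
    name: str
    docking_score: float                            # assumed: more negative is better
    atoms: tuple[tuple[float, float, float], ...]   # heavy-atom coordinates of the pose


def hot_spot_coverage(hit, hot_spots):
    """Fraction of hot spots that contain at least one atom of the docked pose."""
    if not hot_spots:
        return 0.0
    covered = sum(
        any(dist(atom, spot.center) <= spot.radius for atom in hit.atoms)
        for spot in hot_spots
    )
    return covered / len(hot_spots)


def prioritize(hits, hot_spots, top_n, min_coverage=0.5):
    """Keep the `top_n` best-scoring poses, then re-rank them by coverage.

    Poses covering fewer than `min_coverage` of the hot spots are dropped;
    ties in coverage are broken by docking score.
    """
    top = sorted(hits, key=lambda h: h.docking_score)[:top_n]
    scored = [(hot_spot_coverage(h, hot_spots), h) for h in top]
    kept = [(cov, h) for cov, h in scored if cov >= min_coverage]
    kept.sort(key=lambda pair: (-pair[0], pair[1].docking_score))
    return [h for _, h in kept]
```

Filtering by docking score first and coverage second mirrors the order described above; a weighted combination of the two would be an equally plausible design.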

What makes your approach stand out from the community? (<100 words)

We predominantly employ methodologies and software developed within our groups. A unique feature of our approach is the use of both experimentally derived (Fragalysis) and computationally derived (FTMap and LigTBM) hot spots to formulate the pharmacophore hypotheses. A further unique element is the use of generative and reinforcement learning, biased by the pharmacophore hypotheses, to design computational hits de novo and then nominate both these hits and similar Enamine compounds as candidates for experimental testing. The latter strategy is especially useful in the next phase of hit optimization.

Method Name

Frag2Hits

Commercial software packages used

Glide, to additionally verify nominated hits

Free software packages used

AutoDock, FTMap, LigTBM, ReLeaSE

Relevant publications of previous uses by your group of this software/method

Kozakov D, Grove LE, Hall DR, Bohnuud T, Mottarella SE, Luo L, Xia B, Beglov D, Vajda S. The FTMap family of web servers for determining and characterizing ligand-binding hot spots of proteins. Nature Protocols. 2015

Popova M, Isayev O, Tropsha A. Deep reinforcement learning for de novo drug design. Sci Adv. 2018 Jul 25;4(7):eaap7885. doi: 10.1126/sciadv.aap7885.

Alekseenko, A.; Kotelnikov, S.; Ignatov, M.; Egbert, M.; Kholodov, Y.; Vajda, S.; Kozakov, D. ClusPro LigTBM: Automated Template-Based Small Molecule Docking. J. Mol. Biol. 2019. https://doi.org/10.1016/j.jmb.2019.12.011

Korshunova M, Ginsburg B, Tropsha A, Isayev O. OpenChem: A Deep Learning Toolkit for Computational Chemistry and Drug Design. J Chem Inf Model. 2021 Jan 25;61(1):7-13. doi: 10.1021/acs.jcim.0c00971.

Challenge #2