Computational methods

Hit Identification

Method type (check all that apply)

Deep learning

High-throughput docking

Description of your approach (min 200 and max 800 words)

We intend to employ KarmaDock, an innovative deep learning (DL) paradigm for ligand docking developed in-house, to carry out a hierarchical virtual screening of the Enamine database. The initial strategy consists of the following steps:

Prepare the protein with the Protein Preparation Wizard in the Schrödinger suite: assign bond orders, add hydrogens, create disulfide bonds between proximal sulfur atoms, fill in missing side chains and loops, and optimize the hydrogen-bonding network using the OPLS3 force field. Assign the protonation states of residues with PROPKA at pH 7.0, and determine the ionization states of heteroatom groups with Epik.
Run a docking-based screen of the Enamine molecules with KarmaDock and retain the molecules with the highest scores.
Rescore the molecules retained by KarmaDock with the classical Glide SP method in the Schrödinger suite, and retain those with the lowest (most favorable) scores for further analysis.
Apply negative design strategies (e.g., drug-likeness filters, PAINS rules, and REOS rules) as a preliminary screen to exclude undesirable molecules.
Assess the ADMET properties of the remaining compounds with the ADMET module in the Schrödinger suite and remove molecules with poor ADMET properties.
Cluster the remaining molecules, inspect their binding modes, and select for procurement a subset with high KarmaDock scores, low Glide SP scores, reasonable ADMET properties, and favorable interaction modes.
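
The selection logic of the steps above can be illustrated with a minimal sketch. It is not the KarmaDock or Glide implementation: the function name hierarchical_screen, the callables rescore_glide_sp and passes_filters, the cutoffs, and the toy scores are all hypothetical placeholders. The sketch assumes only what the steps state, namely that higher KarmaDock scores and lower Glide SP scores are better, and that filters are applied after rescoring.

```python
from typing import Callable, Dict, List


def hierarchical_screen(
    karmadock_scores: Dict[str, float],
    rescore_glide_sp: Callable[[str], float],
    passes_filters: Callable[[str], bool],
    n_keep_karmadock: int,
    n_keep_glide: int,
) -> List[str]:
    """Toy screening funnel; every name here is a hypothetical placeholder."""
    # Stage 1: keep the molecules with the highest KarmaDock scores.
    stage1 = sorted(karmadock_scores, key=karmadock_scores.get, reverse=True)
    stage1 = stage1[:n_keep_karmadock]

    # Stage 2: rescore only the survivors; a lower Glide SP score is better.
    glide = {mol_id: rescore_glide_sp(mol_id) for mol_id in stage1}
    stage2 = sorted(glide, key=glide.get)[:n_keep_glide]

    # Stage 3: drop molecules flagged by the negative design / ADMET filters.
    return [mol_id for mol_id in stage2 if passes_filters(mol_id)]


# Made-up scores for four molecules, purely for illustration.
karma = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.1}
glide_table = {"A": -5.0, "B": -9.0, "C": -7.5, "D": -10.0}
flagged = {"C"}  # stands in for a PAINS / REOS / ADMET failure

selected = hierarchical_screen(
    karma,
    rescore_glide_sp=glide_table.__getitem__,
    passes_filters=lambda mol_id: mol_id not in flagged,
    n_keep_karmadock=3,
    n_keep_glide=2,
)
print(selected)
```

In this toy run, molecule D has the best Glide SP score but is never rescored, because it does not survive the KarmaDock stage. This is the intended trade-off of a hierarchical screen: the fast model decides what the slower method sees.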

What makes your approach stand out from the community? (<100 words)

Unlike other methods, KarmaDock is pretrained with a mixture density network, which introduces a distance inductive bias into the shared encoders and helps guide pose generation. Validated on three benchmark datasets, it is 130 times faster than LeDock and achieves a higher docking success rate (89.1% vs. 82.5% for LigPose) and better scoring accuracy (BEDROC: 0.519 vs. 0.378 for Glide SP). In a virtual screening project, it identified experimentally validated LTK inhibitors. This performance makes KarmaDock well suited for large-scale virtual screening. The manuscript is currently under review at Nature Computational Science.

Method Name

KarmaDock

Commercial software packages used

Schrödinger

Free software packages used

AutoDock Tools, OpenBabel, RDKit, PyTorch, MDAnalysis, pytorch_geometric, ProDy.

Relevant publications of previous uses by your group of this software/method

None

Challenge #4