Computational methods

Hit Identification

Method type (check all that apply)

High-throughput docking

Physics-based

Hybrid of the above

Use of docking programs that are based in part on classical physics.

Description of your approach (min 200 and max 800 words)

We will carry out structure-based ultra-large virtual screens with VirtualFlow, an open-source platform for ultra-large virtual screening.

Step 1: Protein preparation
Protein structures will be prepared with Maestro from Schrödinger (e.g., protonation state assignment, addition of missing atoms and side chains, addition of hydrogen atoms). MD simulations of the target protein will be carried out with Amber 18. The resulting conformations will be clustered, and a representative structure of each cluster will be used in the virtual screens.

Step 2: Hit identification
Hit identification will consist of two virtual screening stages.

- Stage 1: We will screen a ligand library of over 20 billion molecules using the physics-based docking programs AutoDock Vina, QuickVina, Smina, and PLANTS, as well as several deep-learning-based docking programs. The library we will screen is the Enamine REAL Space. The ligand libraries we will use are those provided as part of the VirtualFlow project; their ligands have been protonated and tautomerized, their 3D conformations have been computed, and they are in the ready-to-dock PDBQT format. We will use a new adaptive screening technique that we have developed. Because of the large-scale computations this approach requires, we will use our university's supercomputer and, if additional computation time is needed, the cloud. We have extensive experience with cloud computing and have used over 1 million CPUs in parallel in the past. The protein will be held rigid in stage 1.

- Stage 2: We will rescreen the top 1 million compounds from stage 1, this time allowing the protein side chains at the binding site to be flexible. In addition, multiple protein backbone conformations from the MD simulations in Step 1 may be used for ensemble docking.
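The hand-off from stage 1 to stage 2 amounts to keeping the best-scoring compounds from a very large stream of docking results. A minimal sketch, assuming each result is a (compound ID, docking score) pair with more negative scores being better; the function names and the example data are hypothetical and are not part of VirtualFlow:

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Score = Tuple[str, float]  # (compound ID, docking score; more negative is better)


def top_hits(scores: Iterable[Score], n: int) -> List[Score]:
    """Return the n best-scoring compounds, best first.

    heapq.nsmallest keeps at most n items in memory, so the full
    stage 1 result set never has to be loaded at once.
    """
    return heapq.nsmallest(n, scores, key=lambda item: item[1])


def stage1_scores() -> Iterator[Score]:
    # Hypothetical stand-in for a stream of stage 1 docking results.
    yield from [
        ("cmpd-A", -7.2),
        ("cmpd-B", -9.1),
        ("cmpd-C", -6.4),
        ("cmpd-D", -8.5),
    ]


# In the workflow described above, n would be 1 million.
stage2_input = top_hits(stage1_scores(), n=2)
print(stage2_input)
```

With the four example compounds and n=2, the selection keeps cmpd-B and cmpd-D, in that order.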
Step 3: Hit optimization
In the second round of the challenge (hit optimization), we will search the chemical space (our available libraries) for the closest analogs of the hits and screen them with VirtualFlow as described in Step 2.

Step 4: Postprocessing of the results
The screened compounds will be ranked by docking score. For the top 1000 compounds, biophysical and pharmacokinetic properties will be computed, and compounds with unfavorable properties (e.g., too high logP) will be filtered out. In the lead optimization phase, we will again search the chemical space for the closest analogs and screen them with VirtualFlow.
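The ranking and filtering in Step 4 can be sketched as follows. This is an illustration under stated assumptions: the property values are taken as precomputed by a separate tool, the logP cutoff of 5.0 is an assumed example value (no threshold is specified above), and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Compound:
    compound_id: str
    docking_score: float  # more negative is better
    logp: float           # assumed to be precomputed elsewhere


def postprocess(
    compounds: Iterable[Compound],
    top_n: int = 1000,
    max_logp: float = 5.0,  # assumed cutoff, for illustration only
) -> List[Compound]:
    """Rank by docking score, keep the top_n, then drop high-logP compounds."""
    ranked = sorted(compounds, key=lambda c: c.docking_score)
    return [c for c in ranked[:top_n] if c.logp <= max_logp]


# Hypothetical example data.
screened = [
    Compound("cmpd-A", -9.5, 6.2),
    Compound("cmpd-B", -9.0, 3.1),
    Compound("cmpd-C", -8.0, 4.9),
    Compound("cmpd-D", -7.0, 2.0),
]

kept = postprocess(screened, top_n=3)
print([c.compound_id for c in kept])
```

Here cmpd-D falls outside the top 3 by docking score and cmpd-A is removed by the logP filter, leaving cmpd-B and cmpd-C.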

We have prior experience with ultra-large virtual screens against SARS-CoV-2 proteins (see the references below).

What makes your approach stand out from the community? (<100 words)

Our ultra-large virtual screens are among the largest reported to date. In 2020, we reported one of the first screens of over 1 billion compounds. In this work, we plan to screen over 20 billion compounds. The size of the screened ligand library correlates directly with the potency and the true hit rate observed in experimental validation (Gorgulla 2020, Lyu 2019, Alon 2021).

Method Name

VirtualFlow/Ultra-Large Virtual Screens

Commercial software packages used

Maestro (protein preparation)

Free software packages used

VirtualFlow, AutoDock Vina, QuickVina, Smina, PLANTS, GWOVina

Relevant publications of previous uses by your group of this software/method

Gorgulla, Christoph, et al. "An open-source drug discovery platform enables ultra-large virtual screens." Nature 580.7805 (2020): 663-668. https://www.nature.com/articles/s41586-020-2117-z

Gorgulla, Christoph, et al. "A multi-pronged approach targeting SARS-CoV-2 proteins using ultra-large virtual screening." iScience 24.2 (2021): 102021. https://www.sciencedirect.com/science/article/pii/S2589004220312189

Gorgulla, Christoph, et al. "Accounting of receptor flexibility in ultra-large virtual screens with VirtualFlow using a grey wolf optimization method." Supercomputing Frontiers and Innovations 7.3 (2020): 4. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8530406/

Gorgulla, Christoph, et al. "VirtualFlow Ants—Ultra-Large Virtual Screenings with Artificial Intelligence Driven Docking Algorithm Based on Ant Colony Optimization." International Journal of Molecular Sciences 22.11 (2021): 5807. https://www.mdpi.com/1422-0067/22/11/5807

Challenge #3