Computational methods

Hit Identification

Method type (check all that apply)

De novo design

Machine learning

Physics-based

Other (specify)

Crowd-sourced design

Description of your approach (min 200 and max 800 words)

Foldit is a crowd-sourced molecular biology game. The CACHE challenge will run as a series of regular puzzles on the Foldit platform and, if prior drug design puzzle experience is any indication, will see participation levels equivalent to other Foldit puzzles. For this challenge, the existing Foldit player base will use the graphical small-molecule design tools to manually add atoms, bonds, and fragments to a starting ligand in the binding pocket (derived from the crystal structures with starting fragments), optimizing the designed ligand for binding to the protein pocket.

A library search tool can then be used by the players to find make-on-demand compounds which closely match their designed ligand. This library search uses the “Small World” search API of the ZINC servers to find the closest molecules among the more than 20 billion compounds in the full Enamine REAL space. The resulting compounds can then be filtered locally to ensure they match the allowed property ranges (green or amber) in the provided traffic light system. Players can then use cycles of manual editing and library search to navigate chemical space and find accessible library compounds which bind well to the protein.
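The local filtering step can be illustrated with a minimal sketch. It assumes each search hit already carries precomputed property values; the property names, the numeric ranges, and the compound identifiers below are hypothetical placeholders and are not the actual CACHE traffic light values.

```python
from typing import Dict, Tuple

# Hypothetical ranges per property: (green_low, green_high, amber_low, amber_high).
# Placeholders for illustration only, NOT the real CACHE traffic light values.
RANGES: Dict[str, Tuple[float, float, float, float]] = {
    "mol_weight": (200.0, 500.0, 150.0, 600.0),
    "logp": (0.0, 4.0, -1.0, 5.0),
    "h_bond_donors": (0.0, 4.0, 0.0, 5.0),
}


def classify(name: str, value: float) -> str:
    """Return 'green', 'amber', or 'red' for one property value."""
    g_lo, g_hi, a_lo, a_hi = RANGES[name]
    if g_lo <= value <= g_hi:
        return "green"
    if a_lo <= value <= a_hi:
        return "amber"
    return "red"


def passes(props: dict) -> bool:
    """Keep a compound only if none of its properties is red."""
    return all(classify(name, props[name]) != "red" for name in RANGES)


# Hypothetical search hits with precomputed properties.
hits = [
    {"id": "cmpd-A", "mol_weight": 342.4, "logp": 2.1, "h_bond_donors": 2},
    {"id": "cmpd-B", "mol_weight": 655.0, "logp": 3.0, "h_bond_donors": 1},
    {"id": "cmpd-C", "mol_weight": 520.0, "logp": 4.5, "h_bond_donors": 5},
]
kept = [hit["id"] for hit in hits if passes(hit)]
print(kept)  # ['cmpd-A', 'cmpd-C']
```

Here cmpd-B is dropped because its molecular weight falls in the assumed red range, while cmpd-C is kept even though all of its properties are amber.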

Bonuses for additional objectives, such as the specified objectives of the CACHE traffic light system (green range), will be encoded into the puzzle setup and, together with the predicted binding energy, will contribute to the score which the Foldit players will attempt to optimize. A strong bonus for compounds obtained from the library has been shown to greatly increase the number of library compounds which are submitted. Several rounds of design will be performed, slightly varying the system setup (e.g. different starting structures, different starting locations, different objective weightings) to get a variety of compounds.
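One way such a score could be assembled is sketched below. This is an illustration only and not the actual Foldit scoring code: the weight names, their default values, and the sign convention (lower energy is better, higher score is better) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Weights:
    """Hypothetical puzzle weights; a new round could use different values."""
    energy_scale: float = 10.0     # points per unit of favorable (negative) energy
    objective_bonus: float = 50.0  # points per objective in the green range
    library_bonus: float = 500.0   # one-time bonus for an on-library compound


def puzzle_score(binding_energy: float, objectives_met: int,
                 in_library: bool, w: Weights = Weights()) -> float:
    """Combine predicted binding energy with objective and library bonuses."""
    score = -binding_energy * w.energy_scale
    score += objectives_met * w.objective_bonus
    if in_library:
        score += w.library_bonus
    return score


# A library compound with a slightly weaker energy can outscore a
# non-library compound when the library bonus is strong.
print(puzzle_score(-12.5, 3, True))   # 775.0
print(puzzle_score(-14.0, 3, False))  # 290.0
```

Varying the objective weighting between rounds would then amount to passing a different `Weights` instance.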

The set of compounds thus designed by players will then be collected, and those compounds which are present in the specified library will be extracted. It is also possible to take promising non-library compounds from the submitted results and search for close library analogs. Extracted library compounds will be evaluated with RDKit to filter out compounds falling in the red ranges of the traffic light metrics. Acceptable library compounds will then be redocked into the pocket by Foldit scientists using RosettaLigand to confirm their binding mode. The list of molecules will then be prioritized by binding mode recapitulation, docking score, conformational feasibility, and molecular properties. Orthogonal absolute binding energy prediction (e.g. BCL-AffinityNet/BCL-DockANNScore) will also be performed to assist in compound ranking.
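The prioritization step could be expressed as a multi-key sort, as in the following sketch. The field names, the 2.0 Å RMSD cutoff for calling a binding mode "recapitulated", and the ordering of the criteria are assumptions made for illustration; conformational feasibility is omitted for brevity.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    compound_id: str
    pose_rmsd: float      # Å between the player pose and the redocked pose
    docking_score: float  # lower is better
    amber_count: int      # number of properties in the amber range


RMSD_CUTOFF = 2.0  # assumed threshold for binding mode recapitulation


def prioritize(candidates: List[Candidate]) -> List[Candidate]:
    """Recapitulated poses first, then docking score, then fewest amber properties."""
    def key(c: Candidate):
        recapitulated = c.pose_rmsd <= RMSD_CUTOFF
        return (not recapitulated, c.docking_score, c.amber_count)
    return sorted(candidates, key=key)


# Hypothetical candidates.
pool = [
    Candidate("cmpd-A", 1.1, -9.0, 1),
    Candidate("cmpd-B", 3.5, -11.0, 0),
    Candidate("cmpd-C", 0.8, -10.2, 2),
]
ranked = [c.compound_id for c in prioritize(pool)]
print(ranked)  # ['cmpd-C', 'cmpd-A', 'cmpd-B']
```

In this example cmpd-B has the best docking score but ranks last because redocking did not reproduce the designed pose.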

While we will prioritize commercially available compounds, the design methodology allows for generation of off-library de novo compounds. If such compounds appear to be clearly superior to on-library compounds, we do not rule out selecting such de novo compounds.

What makes your approach stand out from the community? (<100 words)

We use crowd-sourced design to generate the compounds. Free design followed by library search allows citizen scientists to freely navigate chemical space, yet keep their explorations close to purchasable compounds.

Method Name

Drugit

Free software packages used

Foldit/Rosetta/RDKit/ZINC API/BCL

Relevant publications of previous uses by your group of this software/method

Foldit Protein Design: https://doi.org/10.1038/s41586-019-1274-4

RosettaLigand ligand docking: https://doi.org/10.1371/journal.pone.0240450 https://doi.org/10.1371/journal.pone.0132508

BCL-AffinityNet/BCL-DockANNScore: https://doi.org/10.1021/acs.jcim.0c01001

Virtual screening of merged selections

Method type (check all that apply)

High-throughput docking

Machine learning

Physics-based

Other (specify)

Crowd-sourced docking

Description of your approach (min 200 and max 800 words)

Foldit players will be presented with puzzles which allow them to replace the current ligand with one from the list of potential ligands. By iterating through the list of possible ligands and manually re-optimizing the binding conformation, players will be able to compete with each other to find those compounds which score best under the Rosetta energy function present in Foldit. From the player results, we will receive a variety of protein-ligand complexes. Active compounds should be the ones which score best.

As compound scoring and ranking within the Foldit game will primarily be determined by the Rosetta energy function, we will simultaneously carry out automated redocking of the ligands using the RosettaLigand docking software. Player results and automated docking results will be compared to determine the difference between the automated and crowd-sourced approaches. We hypothesize that while players may not be able to sample as broadly as automated methods, they will be able to better optimize the structure of the ligand within the pocket.
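A minimal sketch of such a comparison follows, assuming both sources report scores on the same scale where lower is better. The function name, data layout, and example values are hypothetical.

```python
from typing import Dict, List, Tuple


def compare(player_scores: Dict[str, List[float]],
            docking_scores: Dict[str, float]) -> List[Tuple[str, float, float, float]]:
    """For each compound seen by both methods, report
    (id, best player score, automated score, player minus automated)."""
    rows = []
    for cid in sorted(player_scores.keys() & docking_scores.keys()):
        best_player = min(player_scores[cid])  # lower score is better
        delta = best_player - docking_scores[cid]
        rows.append((cid, best_player, docking_scores[cid], delta))
    return rows


# Hypothetical scores: several player solutions per compound,
# one best automated docking score per compound.
players = {"cmpd-A": [-8.0, -10.5, -9.25], "cmpd-C": [-8.0, -7.5]}
automated = {"cmpd-A": -9.0, "cmpd-C": -9.5, "cmpd-D": -6.0}

rows = compare(players, automated)
improved = [cid for cid, _, _, delta in rows if delta < 0]
print(rows)      # [('cmpd-A', -10.5, -9.0, -1.5), ('cmpd-C', -8.0, -9.5, 1.5)]
print(improved)  # ['cmpd-A']
```

A negative difference marks a compound where the best player solution scored better than automated docking; compounds evaluated by only one method are skipped.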

While they may not be incorporated into Foldit in time for the current challenge, our lab has also developed neural network-based approaches for predicting binding affinity from protein-ligand complex structures (BCL-AffinityNet/BCL-DockANNScore), which we intend to eventually make accessible from within Foldit. These approaches should give superior results to the existing Rosetta energy function for prediction of absolute binding energies. We will also use these networks to predict the affinity of user-generated protein-ligand complexes, and use those results to inform our rankings.

What makes your approach stand out from the community? (<100 words)

We employ crowd-sourced optimization of protein-ligand interactions.

Method Name

Drugit

Free software packages used

Foldit/Rosetta/RDKit/BCL

Relevant publications of previous uses by your group of this software/method

Foldit Protein Design: https://doi.org/10.1038/s41586-019-1274-4

RosettaLigand ligand docking: https://doi.org/10.1371/journal.pone.0240450 https://doi.org/10.1371/journal.pone.0132508

BCL-AffinityNet/BCL-DockANNScore: https://doi.org/10.1021/acs.jcim.0c01001

Challenge #3