Foldit is a crowd-sourced molecular biology game. The CACHE challenge will run as a series of regular puzzles on the Foldit platform and, if prior drug design puzzle experience is any indication, will see participation levels comparable to those of other Foldit puzzles. For this challenge, the existing Foldit player base will use the graphical small molecule design tools to manually add atoms, bonds and fragments to a starting ligand within the binding pocket (derived from the crystal structures with starting fragments) to optimize the designed ligand for binding into the protein pocket.
A library search tool can then be used by the players to find make-on-demand compounds which closely match their designed ligand. This library search uses the “Small World” search API of the ZINC servers to find the closest molecules among the more than 20 billion compounds in the full Enamine REAL space. The resulting compounds can then be filtered locally to ensure they match the allowed property ranges (green or amber) in the provided traffic light system. Players can then use cycles of manual editing and library search to navigate chemical space and find accessible library compounds which bind well to the protein.
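The local filtering step described above can be sketched as follows. This is a minimal illustration only: the property names and numeric ranges in TRAFFIC_LIGHTS are hypothetical placeholders, not the actual CACHE traffic light thresholds, and the hits are assumed to arrive with their properties already computed.

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]

# Hypothetical ranges: property -> (green range, amber range).
# Any value outside the amber range is treated as red.
# These numbers are placeholders, not the real CACHE thresholds.
TRAFFIC_LIGHTS: Dict[str, Tuple[Range, Range]] = {
    "mol_weight": ((200.0, 500.0), (150.0, 600.0)),
    "logp": ((0.0, 4.0), (-1.0, 5.0)),
}


def classify(prop: str, value: float) -> str:
    """Return 'green', 'amber' or 'red' for a single property value."""
    green, amber = TRAFFIC_LIGHTS[prop]
    if green[0] <= value <= green[1]:
        return "green"
    if amber[0] <= value <= amber[1]:
        return "amber"
    return "red"


def is_allowed(props: Dict[str, float]) -> bool:
    """True if every tracked property is green or amber."""
    return all(classify(name, props[name]) != "red" for name in TRAFFIC_LIGHTS)


def filter_hits(hits: List[dict]) -> List[dict]:
    """Keep only search hits with no property in the red range."""
    return [hit for hit in hits if is_allowed(hit["props"])]


if __name__ == "__main__":
    hits = [
        {"id": "cmpd-1", "props": {"mol_weight": 350.0, "logp": 2.5}},
        {"id": "cmpd-2", "props": {"mol_weight": 650.0, "logp": 3.0}},
    ]
    print([hit["id"] for hit in filter_hits(hits)])
```

Keeping the ranges in a single table means a changed threshold only needs to be updated in one place.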
Bonuses for additional objectives, such as the specified objectives of the CACHE traffic light system (green range), will be encoded into the puzzle setup and, together with the predicted binding energy, will contribute to the score which the Foldit players will attempt to optimize. A strong bonus for compounds obtained from the library has been shown to greatly increase the number of library compounds which are submitted. Several rounds of design will be performed, slightly varying the system setup (e.g. different starting structures, different starting locations, different objective weightings) to get a variety of compounds.
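One way such a score could be composed is sketched below. The function name, the weights and the sign convention (more negative binding energy is better, higher score is better) are assumptions chosen for illustration; they do not describe the actual Foldit puzzle configuration.

```python
def puzzle_score(
    binding_energy: float,
    n_green: int,
    in_library: bool,
    energy_weight: float = -10.0,   # assumed: converts energy to points
    green_bonus: float = 50.0,      # assumed: points per green-range property
    library_bonus: float = 500.0,   # assumed: strong bonus for library compounds
) -> float:
    """Combine predicted binding energy with objective bonuses."""
    score = energy_weight * binding_energy
    score += green_bonus * n_green
    if in_library:
        score += library_bonus
    return score


if __name__ == "__main__":
    # Same design, with and without the library bonus.
    print(puzzle_score(-10.0, n_green=2, in_library=True))
    print(puzzle_score(-10.0, n_green=2, in_library=False))
```

Varying the default weights between rounds would be one simple way to express the "different objective weightings" mentioned above.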
The set of compounds designed by players will then be collected, and those compounds which are present in the specified library will be extracted. It is also possible to take promising non-library compounds from the submitted results and search for close library compounds. Extracted library compounds will be evaluated with RDKit to filter out compounds falling in the red ranges of the traffic light metrics. Acceptable library compounds will then be redocked into the pocket by Foldit scientists using RosettaLigand to confirm their binding mode. The list of molecules will then be prioritized by binding mode recapitulation, docking score, conformational feasibility, and molecular properties. Orthogonal absolute binding energy prediction (e.g. BCL-AffinityNet/BCL-DockANNScore) will also be performed to assist in compound ranking.
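The text does not specify how the four prioritization criteria are combined. As one possible sketch, the example below uses a simple rank-sum: each compound is ranked separately on each criterion and the ranks are added. The Candidate fields, their units and the rank-sum scheme itself are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Candidate:
    name: str
    pose_rmsd: float      # redocked vs. designed pose; lower is better
    docking_score: float  # lower (more negative) is better
    strain: float         # conformational strain estimate; lower is better
    n_green: int          # properties in the green range; higher is better


def prioritize(cands: List[Candidate]) -> List[Candidate]:
    """Order candidates by the sum of their per-criterion ranks."""
    # Each key is arranged so that a smaller value means a better compound.
    criteria: List[Callable[[Candidate], float]] = [
        lambda c: c.pose_rmsd,
        lambda c: c.docking_score,
        lambda c: c.strain,
        lambda c: -c.n_green,
    ]
    total: Dict[str, int] = {c.name: 0 for c in cands}
    for key in criteria:
        # Ties are broken by input order, since sorted() is stable.
        for rank, cand in enumerate(sorted(cands, key=key)):
            total[cand.name] += rank
    return sorted(cands, key=lambda c: total[c.name])


if __name__ == "__main__":
    pool = [
        Candidate("B", pose_rmsd=2.0, docking_score=-8.0, strain=3.0, n_green=2),
        Candidate("A", pose_rmsd=0.5, docking_score=-12.0, strain=1.0, n_green=4),
        Candidate("C", pose_rmsd=1.0, docking_score=-10.0, strain=2.0, n_green=3),
    ]
    print([c.name for c in prioritize(pool)])
```

A rank-sum avoids having to put quantities with different units on a common scale, at the cost of ignoring how large the differences between compounds are.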
While we will prioritize commercially available compounds, the design methodology allows for the generation of off-library de novo compounds. If such compounds appear to be clearly superior to on-library compounds, we do not rule out selecting them.