We will apply our expertise in cheminformatics, structure-based drug design (SBDD), medicinal chemistry, and machine learning (ML) to generate hits for LRRK2. Using our in-house drug discovery and cheminformatics platform (published in the scientific literature; proprietary code), we will extract a suitable subset of compounds from the Enamine REAL database by applying filters that follow medicinal chemistry standards and the CACHE white paper guidelines. To cover structurally diverse compounds, we will cluster this subset using ECFP4 fingerprints. Between 50K and 100K compounds can then be docked to the 6DLO structure using our state-of-the-art docking program, which accounts for protein flexibility, displaceable water molecules, and ligand fit inside the protein's active site. Once docking is complete, we envision several routes for picking compounds for visualization, as we have been exploring ways to combine physics-based and ML-based approaches to SBDD. The CACHE challenge will enable us to test multiple approaches and hypotheses in parallel. First, we plan to test an updated scoring function that is built by combining physics-based and ML terms using previously compiled datasets. Second, we will implement an ML algorithm that predicts the docking scores of molecules (an approach proposed in recent literature). In this second approach, we would consider two to three orders of magnitude more molecules and then pass only the high-ranking compounds to our comparatively resource-intensive docking algorithm. Third, we will select some molecules with our ML-free, physics-based SBDD approach to contrast with our ML-based selections. Ultimately, we aim to establish the pros and cons of incorporating ML into physics-based SBDD approaches. For each approach, the essential interactions with LRRK2 and the overall fit inside the central cavity will be assessed. In total, 100 top-ranking compounds from these approaches will be selected for testing, with the aim of distributing them evenly across methods.
"Computational negative controls" may also be selected to support our hypotheses. In line with the SGC/CACHE principles, we will document our research progress and make it public for all to follow and reference; we are taking a research-centered approach to this opportunity. We hope that sharing our findings will help guide future efforts in SBDD. For the hit SAR stage, the workflow will change as follows: we will search the filtered set for analogues of the hits using the 2D analogue search module available in our platform, and then repeat the steps outlined above with the new, focused library.
GROMACS will be used if molecular dynamics (MD) simulations are needed.
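
As an illustration only, the diversity clustering and 2D analogue search steps described above could be sketched as follows. This is not the proprietary platform code. It assumes each compound's ECFP4 fingerprint is already available as a set of on-bit indices (in practice these would come from a cheminformatics toolkit), and it swaps in a simple leader clustering as a stand-in for whatever clustering method is actually used. The function names, thresholds, and compound identifiers are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is the set of "on" bit indices of a hashed
# circular fingerprint such as ECFP4, computed elsewhere.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def leader_cluster(
    fps: Dict[str, Fingerprint], threshold: float = 0.6
) -> Dict[str, List[str]]:
    """Assign each compound to the first cluster leader it resembles.

    A compound whose similarity to every existing leader is below
    `threshold` becomes the leader of a new cluster. The default
    threshold is an arbitrary illustrative value.
    """
    clusters: Dict[str, List[str]] = {}
    for name, fp in fps.items():
        for leader in clusters:
            if tanimoto(fp, fps[leader]) >= threshold:
                clusters[leader].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


def find_analogues(
    hit_fp: Fingerprint,
    library: Dict[str, Fingerprint],
    min_similarity: float = 0.5,
    max_results: int = 50,
) -> List[Tuple[str, float]]:
    """Return library compounds similar to a hit, most similar first."""
    scored = [(name, tanimoto(hit_fp, fp)) for name, fp in library.items()]
    kept = [(name, sim) for name, sim in scored if sim >= min_similarity]
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:max_results]


# Toy fingerprints (hypothetical bit indices, not real ECFP4 output).
toy_fps = {
    "cpd_A": frozenset({1, 2, 3, 4}),
    "cpd_B": frozenset({1, 2, 3, 5}),
    "cpd_C": frozenset({10, 11, 12}),
    "cpd_D": frozenset({10, 11, 12, 13}),
    "cpd_E": frozenset({1, 2, 3, 4, 6}),
}

# Diversity step: keep one representative (the leader) per cluster.
clusters = leader_cluster(toy_fps, threshold=0.5)
representatives = list(clusters)

# Hit SAR step: treat cpd_A as a hit and look for its analogues.
others = {name: fp for name, fp in toy_fps.items() if name != "cpd_A"}
analogues = find_analogues(toy_fps["cpd_A"], others, min_similarity=0.5)
```

Leader clustering depends on input order and compares every compound against every leader, so for a library of this size a more scalable or order-independent method may be preferable. The sketch is only meant to make the two similarity-based steps concrete.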