Our team of computational chemists and machine learning experts is part of a science CRO that continuously contributes to the drug discovery community by collaborating with big pharma and incubating biotechs. In drug discovery projects, we prioritize compounds using our million-scale in-house compound database, which includes structures, bioactivities, and PhysChem data. Additionally, we predict PhysChem properties with validated machine learning models that were trained on our legacy database.
The fragment binding modes to NSP13 observed in the X-ray structures will be summarized in a merged binding site model. The adjacent, so far unexploited subpockets will be investigated further with hydration site analysis tools such as 3D-RISM and PyRod. These tools will allow us to identify hotspots where enthalpic and entropic gains can be achieved upon ligand binding. We believe that combining the fragment interaction information with the energetics and pharmacophoric characteristics of the hydration sites will yield an augmented binding site model. This model will be instrumental in finding compounds that extend from the fragments towards unexploited subpockets.
The binding site model will be used in a pharmacophore-based screen of the virtual compound space, including iterative optimization. Hits will be filtered according to the CACHE traffic light system (fraction of sp3 carbons, polar surface area, logD, solubility, number of rotatable bonds, and molecular weight). For optimal property prediction accuracy, our machine learning models trained on proprietary in-house data will be applied.
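The traffic light filtering described above can be illustrated with a minimal sketch. It assumes the six properties have already been predicted for each hit. All numeric ranges, the `Props` container, and the `max_amber` tolerance are hypothetical placeholders chosen for illustration; they are not the official CACHE thresholds and would need to be replaced with the published values.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Props:
    """Predicted properties of one hit (hypothetical container)."""
    fsp3: float           # fraction of sp3 carbons
    tpsa: float           # polar surface area in A^2
    logd: float           # logD
    solubility_um: float  # solubility in micromolar
    rot_bonds: int        # number of rotatable bonds
    mol_weight: float     # molecular weight in Da


# ASSUMPTION: placeholder ranges, NOT the official CACHE thresholds.
# Each entry is (green_low, green_high, amber_low, amber_high).
LIMITS = {
    "fsp3": (0.3, 1.0, 0.2, 1.0),
    "tpsa": (0.0, 120.0, 0.0, 140.0),
    "logd": (0.0, 3.5, -1.0, 5.0),
    "solubility_um": (50.0, inf, 10.0, inf),
    "rot_bonds": (0, 8, 0, 10),
    "mol_weight": (0.0, 450.0, 0.0, 550.0),
}


def light(value, limits):
    """Return 'green', 'amber', or 'red' for a single property value."""
    g_lo, g_hi, a_lo, a_hi = limits
    if g_lo <= value <= g_hi:
        return "green"
    if a_lo <= value <= a_hi:
        return "amber"
    return "red"


def traffic_lights(props):
    """Map each property name to its traffic light color."""
    return {name: light(getattr(props, name), lim) for name, lim in LIMITS.items()}


def passes(props, max_amber=2):
    """Keep a hit if it has no red light and at most max_amber amber lights."""
    colors = list(traffic_lights(props).values())
    return "red" not in colors and colors.count("amber") <= max_amber
```

A hit list would then be reduced with something like `[h for h in hits if passes(h)]`. Whether amber lights are tolerated at all, and how many, is a design choice that the text leaves open.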
The filtered hitlist will be docked to a prepared 3D structure model of the target NSP13 (multiple conformations, if required) for confirmation, including an assessment of shape fit to the pocket and ligand strain. Depending on the number of entries, the following work packages might be applied in an iterative fashion.
Docking hits will be triaged using two docking scores implemented in ICM: a classical score including a desolvation term, and a machine learning score trained on protein-ligand interactions (Radial and Topological Convolutional Neural Net, RTCNN).
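One simple way to combine two scores for triage is a mean-rank consensus, sketched below. This is an illustrative assumption and not necessarily the triage rule that will be used. The function names and the `top_n` cutoff are hypothetical, and the sketch assumes that lower (more negative) values are better for both scores.

```python
def ranks(scores):
    """Rank compound IDs from best (1) to worst, assuming lower score is better.

    Ties are broken by compound ID so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda cid: (scores[cid], cid))
    return {cid: position + 1 for position, cid in enumerate(ordered)}


def consensus_triage(classical, ml, top_n):
    """Return the top_n compound IDs by mean rank over both scores.

    classical, ml: dicts mapping compound ID to docking score.
    Only compounds present in both dicts are considered.
    """
    shared = classical.keys() & ml.keys()
    rank_classical = ranks({cid: classical[cid] for cid in shared})
    rank_ml = ranks({cid: ml[cid] for cid in shared})
    mean_rank = {cid: (rank_classical[cid] + rank_ml[cid]) / 2 for cid in shared}
    return sorted(shared, key=lambda cid: (mean_rank[cid], cid))[:top_n]
```

Rank averaging avoids having to put the two scores on a common scale. A stricter alternative would be to require a compound to pass a cutoff in both scores independently.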
Docking hit poses will undergo a physics-based estimation of the free energy of binding, such as Molecular Mechanics Poisson-Boltzmann Surface Area (MM-PBSA) continuum solvation calculations on the docking pose or a short molecular dynamics simulation. Protein-ligand interaction fingerprints might support preselection for visual inspection to ensure binding pose plausibility.
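For orientation, the MM-PBSA estimate is commonly written in the following general form. This is the standard textbook decomposition, not a statement of which terms this workflow will include; in particular, the entropy term is often omitted when compounds are only ranked relative to each other.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= \langle G_{\mathrm{complex}} \rangle
  - \langle G_{\mathrm{receptor}} \rangle
  - \langle G_{\mathrm{ligand}} \rangle \\
G_{X} &= E_{\mathrm{MM}} + G_{\mathrm{PB}} + G_{\mathrm{SA}} - T S
\end{align}
```

Here $E_{\mathrm{MM}}$ is the molecular mechanics energy, $G_{\mathrm{PB}}$ the polar solvation free energy from the Poisson-Boltzmann equation, $G_{\mathrm{SA}}$ the nonpolar surface area term, and $TS$ the entropic contribution. The angle brackets denote an average over simulation snapshots, which reduces to a single evaluation when only the docking pose is used.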
Optionally, the hitlist will be expanded by rescreening with a revised binding site model to enrich it with favored scaffolds. The additional hits will undergo the same workflow, including docking and subsequent steps, if required.
Finally, the hitlist will be reviewed by a senior medicinal chemist for lead-likeness and medchem attractiveness. Hit confirmation, revision, and hit expansion might be repeated until a list of up to 100 promising virtual molecules is obtained.