A computer-implemented method for screening ligand candidates for a target protein. Screening is performed by an in-house, integrated ensemble machine learning (ML) model that predicts binding affinity with high speed and precision.
In the protein structure reconstruction strategy, the deep learning model AlphaFold is employed as the initial step to predict the three-dimensional structure of melanin-concentrating hormone receptor 1 (MCHR1) from its amino acid sequence (see https://omim.org/entry/601751). Subsequently, a hybrid approach integrates AlphaFold's predictions with traditional homology modeling, leveraging experimentally determined structures of homologous proteins to improve the accuracy of the MCHR1 model. Model quality is evaluated with validation tools such as PROCHECK and VERIFY3D to confirm that the model meets accepted geometric and energetic standards. The hybrid model is then refined iteratively through molecular dynamics simulations in Schrödinger's Desmond, allowing the structure to equilibrate and settle into a stable conformation. Molecular dynamics simulations are also used for structural analysis, providing insight into the dynamics and interactions of the protein. The final step is validation against experimental data, where available, followed by use of the refined structure for functional annotation and further investigations such as ligand-binding studies or other drug discovery work. To validate the protein structure further, the known binders will be run in a molecular simulation. This step will be carried out with the help of our academic partner, who has experience in protein and homology modeling.
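As a rough illustration of how "reaching equilibrium" might be judged in practice, the sketch below checks whether a backbone RMSD time series has plateaued by comparing the last two windows of frames. The function name, the window size, and both tolerances are illustrative assumptions, not part of the described method; the RMSD values themselves would have to be exported from the simulation analysis.

```python
from statistics import mean, pstdev
from typing import Sequence


def rmsd_has_plateaued(
    rmsd: Sequence[float],
    window: int = 50,
    mean_tol: float = 0.2,
    spread_tol: float = 0.3,
) -> bool:
    """Heuristic plateau check on an RMSD series (values assumed to be in angstroms).

    Hypothetical criterion: the means of the final two windows differ by
    less than `mean_tol`, and the standard deviation of the final window
    is below `spread_tol`.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(rmsd) < 2 * window:
        return False  # trajectory too short to compare two windows

    previous = rmsd[-2 * window:-window]
    latest = rmsd[-window:]

    drift = abs(mean(latest) - mean(previous))
    return drift < mean_tol and pstdev(latest) < spread_tol
```

A check of this kind only indicates that the RMSD has stopped drifting over the sampled frames; it does not by itself show that the conformation is correct, which is why the validation steps described above remain necessary.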
Afterwards, the input to the AI engine consists of SMILES strings for compounds in the Enamine REAL library with a molecular weight (MW) below 550 Da. The features of the binding pocket are analyzed, and ligands capable of fitting into the target pocket are identified by matching the pocket features against those of the ligand molecules. The compounds are then screened rapidly with our proprietary filters, which rely on machine learning models trained on the protein and ligands and therefore skip conformational sampling. The remaining candidates are ranked by predicted binding affinity, obtained from a novel ML-based scoring function (iScore) trained on ChEMBL data and manually curated PDB files. The initial compound screening will then filter out compounds with cLogP > 5 using the proprietary ML-trained iADMET module. We will also run our generative AI model iGen and perform similarity searches against the available ligand libraries, primarily Enamine's library, to select complementary candidates.
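The property cutoffs and ranking step described above can be sketched as follows. This is a minimal sketch under stated assumptions: the `Candidate` record and `filter_and_rank` function are hypothetical names, the MW and cLogP values are assumed to be precomputed elsewhere, and `predicted_affinity` is a placeholder for a score in which higher means stronger predicted binding. The proprietary iScore and iADMET models are not reproduced here.

```python
from dataclasses import dataclass
from typing import Iterable, List

MW_CUTOFF = 550.0    # upper bound on molecular weight, as stated in the text
CLOGP_CUTOFF = 5.0   # compounds with cLogP above this value are removed


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one library compound."""
    smiles: str
    mol_weight: float          # assumed precomputed
    clogp: float               # assumed precomputed (stand-in for an ADMET model output)
    predicted_affinity: float  # placeholder score; higher = stronger predicted binding


def filter_and_rank(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Apply the MW and cLogP cutoffs, then rank by predicted affinity."""
    kept = [
        c for c in candidates
        if c.mol_weight < MW_CUTOFF and c.clogp <= CLOGP_CUTOFF
    ]
    # Sort so that the strongest predicted binders come first.
    return sorted(kept, key=lambda c: c.predicted_affinity, reverse=True)
```

Applying the inexpensive property cutoffs before ranking keeps the number of compounds that need an affinity prediction small; whether the actual pipeline orders the steps this way is not specified in the text.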