Computational methods

Hit Identification

Method type (check all that apply)

Machine learning

Physics-based

Hybrid of the above

Protein modelling and Machine learning screening

Description of your approach (min 200 and max 800 words)

We propose a computer-implemented method for screening ligand candidates against a target protein. It uses an in-house developed, integrated ensemble machine learning (ML) model to predict binding affinity at high speed and with high precision.

For protein structure reconstruction, the deep learning model AlphaFold is used as the initial step to predict the three-dimensional structure of MCHR1 from its amino acid sequence (https://omim.org/entry/601751). A hybrid approach then combines AlphaFold's predictions with traditional homology modelling, using experimentally determined structures of homologous proteins to improve the precision of the MCHR1 model. Model quality is evaluated with validation tools such as PROCHECK and VERIFY3D to check compliance with geometric and energetic standards. The hybrid model is refined iteratively through molecular dynamics simulations in Schrödinger's Desmond, allowing the structure to equilibrate and reach a stable conformation. The same simulations are used for structural analysis, including insights into dynamics and interactions. The final step is validation against experimental data, if available, after which the refined structure is used for functional annotation and further investigations, such as ligand-binding studies or other drug discovery work. Known binders will also be run in a molecular simulation to validate the protein structure further. This step will be carried out with the help of our academic partner, who has experience in protein and homology modelling.
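As an illustration of how "reaching equilibrium" in a molecular dynamics refinement might be judged, the sketch below checks whether a backbone RMSD time series has levelled off. It is a minimal example under stated assumptions, not the team's actual procedure: the function name `has_plateaued`, the window size, and the 0.2 Å tolerance are all hypothetical choices, and the RMSD values are assumed to have been exported from the simulation beforehand.

```python
from statistics import mean


def has_plateaued(rmsd_series, window=5, tolerance=0.2):
    """Return True when the RMSD trace appears to have levelled off.

    The mean RMSD of the last `window` frames is compared with the mean of
    the `window` frames before them. A difference below `tolerance`
    (in angstroms) is treated as a plateau. Both defaults are assumptions
    chosen for illustration only.
    """
    if len(rmsd_series) < 2 * window:
        # Not enough frames to compare two full windows.
        return False
    recent = mean(rmsd_series[-window:])
    previous = mean(rmsd_series[-2 * window:-window])
    return abs(recent - previous) < tolerance
```

A trace that is still drifting upward would return False, signalling that the refinement run should be extended before the structure is used for screening.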

The input to the AI engine is then SMILES strings from the Enamine REAL library with molecular weight (MW) under 550. The features of the binding pocket are analyzed, and ligands capable of fitting into the target pocket are estimated by matching those features against the features of the ligand molecules. The compounds are filtered with our proprietary filters for rapid screening, using machine learning techniques trained on the protein and ligands so that conformational sampling can be skipped. The remaining candidates are ranked by predicted binding affinity, obtained with a novel ML-based scoring function (iScore) trained on ChEMBL data and PDB files that were curated manually. The initial screening will then remove compounds with cLogP > 5 using the proprietary ML-trained iADMET module. We will also run our generative AI model iGen and perform similarity searches against the available ligand libraries, primarily Enamine's, for complementary candidate selection.
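The property cut-offs and ranking step described above can be sketched as follows. This is a minimal illustration, not the proprietary iScore or iADMET implementation: the `Candidate` class, the function names, the compound identifiers, and all numeric values are hypothetical placeholders. Descriptors and predicted affinities are assumed to have been computed upstream, and a higher affinity score is assumed to mean stronger predicted binding.

```python
from dataclasses import dataclass

MAX_MW = 550.0    # keep compounds with molecular weight under 550
MAX_CLOGP = 5.0   # remove compounds with cLogP > 5


@dataclass(frozen=True)
class Candidate:
    compound_id: str           # placeholder identifier, not a real compound
    mol_weight: float          # precomputed molecular weight
    clogp: float               # precomputed cLogP
    predicted_affinity: float  # assumed: higher means stronger binding


def passes_filters(c: Candidate) -> bool:
    """Apply the molecular weight and cLogP cut-offs."""
    return c.mol_weight < MAX_MW and c.clogp <= MAX_CLOGP


def rank_candidates(candidates):
    """Drop compounds failing the cut-offs, then sort by predicted affinity."""
    kept = [c for c in candidates if passes_filters(c)]
    return sorted(kept, key=lambda c: c.predicted_affinity, reverse=True)


# Illustrative placeholder data only.
library = [
    Candidate("cmpd-1", 420.5, 3.2, 7.1),
    Candidate("cmpd-2", 610.0, 2.8, 8.4),  # fails the molecular weight cut-off
    Candidate("cmpd-3", 380.2, 5.6, 7.9),  # fails the cLogP cut-off
    Candidate("cmpd-4", 495.9, 4.9, 7.6),
]
ranked = rank_candidates(library)
```

Applying the inexpensive property cut-offs before sorting keeps the ranking step limited to compounds that remain eligible.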

What makes your approach stand out from the community? (<100 words)

Avoiding conformational sampling speeds up hit identification considerably and produces some of the most accurate affinity predictions to date. In the CASF-2016 and CSAR benchmarks and in case studies, our tool consistently performs best in scoring power, ranking power, and screening power. With our novel Ultra-Fast Screening (UFS) approach, we can also screen compounds several orders of magnitude faster than any current software we have come across.

The research team at the University of Gothenburg that will assist with protein modelling has many years of molecular modelling experience and is at the forefront of current drug discovery trends, lecturing on physical and molecular chemistry at the university.

Method Name

Protein modelling and iScore

Commercial software packages used

Desmond by Schrödinger

Free software packages used

fpocket, dpocket, RDKit, AlphaFold2, RoseTTAFold, I-TASSER, YASARA

Relevant publications of previous uses by your group of this software/method

The team behind the project incorporated as ANYO Labs AB in December 2022, and the method became the subject of a patent filing in February 2023. For this reason, the team has kept the methodology a trade secret, with the intention to publish in 2024. Professor Leif Eriksson has hundreds of publications in molecular modelling and molecular chemistry, which can be found below:

https://www.researchgate.net/scientific-contributions/Leif-A-Eriksson-38798085

Challenge #5