Recent advances in AI/ML, coupled with Biolexis's internally developed, fully automated MolecuLern workflow, are driving this shift, accelerating and improving R&D across our company's pipeline and several of our collaborators' programs. Advances in high-performance computing, the availability of proven computational algorithms, large validated data sets for ML model training, and deep neural networks have brought exceptional speed to drug discovery and development. The Biolexis MolecuLern platform reduces timelines and development costs while aiming for a greater probability of success in identifying novel leads against the CBLB TKB domain under CACHE Challenge #4, using the following approaches:
1. MolecuLern is a proprietary structure-embedded platform that trains on roughly half a million high-quality biochemical laboratory measurements (empirical IC50, Ki, and Kd values), screens the vast chemical space of the chosen virtual library, and identifies pre-hits with predicted IC50, Ki, and Kd values.
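MolecuLern's internals are proprietary, so the sketch below only illustrates the general technique named here — predicting affinities for virtual-library compounds from measured training data. It uses a hypothetical similarity-weighted nearest-neighbor model over set-based fingerprints with Tanimoto similarity; the fingerprints, pKd values, and parameter `k` are invented placeholders, not Biolexis data or the platform's actual model.

```python
# Toy illustration: predict pKd for a library compound from a small set of
# measured examples using Tanimoto-weighted k-nearest neighbors.
# Fingerprints and affinities are invented for illustration only.

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprint bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def predict_affinity(query_fp, training, k=3):
    """Similarity-weighted mean of the k most similar measured affinities."""
    scored = sorted(((tanimoto(query_fp, fp), pkd) for fp, pkd in training),
                    reverse=True)[:k]
    total = sum(s for s, _ in scored)
    if total == 0:
        return None  # no neighbor shares any bits with the query
    return sum(s * pkd for s, pkd in scored) / total

# Hypothetical training set: (fingerprint bits, measured pKd).
training = [
    (frozenset({1, 2, 3, 5}), 7.2),
    (frozenset({1, 2, 4, 8}), 6.5),
    (frozenset({2, 3, 5, 9}), 7.8),
    (frozenset({10, 11, 12}), 4.1),
]

query = frozenset({1, 2, 3, 9})
print(round(predict_affinity(query, training), 2))  # → 7.28
```

In production such a model would use real structural fingerprints and far richer learners; the point is only that measured IC50/Ki/Kd data anchor the predictions for unmeasured library members.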
2. Our high-quality wet-lab/empirical database, combined with the pre-hit predictions it yields, further assists in developing newer AI/ML predictive models against the CBLB TKB domain within our hit-finding discovery process.
3. MolecuLern further uses biochemical/mutational data specific to sensitive amino acids within defined active-site pockets, considering conformational states of hot spots ranging from crystallographic to physiological, and computes pocket energetics through local dynamics simulations in preparation for large-scale virtual screening.
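How MolecuLern scores pocket energetics is not disclosed; as a minimal, assumption-laden sketch of the idea — scoring a probe position against hot-spot residues in a pocket — the snippet below sums a 12-6 Lennard-Jones pair term between a probe point and hypothetical hot-spot centroids. Coordinates, ε, and σ are invented placeholders, and a real workflow would use full force-field terms sampled over a dynamics trajectory rather than a single static geometry.

```python
# Toy pocket-energetics score: sum a 12-6 Lennard-Jones pair term between a
# probe point and hot-spot residue centroids. All coordinates and parameters
# are hypothetical placeholders, not force-field values.
import math

def dist(a, b):
    """Euclidean distance between two 3-D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def lj(r, epsilon=0.2, sigma=3.5):
    """12-6 Lennard-Jones energy (kcal/mol) at separation r (Angstrom)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def pocket_energy(probe, hotspots):
    """Total probe/hot-spot interaction energy over all centroids."""
    return sum(lj(dist(probe, c)) for c in hotspots)

# Hypothetical hot-spot centroids (Angstrom) and a probe position.
hotspots = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.5, 0.0)]
probe = (2.0, 2.0, 1.5)
print(round(pocket_energy(probe, hotspots), 3))  # → 1.887
```

A positive total here flags a sterically strained probe placement; averaging such terms over snapshots from a local simulation is one way static crystallographic geometry can be relaxed toward a physiological conformational state.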
4. The platform applies physics-based relative binding free energy (RBFE) calculations plus MM/GBSA-predicted free energy of binding (FEB) for the hits as the penultimate step in the hit-selection process.
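The exact way the platform combines the two scores is not stated; one common pattern for this step is a consensus rank over both physics-based estimates. The sketch below illustrates that with hypothetical RBFE ΔΔG and MM/GBSA ΔG values (kcal/mol); compound names, energies, and the rank-averaging scheme are all assumptions for illustration.

```python
# Toy consensus ranking: combine RBFE and MM/GBSA scores (both in kcal/mol,
# more negative = better predicted binding) by averaging per-score ranks.
# All compound names and energies are hypothetical placeholders.

hits = {
    # name: (rbfe_ddG, mmgbsa_dG)
    "cpd_A": (-2.5, -55.1),
    "cpd_B": (-0.3, -42.0),
    "cpd_C": (-1.8, -38.7),
    "cpd_D": (+0.9, -21.4),
}

def consensus_rank(hits):
    """Rank by each score separately, then sort by mean rank (lower = better)."""
    names = list(hits)
    by_rbfe = sorted(names, key=lambda n: hits[n][0])
    by_gbsa = sorted(names, key=lambda n: hits[n][1])
    mean_rank = {n: (by_rbfe.index(n) + by_gbsa.index(n)) / 2 for n in names}
    # Python's sort is stable, so ties keep their original dictionary order.
    return sorted(names, key=lambda n: mean_rank[n])

print(consensus_rank(hits))  # → ['cpd_A', 'cpd_B', 'cpd_C', 'cpd_D']
```

Rank averaging is deliberately scale-free, so a ΔΔG in single-digit kcal/mol and an MM/GBSA ΔG in the tens can be combined without normalizing either score.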
5. A hybrid of MM/GBSA and machine learning within MolecuLern simultaneously predicts physicochemical properties, ADME/toxicity, in vivo PK, and oral bioavailability (%F), and validates the developability-candidate criteria in advance, simplifying the selection and nomination of candidate-ready compounds at the in vitro screening stage.
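Biolexis's actual developability criteria are not public; as a schematic example of validating such criteria in advance, the filter below applies hypothetical cutoffs (molecular weight, predicted logP, predicted %F, a hERG proxy) to a table of predicted properties of the kind described above. Every threshold and property value is an invented placeholder.

```python
# Toy developability filter: keep compounds whose predicted properties meet
# hypothetical cutoffs. Property values and thresholds are illustrative only.

CRITERIA = {
    "mw":      lambda v: v <= 500.0,  # molecular weight (Da)
    "logp":    lambda v: v <= 5.0,    # predicted lipophilicity
    "pct_f":   lambda v: v >= 30.0,   # predicted oral bioavailability (%F)
    "herg_um": lambda v: v >= 10.0,   # predicted hERG IC50 (uM), tox proxy
}

def passes(props):
    """True only if every predicted property satisfies its cutoff."""
    return all(check(props[name]) for name, check in CRITERIA.items())

# Hypothetical per-compound predictions from the modeling stage.
predicted = {
    "cpd_A": {"mw": 412.0, "logp": 3.1, "pct_f": 48.0, "herg_um": 22.0},
    "cpd_B": {"mw": 561.0, "logp": 4.4, "pct_f": 35.0, "herg_um": 15.0},
    "cpd_C": {"mw": 388.0, "logp": 5.6, "pct_f": 61.0, "herg_um": 40.0},
}

candidates = [name for name, props in predicted.items() if passes(props)]
print(candidates)  # → ['cpd_A'] (cpd_B fails MW; cpd_C fails logP)
```

Running such checks on predicted values before synthesis is what lets developability problems be caught at the in vitro stage rather than during lead optimization.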
6. MolecuLern significantly reduces the hit-to-lead optimization timeline thanks to its embedded empirical IC50, Ki, and Kd training data, screened against a virtual library of more than 5 billion compounds.