In the hit identification phase, we plan to deploy a hybrid strategy that combines the experience of medicinal chemists with EquiScore, a generic protein-ligand interaction prediction model based on geometric deep learning that our team developed. When designing the model, we considered prior information from several sources, including chemical, interaction, and spatial priors, and integrated them into a single deep learning framework that characterizes protein-ligand interactions in geometric space. We also examined potential problems in constructing protein-ligand prediction datasets and proposed several targeted data augmentation strategies, so that the model learns representations that generalize to new targets and ultimately screens more effectively on novel targets. In a large-scale retrospective benchmark, EquiScore's screening performance surpassed that of the traditional scoring function GLIDE SP, a series of classical machine learning scoring functions, and recently published deep learning scoring functions such as DeepDock, RTMScore, PIGNet, and TANKBind, and it showed the best generalization performance on new targets. Within our team, we also collaborated with the experimental department on a prospective experimental validation and successfully identified active small molecules for the target.
We will use the trained EquiScore model to screen for active compounds in this competition. First, we will use software to generate putative binding poses for all molecules in the database. Second, we will score these poses with EquiScore and rank them by score. Finally, we will cluster the top-scoring molecules and select candidate compounds.
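The rank-then-cluster part of this workflow can be illustrated with a minimal Python sketch. It is not the EquiScore code or API: the `Pose` record, the score values, and the integer feature sets standing in for molecular fingerprints are all hypothetical. It also assumes that a higher score means a better pose, that each molecule is represented by its best-scoring pose, and that clustering is a simple leader-style grouping on Tanimoto similarity. The clustering method is not specified above, so this choice is only one plausible option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    mol_id: str
    score: float          # placeholder model score; higher = better (assumption)
    features: frozenset   # hypothetical feature set standing in for a fingerprint


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def best_pose_per_molecule(poses):
    """Keep the best-scoring pose of each molecule, sorted best first."""
    best = {}
    for p in poses:
        if p.mol_id not in best or p.score > best[p.mol_id].score:
            best[p.mol_id] = p
    return sorted(best.values(), key=lambda p: p.score, reverse=True)


def leader_cluster(ranked, threshold):
    """Assign each molecule to the first cluster whose leader is similar enough.

    Because the input is sorted by score, each cluster leader (element 0)
    is the best-scoring member of its cluster.
    """
    clusters = []
    for p in ranked:
        for c in clusters:
            if tanimoto(p.features, c[0].features) >= threshold:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def select_candidates(poses, top_n, threshold):
    """Rank molecules, keep the top_n, cluster them, return one leader per cluster."""
    ranked = best_pose_per_molecule(poses)[:top_n]
    return [c[0].mol_id for c in leader_cluster(ranked, threshold)]


# Toy data: two poses for mol_A, one pose for each other molecule.
poses = [
    Pose("mol_A", 0.91, frozenset({1, 2, 3, 4})),
    Pose("mol_A", 0.75, frozenset({1, 2, 3, 4})),
    Pose("mol_B", 0.88, frozenset({1, 2, 3, 5})),
    Pose("mol_C", 0.80, frozenset({7, 8, 9})),
    Pose("mol_D", 0.30, frozenset({10, 11})),
]
candidates = select_candidates(poses, top_n=3, threshold=0.5)
print(candidates)  # ['mol_A', 'mol_C']
```

In this toy example, mol_B falls into the same cluster as mol_A (Tanimoto similarity 0.6), so only mol_A and mol_C are returned as structurally distinct candidates. In practice, the final choice within and across clusters would also draw on the judgment of medicinal chemists, as described above.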