The proposed framework uses a lightweight convolutional neural network architecture optimized for low-level molecular features. It is designed to discover novel chemotypes while screening commercial, synthesizable libraries (e.g. ENAMINE Real) at a rate of one billion molecules per day, for both single-target and multi-target hit identification. The framework has already been successfully validated on one target, and we are currently evaluating its performance on several other targets in collaboration with different partners.
For the CACHE #3 challenge, we are preparing to screen the current ENAMINE Real Database (first week of November 2022), which contains over 5.5 billion molecules.
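
As a rough back-of-the-envelope check on what this implies, the sketch below estimates the wall-clock time for the full screen. It assumes the stated throughput of one billion molecules per day holds uniformly and takes 5.5 billion as the library size (a lower bound, since the database holds "over" that number). The function name and the `n_pipelines` parameter are hypothetical, introduced only for illustration; linear scaling across parallel pipelines is an assumption, not a measured property of the framework.

```python
def estimated_screening_days(
    library_size: float,
    molecules_per_day: float,
    n_pipelines: int = 1,
) -> float:
    """Estimate the days needed to screen a library at a fixed throughput.

    Assumes constant throughput and (hypothetically) linear scaling
    when several identical pipelines run in parallel.
    """
    if library_size < 0:
        raise ValueError("library_size must be non-negative")
    if molecules_per_day <= 0:
        raise ValueError("molecules_per_day must be positive")
    if n_pipelines < 1:
        raise ValueError("n_pipelines must be at least 1")
    return library_size / (molecules_per_day * n_pipelines)


# Figures taken from the text: >5.5 billion molecules, 1 billion per day.
days = estimated_screening_days(library_size=5.5e9, molecules_per_day=1e9)
print(f"Estimated screening time: {days:.1f} days")
```

Under these assumptions a single pipeline would need roughly five and a half days to cover the database; the real figure depends on the exact library size and on any preprocessing overhead not described here.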