Academic article

Modeling the expansion of virtual screening libraries.
Document Type
article
Source
Nature Chemical Biology. 19(6)
Subject
Libraries, Digital
Molecular Docking Simulation
Language
Abstract
Recently, tangible virtual libraries have made billions of molecules readily available. Prioritizing these molecules for synthesis and testing demands computational approaches, such as docking. The success of these approaches may depend on library diversity, on the libraries' similarity to bio-like molecules, and on how receptor fit and artifacts change with library size. We compared a library of 3 million in-stock molecules with billion-plus tangible libraries. The bias toward bio-like molecules is 19,000-fold lower in the tangible library than in the in-stock library. Similarly, thousands of high-ranking molecules, including experimental actives, from five ultra-large-library docking campaigns are dissimilar to bio-like molecules. Meanwhile, better-fitting molecules are found as the library grows, with the score improving log-linearly with library size. Finally, as library size increases, so too does the number of rare molecules that rank artifactually well. Although the nature of these artifacts changes from target to target, the expectation of their occurrence does not, and simple strategies can minimize their impact.
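The log-linear relationship between docking score and library size mentioned in the abstract can be illustrated with a small sketch. This is not the authors' analysis: the function names `fit_log_linear` and `predict_score`, the functional form score = a + b * log10(N), the convention that more negative scores are better, and all numbers below are assumptions made up purely for illustration.

```python
import math


def fit_log_linear(sizes, scores):
    """Ordinary least-squares fit of score = a + b * log10(size).

    Hypothetical helper for illustration; not taken from the paper.
    """
    xs = [math.log10(n) for n in sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(scores) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx            # change in score per tenfold growth in library size
    a = mean_y - b * mean_x  # intercept at log10(size) = 0
    return a, b


def predict_score(a, b, size):
    """Extrapolate the fitted line to another library size."""
    return a + b * math.log10(size)


# Synthetic, exactly log-linear data (NOT results from the paper).
# Assumed convention: a more negative docking score means a better fit.
sizes = [1e5, 1e6, 1e7, 1e8, 1e9]
scores = [-40.0 - 5.0 * (math.log10(n) - 5.0) for n in sizes]

a, b = fit_log_linear(sizes, scores)
print(f"intercept a = {a:.2f}, slope b = {b:.2f} per decade")
print(f"extrapolated best score at 1e10 molecules: {predict_score(a, b, 1e10):.2f}")
```

Under this assumed form, each tenfold increase in library size shifts the best score by a constant amount `b`, which is what "improving log-linearly" would mean in practice.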