Academic Article

PDBe CCDUtils: an RDKit-based toolkit for handling and analysing small molecules in the Protein Data Bank
Document Type
Original Paper
Source
Journal of Cheminformatics. 15(1)
Subject
PDB
RDKit
Ligand
Protein structure
Python
CCD
Covalently Linked Components
CLC
PRD
BIRD
PDBx/mmCIF
Language
English
ISSN
1758-2946
Abstract
While the Protein Data Bank (PDB) contains a wealth of structural information on ligands bound to macromolecules, analysing these ligands can be challenging because of the volume and diversity of the data. Here, we present PDBe CCDUtils, a versatile toolkit for processing and analysing small molecules from the PDB in PDBx/mmCIF format. PDBe CCDUtils provides streamlined access to all the metadata for small molecules in the PDB and offers a set of convenient methods to compute various properties using RDKit, such as 2D depictions, 3D conformers, physicochemical properties, scaffolds, common fragments, and cross-references to small molecule databases using UniChem. The toolkit also provides methods for identifying all the covalently attached chemical components in a macromolecular structure and calculating similarity among small molecules. By providing a broad range of functionality, PDBe CCDUtils caters to the needs of researchers in cheminformatics, structural biology, bioinformatics and computational chemistry.
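The abstract mentions calculating similarity among small molecules. As a rough illustration of the general idea only, the sketch below computes a Tanimoto (Jaccard) coefficient over sets of fragment labels in plain Python. It is not the PDBe CCDUtils API and does not use RDKit; the function name `tanimoto` and the fragment labels are hypothetical, chosen purely for this example.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient between two feature collections.

    Illustrative stand-in only: real toolkits typically compare
    fingerprints derived from the molecular graph, not hand-written labels.
    """
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        # Two empty feature sets: define the similarity as 0.0 by convention.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical fragment labels for two ligands (assumed for illustration).
ligand_1 = {"phenyl", "amide", "pyridine", "hydroxyl"}
ligand_2 = {"phenyl", "amide", "carboxylate"}

# 2 shared labels out of 5 distinct labels in total.
score = tanimoto(ligand_1, ligand_2)
print(f"Tanimoto similarity: {score:.2f}")
```

A set-based coefficient like this ranges from 0 (no shared features) to 1 (identical feature sets), which is why it is a common choice for ranking ligands by resemblance to a query molecule.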