Academic Paper
CGRdb2.0: A Python Database Management System for Molecules, Reactions, and Chemical Data.
Document Type
Academic Journal
Author
Gimadiev T; Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, 001-0021 Sapporo, Japan.; Nugmanov R; Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, 18, Kremlyovskaya str., 420008 Kazan, Russia.; Khakimova A; JSC ≪BIOCAD≫, Petrodvortsoviy District, Strelna, Svyazi st., Bld. 34, Liter A, 198515 St. Petersburg, Russia.; Fatykhova A; Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, 18, Kremlyovskaya str., 420008 Kazan, Russia.; Madzhidov T; Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, 18, Kremlyovskaya str., 420008 Kazan, Russia.; Sidorov P; Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, 001-0021 Sapporo, Japan.; Varnek A; Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, 001-0021 Sapporo, Japan.; Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4, Blaise Pascal Str., 67081 Strasbourg, France.
Source
Publisher: American Chemical Society; Country of Publication: United States; NLM ID: 101230060; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1549-960X (Electronic); Linking ISSN: 15499596; NLM ISO Abbreviation: J Chem Inf Model; Subsets: MEDLINE
Language
English
Abstract
This work introduces CGRdb2.0, an open-source database management system for molecules, reactions, and chemical data. CGRdb2.0 is a Python package that connects to a PostgreSQL database and enables native searches for molecules and reactions without complicated SQL syntax. The library provides out-of-the-box implementations of similarity and substructure searches for molecules, as well as similarity and substructure searches for reactions in two ways: based on reaction components and based on the Condensed Graph of Reaction approach, the latter significantly accelerating the searches. In benchmarking studies against the RDKit database cartridge, we demonstrate that CGRdb2.0 performs searches faster for smaller data sets, while allowing for interactive access to the retrieved data.