Academic Paper

Machine Learning Approaches Identify Chemical Features for Stage-Specific Antimalarial Compounds.
Document Type
Academic Journal
Author
van Heerden A; Department of Biochemistry, Genetics and Microbiology, Institute for Sustainable Malaria Control, University of Pretoria, Private Bag X20, Hatfield 0028, South Africa.
Turon G; Ersilia Open Source Initiative, 28 Belgrave Road, Cambridge CB1 3DE, U.K.
Duran-Frigola M; Ersilia Open Source Initiative, 28 Belgrave Road, Cambridge CB1 3DE, U.K.
Pillay N; Department of Computer Science, University of Pretoria, Private Bag X20, Hatfield 0028, South Africa.
Birkholtz LM; Department of Biochemistry, Genetics and Microbiology, Institute for Sustainable Malaria Control, University of Pretoria, Private Bag X20, Hatfield 0028, South Africa.
Source
Publisher: American Chemical Society; Country of Publication: United States; NLM ID: 101691658; Publication Model: eCollection; Cited Medium: Internet; ISSN: 2470-1343 (Electronic); Linking ISSN: 2470-1343; NLM ISO Abbreviation: ACS Omega; Subsets: PubMed not MEDLINE
Language
English
Abstract
Efficacy data from diverse chemical libraries, screened against the various stages of the malaria parasite Plasmodium falciparum, including asexual blood stage (ABS) parasites and transmissible gametocytes, serve as a valuable reservoir of information on the chemical space of compounds that are active or inactive against the parasite. We postulated that these data can be mined to define chemical features associated with ABS activity alone and/or those that provide additional life cycle activity profiles such as gametocytocidal activity. Additionally, this information could provide chemical features associated with inactive compounds, which could eliminate unnecessary future screening of similar chemical analogs. Therefore, we aimed to use machine learning (ML) to identify the chemical space associated with stage-specific antimalarial activity. We collected data from various chemical libraries that were screened against the asexual (126 374 compounds) and sexual (gametocyte) stages of the parasite (93 941 compounds), calculated the compounds' molecular fingerprints, and trained machine learning models to recognize stage-specific active and inactive compounds. We were able to build several models that predict compound activity against ABS and dual activity against ABS and gametocytes, with Support Vector Machines (SVM) showing superior abilities with high recall (90 and 66%) and low false-positive predictions (15 and 1%). This allowed the identification of chemical features enriched in active and inactive populations, an important outcome that could be mined for essential chemical features to streamline hit-to-lead optimization strategies of antimalarial candidates. The predictive capabilities of the models held true in diverse chemical spaces, indicating that the ML models are robust and can serve as a prioritization tool to drive and guide phenotypic screening and medicinal chemistry programs.
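The abstract reports model quality as recall and false-positive predictions. As a minimal sketch of how such figures can be computed for a binary active/inactive classifier, the Python below derives both from predicted and true labels. It assumes that "false-positive predictions" refers to the false-positive rate, FP / (FP + TN), which the abstract does not state; the function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = active, 0 = inactive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def recall(counts):
    """Fraction of truly active compounds that the model flags as active."""
    actives = counts["tp"] + counts["fn"]
    return counts["tp"] / actives if actives else 0.0


def false_positive_rate(counts):
    """Fraction of truly inactive compounds wrongly flagged as active.

    Assumption: this is the definition behind the abstract's
    "false-positive predictions"; the source does not spell it out.
    """
    inactives = counts["fp"] + counts["tn"]
    return counts["fp"] / inactives if inactives else 0.0


# Hypothetical toy labels for illustration only (not data from the study).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

toy_counts = confusion_counts(toy_true, toy_pred)
toy_recall = recall(toy_counts)
toy_fpr = false_positive_rate(toy_counts)
print(f"recall={toy_recall:.2f}, false-positive rate={toy_fpr:.2f}")
```

High recall with a low false-positive rate is the profile a screening prioritization tool needs: few true actives are missed, and few inactive compounds are sent on for unnecessary testing.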
Competing Interests: The authors declare no competing financial interest.
(© 2023 The Authors. Published by American Chemical Society.)