Academic Paper

REACTCLASS: Cross-Modal Supervision for Subword-Guided Reactant Entity Classification
Document Type
Conference
Source
2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 844-847, Dec. 2022
Subject
Bioengineering
Computing and Processing
Signal Processing and Analysis
Correlation
Chemistry
Annotations
Training data
Bioinformatics
Chemicals
Chemistry Text Mining
Cross-Modal Supervised Learning
Attention Map Representation
Language
Abstract
We propose REACTCLASS, a method that automatically maps low-level, concrete chemical entities to high-level reactant groups without human effort for training data annotation. REACTCLASS is designed around two characteristics of chemical molecules. First, each molecule can be represented in two modalities: a chemical name in text and a molecular structure as a graph. We use cross-modal supervision to create training data for chemical name classification in text automatically, via molecular structure matching on the graph. Second, there is a knowledge-aware subword correlation between the surface names of the chemical entities to be classified and the names of the reactant groups that serve as class labels. We train a classification model on the subword cross-attention map between each chemical name and the corresponding reactant group. Experiments show that REACTCLASS is highly effective, achieving state-of-the-art performance in classifying chemical names into human-defined reactant groups without requiring human-annotated training data.
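The subword correlation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the learned cross-attention is replaced here by exact matching of character trigrams, and the function names (`char_ngrams`, `correlation_map`, `label_score`, `classify`) and the example label set are hypothetical, chosen only for illustration.

```python
def char_ngrams(text, n=3):
    """Split a name into lowercase character n-grams.

    Assumption: character n-grams stand in for a learned subword tokenizer.
    """
    s = text.lower()
    return [s[i:i + n] for i in range(max(len(s) - n + 1, 1))]


def correlation_map(name, label, n=3):
    """Build a toy name-by-label map: 1.0 where two subwords are identical.

    Rows index subwords of the chemical name, columns index subwords of the
    reactant group name. A trained model would produce soft attention
    weights here instead of exact-match indicators.
    """
    rows = char_ngrams(name, n)
    cols = char_ngrams(label, n)
    return [[1.0 if r == c else 0.0 for c in cols] for r in rows]


def label_score(name, label, n=3):
    """Fraction of the name's subwords that match some subword of the label."""
    grid = correlation_map(name, label, n)
    return sum(max(row) for row in grid) / len(grid)


def classify(name, labels, n=3):
    """Pick the label whose name shares the most subwords with the chemical name."""
    return max(labels, key=lambda lab: label_score(name, lab, n))


if __name__ == "__main__":
    groups = ["alcohols", "ketones", "amines"]  # hypothetical label set
    for chemical in ["acetone", "propylamine"]:
        print(chemical, "->", classify(chemical, groups))
```

Exact matching only captures surface overlap such as "amine" in "propylamine" and "amines". The knowledge-aware correlation described in the abstract would presumably also need to link names that share no literal substring, which is where a trained attention model and the cross-modal training labels come in.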