Academic Article

LTACL: long-tail awareness contrastive learning for distantly supervised relation extraction
Document Type
Article
Source
Complex & Intelligent Systems, Vol 10, Iss 1, Pp 1551-1563 (2023)
Subject
Distantly supervised learning
Information extraction
Relation extraction
Contrastive learning
Electronic computers. Computer science
QA75.5-76.95
Information technology
T58.5-58.64
Language
English
ISSN
2199-4536
2198-6053
Abstract
Distantly supervised relation extraction is an automatic annotation method for large corpora that labels a bag of sentences sharing the same entity pair with a relation. Recent works achieve sound performance by adopting contrastive learning to efficiently obtain instance representations under the multi-instance learning framework. Although these methods weaken the impact of noisy labels, they ignore the long-tail distribution problem in distantly supervised datasets and fail to capture the mutual information between different parts. We are thus motivated to tackle these issues and establish a long-tail awareness contrastive learning method for efficiently utilizing long-tail data. Our model treats the major and tail parts differently by adopting hyper-augmentation strategies. Moreover, the model provides various views by constructing novel positive and negative pairs in contrastive learning to gain better representations across different parts. The experimental results on the NYT10 dataset demonstrate that our model surpasses the existing SOTA by more than 2.61% AUC score on relation extraction. On the manually evaluated datasets NYT10m and Wiki20m, our method obtains competitive results, achieving 59.42% and 79.19% AUC scores on relation extraction, respectively. Extensive discussions further confirm the effectiveness of our approach.
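The contrastive objective described in the abstract, pulling an anchor representation toward a positive view (e.g. an augmented version of the same instance) and pushing it away from negatives, can be sketched with a standard InfoNCE-style loss. This is a generic illustration, not the paper's actual implementation; the function name, embeddings, and temperature value are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style contrastive loss for one anchor embedding.

    anchor, positive: 1-D embedding vectors; negatives: 2-D array with
    one negative embedding per row. Illustrative only, not the paper's code.
    """
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    pos_sim = cos(anchor, positive) / temperature
    neg_sims = np.array([cos(anchor, n) for n in negatives]) / temperature
    logits = np.concatenate(([pos_sim], neg_sims))
    # Cross-entropy with the positive pair as the target class:
    # -log( exp(pos_sim) / sum_j exp(logit_j) )
    return -pos_sim + np.log(np.sum(np.exp(logits)))

rng = np.random.default_rng(0)
anchor = rng.normal(size=8)
positive = anchor + 0.05 * rng.normal(size=8)   # augmented view: close to anchor
negatives = rng.normal(size=(4, 8))             # unrelated instances
loss_good = info_nce_loss(anchor, positive, negatives)
loss_bad = info_nce_loss(anchor, negatives[0], negatives)
print(loss_good, loss_bad)
```

Minimizing this loss drives the anchor's similarity to its positive view above its similarity to the negatives, which is the mechanism the paper's hyper-augmentation strategies build on to give tail-part instances useful training views.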