Academic Article

Semantic Clustering and Transfer Learning in Social Media Texts Authorship Attribution

Document Type

article

Author

Anastasia Fedotova; Anna Kurtukova; Aleksandr Romanov; Alexander Shelupanov

Source

IEEE Access, Vol 12, Pp 39783-39803 (2024)

Subject

Authorship attribution
machine learning
natural language processing
semantic clustering
transfer learning
Electrical engineering. Electronics. Nuclear engineering
TK1-9971

Language

English

ISSN

2169-3536

Abstract

This paper is the fourth part of a research series that focuses on determining the authorship of Russian-language texts by analyzing short social media comments, including those from mass media and communities associated with destructive content. Semantic text clustering was used to analyze content, and a transfer learning technique based on a pre-trained model was employed to identify sensitive topics. Authorship attribution is implemented both as a classical classification task with a closed set of authors and as a more challenging open-set task. In the latter case, multiple experiments were conducted, incorporating the identification of destructive content with known authors and artificially generated texts. For open-set attribution, a method combining One-Class SVM and fastText was proposed. Results demonstrate high accuracy (92% or higher) for cases with 2 and 5 authors, regardless of comment length and the additional task of identifying authors of destructive text. Mixed-data experiments involving 10 or more authors yielded accuracy (84% or higher) comparable to or higher than that of previous studies.

Online Access

Open Access (DOAJ); Open Access (EBSCO); Web of Science; JCR journal information; Scopus; Find it@PNU
