학술논문

A sentiment analysis approach to improve authorship identification.
Document Type
Article
Source
Expert Systems. Aug2021, Vol. 38 Issue 5, p1-12. 12p.
Subject
*SENTIMENT analysis
*SOCIAL media
*AUTHORSHIP
*MACHINE learning
*FAKE news
Language
ISSN
0266-4720
Abstract
Writing style is considered the manner in which an author expresses his thoughts, influenced by language characteristics, period, school, or nation. Often, this writing style can identify the author. One of the most famous examples comes from 1914 in Portuguese literature. With Fernando Pessoa and his heteronyms Alberto Caeiro, Álvaro de Campos, and Ricardo Reis, who had completely different writing styles, led people to believe that they were different individuals. Currently, the discussion of authorship identification is more relevant because of the considerable amount of widespread fake news in social media, in which it is hard to identify who authored a text and even a simple quote can impact the public image of an author, especially if these texts or quotes are from politicians. This paper presents a process to analyse the emotion contained in social media messages such as Facebook to identify the author's emotional profile and use it to improve the ability to predict the author of the message. Using preprocessing techniques, lexicon‐based approaches, and machine learning, we achieved an authorship identification improvement of approximately 5% in the whole dataset and more than 50% in specific authors when considering the emotional profile on the writing style, thus increasing the ability to identify the author of a text by considering only the author's emotional profile, previously detected from prior texts. [ABSTRACT FROM AUTHOR]