Academic paper

Sentiment analysis on Chinese movie review with distributed keyword vector representation
Document Type
Conference
Source
2016 Conference on Technologies and Applications of Artificial Intelligence (TAAI), pp. 84-89, Nov 2016
Subject
Computing and Processing
Robotics and Control Systems
Motion pictures
Support vector machines
Training
Sentiment analysis
Testing
Semantics
Urban areas
sentiment analysis
machine learning
TF-IDF
LLR
word embedding
Language
ISSN
2376-6824
Abstract
In the area of natural language processing, applying machine learning techniques to customer or movie reviews for sentiment analysis has been frequently attempted. While methods such as support vector machines (SVM) were much favored in the 2000s, there has recently been a steadily rising share of implementations based on vector representations and artificial neural networks. In this article we present an approach that uses word embedding to conduct sentiment analysis on movie reviews from a renowned bulletin board system forum in Taiwan. After performing log-likelihood ratio (LLR) on the corpus and selecting the top 10000 most related keywords as representative vectors for different sentiments, we use these vectors as the sentiment classifier for the testing set. We achieved results that are not only comparable to traditional methods like Naïve Bayes and SVM, but also outperform Latent Dirichlet Allocation, TF-IDF and its variant. Our method also tops the original LLR by a substantial margin.
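
The keyword-selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact LLR formulation, so this example assumes Dunning's log-likelihood ratio (G²) computed on document-level word presence per sentiment class. The function names `llr` and `top_keywords` are hypothetical, and the sketch covers keyword ranking only, not the word embedding or classification stages.

```python
import math
from collections import Counter


def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for a 2x2 contingency table.

    Assumed layout (not taken from the paper):
      k11: target-class docs containing the word
      k12: target-class docs without the word
      k21: other docs containing the word
      k22: other docs without the word
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    g2 = 0.0
    for i, row in enumerate(((k11, k12), (k21, k22))):
        for j, observed in enumerate(row):
            if observed > 0:  # treat 0 * log(0) as 0
                expected = rows[i] * cols[j] / total
                g2 += observed * math.log(observed / expected)
    return 2.0 * g2


def top_keywords(docs, labels, target, n):
    """Return up to n words most associated with the `target` label.

    docs is a list of token lists; labels is a parallel list of class labels.
    """
    in_class, out_class = Counter(), Counter()
    n_in = n_out = 0
    for tokens, label in zip(docs, labels):
        if label == target:
            n_in += 1
            in_class.update(set(tokens))  # count each word once per document
        else:
            n_out += 1
            out_class.update(set(tokens))
    scores = {}
    for word in set(in_class) | set(out_class):
        a, b = in_class[word], out_class[word]
        # Keep only words over-represented in the target class.
        if a * n_out > b * n_in:
            scores[word] = llr(a, n_in - a, b, n_out - b)
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy usage with pre-tokenized reviews (illustrative data, not from the paper).
docs = [["good", "movie"], ["good", "plot"], ["bad", "movie"], ["bad", "plot"]]
labels = ["pos", "pos", "neg", "neg"]
print(top_keywords(docs, labels, "pos", 1))
```

With a real corpus, `n` would be set to 10000 to match the keyword count stated in the abstract, and Chinese text would first need word segmentation to produce the token lists.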