Academic Paper

Estimation of Optimal Number of Topics Detection and their Assessment Using Hybrid Topic Models for Visualization Enhancement
Document Type
Conference
Source
2023 5th International Conference on Inventive Research in Computing Applications (ICIRCA), pp. 1137-1145, Aug. 2023
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Robotics and Control Systems
Measurement
Visualization
Computational modeling
Estimation
Coherence
Reliability
Optimal number of topics
Topic modelling
Hybrid topic models
Twitter datasets
Cluster validity indices
Language
Abstract
For more than a decade, topic models have been widely used to cluster documents, but selecting the right number of topics remains a problem. The fundamental issue is the absence of a reliable indicator of the quality of the topics obtained during topic model creation. Previous work estimated the number of topics using non-parametric methods together with the coherence and perplexity quality measures. To complement clustering and select the most relevant topics, this study employs a parametric technique with Visual Assessment of Tendency (VAT). The optimal number of topics is detected on different types of extracted datasets using hybrid topic models. The work addresses key challenges, such as selecting the optimal topics to improve clustering accuracy and visualization and choosing appropriate metrics for the analysis, and records the results on the different types of datasets. Results are presented as tables, graphs, and VAT images, together with a comparative study and assessment.
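For readers unfamiliar with VAT, the following is a minimal sketch of the reordering step as it is commonly described in the clustering literature. It is not the implementation used in the paper. The function name `vat_order`, the toy one-dimensional `points`, and the absolute-difference dissimilarity are assumptions chosen purely for illustration. The idea is to reorder a pairwise dissimilarity matrix so that similar items sit next to each other; when the reordered matrix is shown as an image, clusters tend to appear as dark blocks along the diagonal, and counting those blocks gives a visual hint about the number of clusters (here, topics).

```python
import numpy as np


def vat_order(dissimilarity):
    """Return (order, reordered_matrix) for a square dissimilarity matrix.

    Hypothetical helper: a Prim-style ordering that starts at one end of
    the largest dissimilarity and repeatedly appends the unvisited item
    closest to any already-visited item.
    """
    R = np.asarray(dissimilarity, dtype=float)
    n = R.shape[0]

    # Start from the row index of the largest pairwise dissimilarity.
    first = int(np.unravel_index(np.argmax(R), R.shape)[0])
    order = [first]
    remaining = [i for i in range(n) if i != first]

    while remaining:
        # Distances from every visited item (rows) to every unvisited item (columns).
        sub = R[np.ix_(order, remaining)]
        # Pick the unvisited item with the smallest distance to the visited set.
        nearest = remaining[int(np.argmin(sub.min(axis=0)))]
        order.append(nearest)
        remaining.remove(nearest)

    return order, R[np.ix_(order, order)]


# Toy data (assumed for illustration): two groups, interleaved on purpose.
points = np.array([0.0, 10.0, 0.1, 10.1, 0.2, 10.2])
dissimilarity = np.abs(points[:, None] - points[None, :])

order, reordered = vat_order(dissimilarity)
print(order)
print(np.round(reordered, 1))
```

In the reordered matrix, the items of each group become contiguous, so small dissimilarities gather in square blocks on the diagonal. With document data, the dissimilarity matrix would instead be built from document or topic vectors, for example with cosine distance, which is an additional assumption beyond what the abstract states.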