Academic Article

How Many Data Does Machine Learning in Human–Computer Interaction Need?: Re-Estimating the Dataset Size for Convolutional Neural Network-Based Models of Visual Perception

Document Type

Periodical

Author

Bakaev, M.; Heil, S.; Khvorostov, V.; Gaedke, M.

Source

IT Professional (IT Prof.), 25(2):23-29, Apr 2023

Subject

Computing and Processing
Engineering Profession
Components, Circuits, Devices and Systems
Power, Energy and Industry Applications
Training
Training data
Estimation
Machine learning
Mean square error methods
Data models
Planning

Language

ISSN

1520-9202
1941-045X

Abstract

Artificial intelligence (AI)-based user-interface (UI) design and evaluation are currently constrained by the scarcity of human-generated training data. Correspondingly, choosing an appropriate neural network (NN) architecture and carefully planning the sample size are essential for building accurate machine learning models. Previously, we estimated that for a convolutional NN to produce better mean square errors (MSEs) than feature-based models, the required training dataset size should be approximately 3000. Our current validation study with roughly 4000 web UIs and 233 subjects suggests that the estimate should be closer to 17,000. We propose corrected regression models suggesting that the dataset size effect is better described using a logarithmic function. We also report significant differences in MSEs between the employed perception dimensions, with Aesthetics models having an MSE 21.5% worse than Complexity and 12.1% worse than Orderliness.
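
The abstract's idea of describing the dataset size effect with a logarithmic function can be illustrated with a minimal sketch. It assumes a model of the form MSE = a + b * ln(N), which is only one plausible reading of "logarithmic function"; the record does not give the authors' actual regression specification. The function names, the baseline value, and all numbers below are hypothetical and are not data from the paper.

```python
import math

def fit_log_model(sizes, mses):
    """Ordinary least-squares fit of mse = a + b * ln(size)."""
    xs = [math.log(n) for n in sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(mses) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, mses))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def break_even_size(a, b, baseline_mse):
    """Dataset size at which the fitted curve reaches baseline_mse."""
    return math.exp((baseline_mse - a) / b)

# Synthetic points generated from mse = 2.0 - 0.1 * ln(size).
# These are made-up values for illustration, NOT results from the paper.
sizes = [500, 1000, 2000, 4000]
mses = [2.0 - 0.1 * math.log(n) for n in sizes]

a, b = fit_log_model(sizes, mses)
# Hypothetical MSE of a feature-based baseline that the CNN should beat.
n_star = break_even_size(a, b, baseline_mse=1.1)
print(f"a={a:.3f}, b={b:.3f}, break-even size ~ {n_star:.0f}")
```

Because the fitted curve flattens as N grows, a small change in the baseline MSE shifts the break-even size by a large factor, which is one way an earlier estimate could turn out to be several times too low.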
