Academic Journal

Driver Distraction Classification Using Deep Convolutional Autoencoder and Ensemble Learning
Document Type
Periodical
Source
IEEE Access, vol. 11, pp. 71435-71448, 2023
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Vehicles
Feature extraction
Deep learning
Vehicle dynamics
Computational modeling
Behavioral sciences
Roads
Convolutional neural networks
driver distraction
dynamic ensemble classification
CNNs
autoencoders
Language
English
ISSN
2169-3536
Abstract
The study of real-time driver distraction classification provides new insights into the behavioral and cognitive causes of distraction. Among various approaches, deep learning models show the best performance and can be used in a real-time classification system; however, individual deep learning models often generalize poorly. Ensemble learning, by contrast, combines multiple classification models to improve generalization, so a deep ensemble learning model is advantageous for improving the overall performance of driver distraction classification. Ensemble approaches remain relatively scarce in this research field. This paper proposes an ensemble framework in which a novel dynamic ensemble learning method classifies driver distraction using autoencoders and a set of popular convolutional neural network (CNN) architectures. The framework is also evaluated with two other ensemble techniques, namely the average-weighted ensemble and the grid-search ensemble. Our ensemble framework uses VGG-19, ResNet-50, and DenseNet-121 CNN models with weights pre-trained on the ImageNet dataset. Hyperparameter tuning is performed on the classification heads of the baseline models to obtain optimal performance. We used two open-source datasets, the State Farm Driver Distraction Dataset (SF3D) and the Multimodal Multiview and Multispectral Driver Action Dataset (3MDAD), with a combined size of over 80,000 images covering 10 categories of driver distraction, which improves the reliability of the ensemble model across different situations and lighting conditions. The models were fine-tuned using transfer learning. Experimental results show that the accuracies of the average-weighted ensemble, grid-search ensemble, and dynamic ensemble are 88.91%, 89.04%, and 89.13%, respectively, all higher than those of the individual baseline models.
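The abstract describes building baselines by transfer learning from ImageNet-pretrained backbones with tuned classification heads. Below is a minimal sketch of how such baselines might be assembled, assuming a TensorFlow/Keras implementation; the paper does not specify its framework, and the head sizes and 224x224 input resolution are illustrative assumptions, not the authors' settings:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 10            # 10 driver-distraction categories, per the abstract
INPUT_SHAPE = (224, 224, 3)  # assumed input resolution

def build_baseline(backbone_fn, name):
    """Attach a small classification head to an ImageNet-pretrained backbone."""
    base = backbone_fn(weights="imagenet", include_top=False,
                       input_shape=INPUT_SHAPE)
    base.trainable = False  # freeze the backbone for the initial transfer-learning stage
    x = layers.GlobalAveragePooling2D()(base.output)
    x = layers.Dense(256, activation="relu")(x)  # head width is a placeholder for tuning
    x = layers.Dropout(0.5)(x)
    out = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return Model(base.input, out, name=name)

baselines = [
    build_baseline(tf.keras.applications.VGG19, "vgg19"),
    build_baseline(tf.keras.applications.ResNet50, "resnet50"),
    build_baseline(tf.keras.applications.DenseNet121, "densenet121"),
]
```

Fine-tuning would then proceed by unfreezing some backbone layers and training at a lower learning rate, as is standard in transfer learning.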
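Of the three ensemble strategies named in the abstract, the average-weighted and grid-search ensembles are standard techniques. Below is a minimal sketch of both, assuming each model's softmax outputs over a validation set are available as NumPy arrays; the function names and weight grid are hypothetical, and the paper's novel dynamic ensemble is not reproduced here:

```python
import itertools
import numpy as np

def weighted_ensemble(probs, weights):
    """Combine per-model softmax outputs with one weight per model.
    probs: (n_models, n_samples, n_classes); weights: (n_models,)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                        # normalize so weights sum to 1
    return np.tensordot(w, probs, axes=1)  # -> (n_samples, n_classes)

def grid_search_weights(probs, y_true, grid=np.linspace(0.0, 1.0, 11)):
    """Exhaustively search weight combinations, keeping the one with the
    highest validation accuracy."""
    best_acc, best_w = -1.0, None
    for w in itertools.product(grid, repeat=probs.shape[0]):
        if sum(w) == 0:
            continue  # skip the degenerate all-zero combination
        acc = (weighted_ensemble(probs, w).argmax(axis=1) == y_true).mean()
        if acc > best_acc:
            best_acc, best_w = acc, w
    return best_w, best_acc
```

With uniform weights this reduces to the plain average ensemble; the grid search simply selects the weighting that performs best on held-out data before it is applied to the test set.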