Academic Paper

Feature Selection in Mobile Activity Recognition: A Comparative Study
Document Type
Conference
Source
2021 22nd IEEE International Conference on Mobile Data Management (MDM), pp. 181-186, Jun. 2021
Subject
Computing and Processing
Activity recognition
Sensor phenomena and characterization
Feature extraction
Stability analysis
Robustness
Data models
Sensors
human activity recognition
mobile sensor data
machine learning
feature selection
Language
ISSN
2375-0324
Abstract
Mobile sensor-based activity recognition is a growing research field with important application areas, such as healthcare and well-being. Data collected from multiple sensors, including the now-ubiquitous smartphone sensors, can be exploited to build predictive models capable of recognizing the actions and activities that people perform in their daily lives. This involves several processing steps, from cleansing the raw data to extracting suitable features and inducing proper classifiers. Dimensionality reduction techniques may also be important for the efficiency and the exploitability of the induced models, especially when dealing with multi-sensor data that lead to high-dimensional feature vectors. In such a scenario, feature selection algorithms can be very useful to identify and retain only the most informative and predictive features. However, little research has so far investigated which selection approaches may be most appropriate in sensor-based activity recognition tasks. To contribute in this direction, our paper compares the performance of different feature selection methods, both univariate and multivariate, on a public-domain benchmark containing smartphone sensor data. We performed a comprehensive evaluation considering the extent to which each method effectively identifies the most predictive features and the overall stability of the selection process, i.e., its robustness to changes in the input data. Our results give insight into which methods may be best suited to this domain, showing that it is possible to significantly reduce data dimensionality without compromising activity recognition performance.
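The notion of selection stability mentioned in the abstract can be illustrated with a small sketch. This record does not say which selection methods, benchmark, or stability measure the paper uses, so everything below is an assumption made for illustration: a Fisher-style univariate score as the filter, average pairwise Jaccard similarity across random subsamples as the stability measure, and synthetic toy data in place of real smartphone sensor features. All function names are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, pvariance


def fisher_score(column, labels):
    """Univariate score (assumed here): between-class over within-class variance."""
    overall = mean(column)
    between = within = 0.0
    for cls in set(labels):
        values = [v for v, y in zip(column, labels) if y == cls]
        between += len(values) * (mean(values) - overall) ** 2
        within += len(values) * pvariance(values)
    return between / within if within > 0 else 0.0


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features as a frozenset."""
    n_features = len(X[0])
    scores = [fisher_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return frozenset(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity between two non-empty feature subsets."""
    return len(a & b) / len(a | b)


def selection_stability(X, y, k, n_rounds=10, fraction=0.8, seed=0):
    """Average pairwise Jaccard similarity of subsets selected on random subsamples."""
    rng = random.Random(seed)
    n = len(X)
    subsets = []
    for _ in range(n_rounds):
        idx = rng.sample(range(n), int(fraction * n))
        subsets.append(select_top_k([X[i] for i in idx], [y[i] for i in idx], k))
    return mean(jaccard(a, b) for a, b in combinations(subsets, 2))


def make_toy_data(n=200, n_features=10, seed=1):
    """Synthetic two-class data: only features 0 and 1 depend on the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 3.0 * label
        row[1] -= 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("selected:", sorted(select_top_k(X, y, k=2)))
    print("stability:", selection_stability(X, y, k=2))
```

A stability value near 1 means the filter picks nearly the same features regardless of which samples it sees; values near 0 indicate that the selected subset depends heavily on the particular sample. A multivariate method could be compared under the same protocol by swapping out `select_top_k`.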