Academic Journal Article

Application of feature selection methods and machine learning algorithms for saltmarsh biomass estimation using Worldview-2 imagery.
Document Type
Article
Source
Geocarto International. Jun2021, Vol. 36 Issue 10, p1075-1099. 25p.
Subject
*BIOMASS estimation
*MULTISPECTRAL imaging
*FEATURE selection
*MACHINE learning
*RANDOM forest algorithms
*NORMALIZED difference vegetation index
*SUPPORT vector machines
Language
English
ISSN
1010-6049
Abstract
Assessing large-scale plant productivity of coastal marshes is essential to understanding the resilience of these systems to climate change. Two machine learning approaches, random forest (RF) and support vector machine (SVM) regression, were tested to estimate the biomass of a common saltmarsh species, salt couch grass (Sporobolus virginicus). Reflectance and vegetation indices derived from the 8 bands of Worldview-2 multispectral data were used in four experiments to develop the biomass model: Experiment-1, the 8 bands of the Worldview-2 image; Experiment-2, all possible band combinations of Worldview-2 for Normalized Difference Vegetation Index (NDVI) type vegetation indices; Experiment-3, a combination of bands and vegetation indices; and Experiment-4, selected variables derived from Experiment-3 using variable selection methods. The main objectives of this study are (i) to recommend an affordable, low-cost data source to predict the biomass of a common saltmarsh species, (ii) to suggest a variable selection method suitable for multispectral data, and (iii) to assess the performance of RF and SVM for the biomass prediction model. Cross-validation of parameter optimization for SVM showed that the optimized parameters of ε-SVR failed to provide a reliable prediction; hence, ν-SVR was used for the SVM model. Among the different variable selection methods, recursive feature elimination (RFE) selected the smallest number of variables (only 4), with an RMSE of 0.211 kg/m². Experiment-4 (only selected bands) provided the best results for both machine learning regression methods, RF (R² = 0.72, RMSE = 0.166 kg/m²) and SVR (R² = 0.66, RMSE = 0.200 kg/m²), in predicting biomass. When a 10-fold cross-validation of the RF model was compared with a 10-fold cross-validation of SVR, a significant difference (p < 0.0001) was observed for RMSE. One-to-one comparisons of actual to predicted biomass showed that RF underestimates the high biomass values, whereas SVR overestimates them; this suggests a need for further investigation and refinement. [ABSTRACT FROM AUTHOR]
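Experiment-2 in the abstract builds NDVI-type indices from combinations of the 8 Worldview-2 bands. The sketch below illustrates one plausible reading of that step; it is not the authors' code. It assumes that each unordered band pair yields one index of the form (a - b) / (a + b), which gives 28 indices for 8 bands, since swapping the two bands only flips the sign. The band names follow the sensor's standard band order and are not listed in the abstract, and the reflectance values are hypothetical.

```python
from itertools import combinations

# Worldview-2 band names in standard sensor order (not listed in the abstract).
BANDS = ["coastal", "blue", "green", "yellow", "red", "red_edge", "nir1", "nir2"]


def normalized_difference(a, b):
    """NDVI-type index (a - b) / (a + b); returns 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def ndvi_type_indices(reflectance):
    """Compute one normalized-difference index per unordered band pair."""
    indices = {}
    for low, high in combinations(BANDS, 2):
        # Assumed convention: longer-wavelength band first,
        # mirroring NDVI = (NIR - Red) / (NIR + Red).
        name = f"nd_{high}_{low}"
        indices[name] = normalized_difference(reflectance[high], reflectance[low])
    return indices


# Hypothetical reflectance values for a single pixel (illustrative only,
# not taken from the study).
pixel = {
    "coastal": 0.03, "blue": 0.04, "green": 0.07, "yellow": 0.06,
    "red": 0.05, "red_edge": 0.20, "nir1": 0.35, "nir2": 0.40,
}

features = ndvi_type_indices(pixel)
```

Here `features["nd_nir1_red"]` corresponds to the conventional NDVI computed from the first near-infrared band and the red band. Under the same assumptions, the resulting dictionary could be combined with the 8 raw band values to form the Experiment-3 feature set before variable selection.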