Academic Journal Article

Sign-constrained linear regression for prediction of microbe concentration based on water quality datasets.
Document Type
Article
Source
Journal of Water & Health. Jun2019, Vol. 17 Issue 3, p404-415. 12p.
Subject
*WATER quality
*FORECASTING
*WATER pollution
*POLLUTANTS
*STATISTICAL correlation
*SCIENTIFIC knowledge
*INTRACLASS correlation
Language
ISSN
1477-8920
Abstract
This study presents a novel methodology for estimating the concentration of environmental pollutants in water, such as pathogens, from environmental parameters. Its scientific contribution is to prevent overfitting during model fitting by applying domain knowledge, that is, accumulated scientific knowledge about the correlations between the response and explanatory variables. Sign constraints were used to express this domain knowledge, and their effect on prediction performance with censored datasets was investigated. We confirmed that sign constraints made predictions more accurate than conventional sign-free approaches. The most notable technical contribution is the finding that sign constraints can be incorporated into the estimation of the correlation coefficient in Tobit analysis. We developed effective and numerically stable algorithms for fitting a model to datasets under the sign constraints. These algorithms are applicable to a wide range of pollutant contamination-level prediction problems, including pathogen concentrations in water. [ABSTRACT FROM AUTHOR]
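The idea of encoding domain knowledge as sign constraints can be illustrated with a minimal sketch. It is not the authors' algorithm: it covers only ordinary least squares on uncensored data (the article works with Tobit analysis on censored datasets), and it uses cyclic coordinate descent with projection as a stand-in solver. The function name `fit_sign_constrained`, the `signs` encoding, and the toy data are all assumptions made for illustration.

```python
def fit_sign_constrained(X, y, signs, n_iter=2000):
    """Least squares under sign constraints via cyclic coordinate descent.

    Hypothetical helper, not taken from the article.
    signs[j] = +1 forces beta[j] >= 0, -1 forces beta[j] <= 0,
    and 0 leaves beta[j] unconstrained.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = [float(v) for v in y]  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]

    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Best value of beta[j] with every other coefficient held fixed.
            corr = sum(X[i][j] * residual[i] for i in range(n))
            new = beta[j] + corr / col_sq[j]
            # Project onto the sign allowed by domain knowledge.
            if signs[j] > 0:
                new = max(new, 0.0)
            elif signs[j] < 0:
                new = min(new, 0.0)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                beta[j] = new
    return beta


# Toy data (invented): intercept column, one predictor assumed to raise the
# response, and one predictor assumed to lower it.
X = [[1, 0, 1], [1, 1, 0], [1, 2, 2], [1, 3, 1], [1, 4, 3]]
y = [1, 3, 5, 7, 9]
beta = fit_sign_constrained(X, y, signs=[0, +1, -1])
print(beta)
```

A coefficient whose unconstrained estimate would violate its expected sign is held at zero, which is how the constraint limits the model's freedom to fit noise. Handling censored observations as in Tobit analysis would require replacing the squared-error objective with the censored likelihood, which this sketch does not attempt.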