Academic Paper

Kernel Density Estimation with Linked Boundary Conditions
Document Type
Working Paper
Source
Studies in Applied Mathematics 145 (2020) 357-396
Subject
Mathematics - Statistics Theory
35Q92, 62G07, 35A22, 47N40, 35K08
Language
Abstract
Kernel density estimation on a finite interval poses an outstanding challenge because of the well-recognized bias at the boundaries of the interval. Motivated by an application in cancer research, we consider a boundary constraint linking the values of the unknown target density function at the boundaries. We provide a kernel density estimator (KDE) that successfully incorporates this linked boundary condition, leading to a non-self-adjoint diffusion process and expansions in non-separable generalized eigenfunctions. The solution is rigorously analyzed through an integral representation given by the unified transform (or Fokas method). The new KDE possesses many desirable properties, such as consistency, asymptotically negligible bias at the boundaries, and an increased rate of approximation, as measured by the AMISE. We apply our method to the motivating example in biology and provide numerical experiments with synthetic data, including comparisons with state-of-the-art KDEs (which currently cannot handle linked boundary constraints). Results suggest that the new method is fast and accurate. Furthermore, we demonstrate how to build statistical estimators of the boundary conditions satisfied by the target function without a priori knowledge. Our analysis can also be extended to more general boundary conditions that may be encountered in applications.
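The boundary bias mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: it is a plain Gaussian KDE with no boundary correction, applied to synthetic uniform data on [0, 1]. The function name `gaussian_kde_at`, the bandwidth, and the sample size are all hypothetical choices made for illustration. Because roughly half of the kernel mass centered at an endpoint falls outside the interval, the uncorrected estimate there comes out near half the true density.

```python
import numpy as np


def gaussian_kde_at(x, samples, bandwidth):
    """Plain Gaussian KDE at points x, with no boundary correction.

    Hypothetical helper for illustration; not the estimator from the paper.
    """
    x = np.atleast_1d(x)[:, None]                # shape (m, 1)
    z = (x - samples[None, :]) / bandwidth       # shape (m, n)
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=20_000)     # true density is 1 on [0, 1]
h = 0.05                                         # assumed bandwidth

# Evaluate at the left endpoint, the midpoint, and the right endpoint.
est = gaussian_kde_at(np.array([0.0, 0.5, 1.0]), samples, h)
print("estimate at x=0, 0.5, 1:", est)
```

The midpoint estimate should sit close to the true value of 1, while both endpoint estimates fall well below it. A boundary-aware estimator, such as the one the abstract describes, aims to remove this gap.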