In many fields of science, an important problem is to predict a variable of interest over a study region from noisy data observed at some locations. Two popular methods for this problem are kriging and smoothing splines. The former assumes that the underlying process is stochastic, whereas the latter assumes it is purely deterministic. Kriging performs better than smoothing splines in some situations, but is outperformed by smoothing splines in others. However, little is known about how to choose between the two. In addition, variable selection in geostatistical models has not been well studied. In this article, we propose a general methodology for selecting among arbitrary spatial prediction methods, based on (approximately) unbiased estimation of mean squared prediction errors using a data perturbation technique. The proposed method accounts for estimation uncertainty in both kriging and smoothing spline predictors, and is shown to be optimal in terms of two mean squared prediction error criteria. A simulation experiment demonstrates the effectiveness of the proposed methodology. The method is also applied to a water acidity data set, where it is used to select the important variables responsible for water acidity in a spatial regression model. Moreover, a new, robust method for estimating the noise variance is proposed, and it performs better than some well-known methods.
- Data perturbation
- Generalized degrees of freedom
- Mean squared prediction error
- Noise variance estimation
- Smoothing spline
- Spatial prediction
- Variable selection
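
To make the idea of data perturbation concrete, the following is a minimal Python sketch, not the procedure developed in this article. It assumes a generic predictor supplied as a function `fit` that maps an observation vector to fitted values at the same locations, a known noise variance `sigma2`, and a perturbation size `tau`. The function names `gdf_by_perturbation` and `estimated_mspe` are hypothetical and introduced only for illustration. The sketch perturbs the data with small Gaussian noise, measures how sensitive each fitted value is to its own observation, sums these sensitivities into an estimate of the generalized degrees of freedom, and plugs that into a covariance-penalty (Mallows-type) estimate of the mean squared error.

```python
import numpy as np


def gdf_by_perturbation(fit, y, tau, n_perturb=200, seed=0):
    """Monte Carlo estimate of the generalized degrees of freedom of `fit`.

    Illustrative sketch only. `fit` maps a data vector of length n to
    fitted values of length n; `tau` is the perturbation standard deviation.
    """
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    # Independent Gaussian perturbations, one row per replicate.
    deltas = rng.normal(0.0, tau, size=(n_perturb, n))
    # Refit the predictor on each perturbed data vector.
    fits = np.array([fit(y + d) for d in deltas])
    # For each location i, regress fitted value i on perturbation i;
    # the slope approximates the sensitivity d mu_hat_i / d y_i.
    dc = deltas - deltas.mean(axis=0)
    fc = fits - fits.mean(axis=0)
    slopes = (dc * fc).sum(axis=0) / (dc ** 2).sum(axis=0)
    return float(slopes.sum())


def estimated_mspe(fit, y, sigma2, tau, n_perturb=200, seed=0):
    """Covariance-penalty estimate of the average squared error of `fit`
    for the noise-free signal at the data locations (assumes known sigma2)."""
    n = y.shape[0]
    rss = float(np.sum((y - fit(y)) ** 2))
    gdf = gdf_by_perturbation(fit, y, tau, n_perturb, seed)
    return (rss - n * sigma2 + 2.0 * sigma2 * gdf) / n
```

For a linear smoother with hat matrix `H` (so that `fit = lambda y: H @ y`), the estimate returned by `gdf_by_perturbation` should be close to the trace of `H`, which gives a simple sanity check. Candidate predictors could then be compared by evaluating `estimated_mspe` for each and preferring the smallest value, assuming a reasonable estimate of the noise variance is available.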