Kriging is a geostatistical interpolation technique that predicts values at unmeasured locations from observations at known locations. Unlike deterministic methods such as inverse distance weighting, which rely solely on distance, kriging incorporates the spatial autocorrelation structure of the data and, under its model assumptions, yields minimum-variance unbiased estimates with quantified uncertainty. The term is commonly pronounced "KREE-ging" (with a hard g) and honors the South African mining engineer Danie G. Krige.
Interpolation, in a broader mathematical sense, refers to the process of estimating unknown values within the range of known data points. When extended to spatial domains, interpolation becomes essential for mapping continuous phenomena such as temperature, mineral concentration, or pollutant levels. Kriging distinguishes itself by minimizing estimation variance under the constraint of unbiasedness, making it the "Best Linear Unbiased Predictor" (BLUP).
Historical Context
The foundations of kriging trace back to the early 1950s, when Danie G. Krige, a mining engineer in South Africa, empirically developed methods to estimate gold ore reserves by accounting for spatial relationships between sample points. His approach, however, lacked a formal theoretical framework.
In 1962, French mathematician Georges Matheron formalized Krige's empirical work using the theory of regionalized variables, which treats the observations as a realization of a random function. Matheron introduced the intrinsic hypothesis and provided the mathematical rigor that transformed kriging from a mining heuristic into a general statistical tool. The technique subsequently spread to environmental science, meteorology, and later machine learning.
Mathematical Foundation
At its core, kriging computes a weighted linear combination of observed values:

$$\tilde{Z}(s_0) = \sum_{i=1}^{n} \lambda_i \, Z(s_i)$$

where Z̃(s₀) is the predicted value at location s₀, Z(sᵢ) are the observed values at the n known locations, and λᵢ are the kriging weights. The weights are determined by minimizing the estimation variance, which depends on the spatial correlation structure modeled by the semivariogram:

$$\gamma(h) = \frac{1}{2} \, \mathrm{E}\left[ \left( Z(s + h) - Z(s) \right)^2 \right]$$
The semivariogram quantifies how dissimilarity between observations grows with separation distance h or, equivalently, how spatial correlation decays. Fitting an appropriate variogram model (e.g., spherical, exponential, Gaussian) to the empirical semivariogram is critical, as it directly governs the kriging weights and the reliability of the predictions and their variances.
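
As a rough illustration of the variogram step, the sketch below implements the three model families named above and a method-of-moments estimate of the empirical semivariogram. The parameter names `nugget` (discontinuity at the origin), `sill` (total plateau value), and `range_` (distance at which the plateau is effectively reached) are introduced here for illustration. The factor 3 in the exponential and Gaussian models is one common "practical range" convention and is an assumption; other software may parameterize these models differently.

```python
import numpy as np

def spherical(h, nugget, sill, range_):
    """Spherical model: reaches the sill exactly at distance range_."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / range_, 1.0)
    gamma = nugget + (sill - nugget) * (1.5 * r - 0.5 * r**3)
    return np.where(h > 0, gamma, 0.0)  # gamma(0) = 0 by definition

def exponential(h, nugget, sill, range_):
    """Exponential model: approaches the sill asymptotically (about 95% at range_)."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / range_))
    return np.where(h > 0, gamma, 0.0)

def gaussian(h, nugget, sill, range_):
    """Gaussian model: parabolic near the origin, i.e. a very smooth field."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * (h / range_) ** 2))
    return np.where(h > 0, gamma, 0.0)

def empirical_semivariogram(coords, values, bin_edges):
    """Half the mean squared difference of all sample pairs, per distance bin."""
    coords = np.asarray(coords, dtype=float)   # shape (n, d)
    values = np.asarray(values, dtype=float)   # shape (n,)
    i, j = np.triu_indices(len(values), k=1)   # every unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    centers, gammas = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (dist >= lo) & (dist < hi)
        if mask.any():                         # skip empty bins
            centers.append(dist[mask].mean())
            gammas.append(0.5 * sqdiff[mask].mean())
    return np.array(centers), np.array(gammas)
```

A model would then typically be chosen by comparing each candidate curve against the binned empirical values, for example by least squares, before the fitted parameters are passed on to the kriging system.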
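Note: Kriging is closely related to Gaussian process (GP) regression in machine learning. With a known mean and a fixed covariance function, the simple kriging predictor and variance coincide with the GP posterior mean and variance. Both approaches use covariance functions to model spatial or temporal dependence and report a prediction together with its uncertainty.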
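
The correspondence can be written out for simple kriging with noise-free observations. The notation below is introduced here for illustration: μ is the known mean, **z** the vector of observed values, **C** the covariance matrix between observation locations with entries C(sᵢ − sⱼ), and **c**₀ the vector of covariances C(sᵢ − s₀) between the observations and the target.

$$
\tilde{Z}(s_0) = \mu + \mathbf{c}_0^{\top} \mathbf{C}^{-1} (\mathbf{z} - \mu \mathbf{1}),
\qquad
\sigma^2_{\mathrm{SK}}(s_0) = C(0) - \mathbf{c}_0^{\top} \mathbf{C}^{-1} \mathbf{c}_0
$$

These are the posterior mean and variance of a Gaussian process with prior mean μ and kernel k = C. Under second-order stationarity the covariance and the semivariogram are linked by C(h) = C(0) − γ(h), so a fitted variogram model plays the role of the kernel. Ordinary kriging, which estimates the mean rather than assuming it, is usually interpreted as the limiting case of a GP with a constant mean under a flat prior.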
Types of Kriging
Several variants exist, tailored to different data structures and assumptions:
| Type | Assumptions | Use Case |
|---|---|---|
| Ordinary Kriging | Unknown but constant mean (stationarity) | General spatial prediction |
| Simple Kriging | Known constant mean, stationary covariance | Theoretical modeling, simulation |
| Universal Kriging | Non-stationary mean modeled as a trend (drift), e.g. a low-order polynomial | Data with systematic gradients |
| Co-Kriging | Multiple correlated variables | Remote sensing, multi-sensor fusion |
| Indicator Kriging | Binary/categorical data | Probability mapping, risk assessment |
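
To make the difference between the first two rows concrete, here is a minimal NumPy sketch of simple and ordinary kriging at a single target location. It is written in terms of a covariance function rather than a semivariogram. The exponential covariance `exp_cov`, its default parameters, and all function names are assumptions chosen for illustration, not a reference implementation. Ordinary kriging adds a Lagrange multiplier that forces the weights to sum to one, which removes the need to know the mean.

```python
import numpy as np

def exp_cov(h, sill=1.0, range_=10.0):
    """Assumed exponential covariance C(h); the semivariogram is sill - C(h)."""
    return sill * np.exp(-3.0 * np.asarray(h, dtype=float) / range_)

def _pairwise(a):
    """Matrix of Euclidean distances between all rows of a."""
    return np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)

def simple_kriging(coords, values, target, mean, cov=exp_cov):
    """Known mean: weights solve C @ lam = c0, with no constraint on their sum."""
    coords, values = np.asarray(coords, float), np.asarray(values, float)
    c0 = cov(np.linalg.norm(coords - np.asarray(target, float), axis=1))
    lam = np.linalg.solve(cov(_pairwise(coords)), c0)
    pred = mean + lam @ (values - mean)
    var = cov(0.0) - lam @ c0
    return pred, var, lam

def ordinary_kriging(coords, values, target, cov=exp_cov):
    """Unknown constant mean: augment the system so the weights sum to one."""
    coords, values = np.asarray(coords, float), np.asarray(values, float)
    n = len(values)
    c0 = cov(np.linalg.norm(coords - np.asarray(target, float), axis=1))
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(_pairwise(coords))
    A[n, n] = 0.0                      # corner of the augmented matrix
    sol = np.linalg.solve(A, np.append(c0, 1.0))
    lam, mu = sol[:n], sol[n]          # weights and Lagrange multiplier
    pred = lam @ values
    var = cov(0.0) - lam @ c0 - mu     # ordinary kriging variance
    return pred, var, lam
```

Both functions return the prediction, the kriging variance, and the weights. With no nugget effect, predicting at an observed location reproduces the observed value with zero variance, and the ordinary kriging variance is never smaller than the simple kriging variance because the mean has to be estimated as well.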
Applications
Kriging has become indispensable across disciplines requiring spatial prediction:
- Mining & Geology: Ore reserve estimation, grade control, and resource valuation.
- Environmental Science: Mapping soil contamination, groundwater quality, and pollutant dispersion.
- Meteorology & Climatology: Interpolating temperature, precipitation, and atmospheric pressure from sparse station networks.
- Geographic Information Systems (GIS): Generating continuous surfaces from discrete point data for urban planning and ecology.
- Machine Learning: As Gaussian Processes for Bayesian optimization, hyperparameter tuning, and surrogate modeling.
Modern computational advances, including parallelized kriging solvers and sparse covariance approximations, have enabled the technique to scale to millions of data points, bridging traditional geostatistics with big data analytics.
Advantages & Limitations
Advantages:
- Provides optimal predictions (minimum variance) under model assumptions.
- Quantifies uncertainty via kriging variance, enabling confidence mapping.
- Handles irregularly spaced data and complex sampling designs.
- Mathematically rigorous with strong theoretical guarantees.
Limitations:
- Computationally intensive for large datasets (O(n³) complexity for standard kriging).
- Sensitive to variogram model misspecification; a poorly fitted model yields suboptimal weights and unreliable kriging variances.
- Assumes stationarity (or requires trend modeling), which may not hold in highly heterogeneous landscapes.
- Struggles with non-linear relationships without transformation or machine learning hybrids.
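
One common way to ease the cubic cost is to krige within a moving neighborhood, solving a small system built from only the k nearest observations for each target. The sketch below is a minimal illustration under assumed settings: an exponential covariance with illustrative `sill` and `range_` values, a brute-force neighbor search, and a hypothetical function name. A production implementation would typically use a spatial index for the search.

```python
import numpy as np

def local_ordinary_kriging(coords, values, target, k=16, sill=1.0, range_=10.0):
    """Ordinary kriging using only the k observations nearest to the target."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)

    def cov(h):
        # Assumed exponential covariance with practical range range_.
        return sill * np.exp(-3.0 * h / range_)

    d0 = np.linalg.norm(coords - target, axis=1)   # distances to the target
    k = min(k, len(values))
    idx = np.argpartition(d0, k - 1)[:k]           # indices of the k nearest
    pts, z = coords[idx], values[idx]

    # (k+1) x (k+1) ordinary kriging system instead of (n+1) x (n+1).
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    A = np.ones((k + 1, k + 1))
    A[:k, :k] = cov(dists)
    A[k, k] = 0.0
    b = np.append(cov(d0[idx]), 1.0)
    lam = np.linalg.solve(A, b)[:k]                # drop the Lagrange multiplier
    return lam @ z, idx
```

Each solve then costs O(k³) regardless of the total number of observations. The trade-off is that the predicted surface can show small discontinuities where the set of selected neighbors changes.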
Further Reading
- Cressie, N. (1993). Statistics for Spatial Data. Wiley-Interscience.
- Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford University Press.
- Matheron, G. (1962). "Traité de géostatistique appliquée, Tome 1." École des Mines de Paris.
- Ripley, B. D. (1981). Spatial Statistics. Wiley.
- Rasmussen, C. E., & Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. MIT Press.