Distributed Uncertainty Quantification of Kernel Interpolation on Spheres

Authors:

(1) Sha-Bo Lin, Center for Intelligent Decision-Making and Machine Learning, School of Management, Xi’an Jiaotong University;

(2) Xingping Sun, Department of Mathematics, Missouri State University;

(3) Di Wang, §Center for Intelligent Decision-Making and Machine Learning, School of Management, Xi’an Jiaotong University.

Content Overview

Abstract & Introduction

Uncertainty Relation of Kernel Interpolation on Spheres

Distributed Kernel Interpolation

Operator Differences via Quadrature Rules

Proofs

Numerical Verifications

References

Abstract

For radial basis function (RBF) kernel interpolation of scattered data, Schaback [30] in 1995 proved that the attainable approximation error and the condition number of the underlying interpolation matrix cannot be made small simultaneously. He referred to this finding as an “uncertainty relation”, an undesirable consequence of which is that RBF kernel interpolation is susceptible to noisy data. In this paper, we propose and study a distributed interpolation method to manage and quantify the uncertainty brought on by interpolating noisy spherical data of non-negligible magnitude. We also present numerical simulation results showing that our method is practical and robust in terms of handling noisy data from challenging computing environments.

Key words. Kernel interpolation, distributed uncertainty mitigation, scattered spherical data

Uncertainty Relation of Kernel Interpolation on Spheres

3. Distributed Kernel Interpolation.

Corollary 2.2 shows that kernel interpolation performs poorly while confronting noisy data of non-negligible magnitude. To overcome this major drawback, we propose and study in this section a distributed kernel interpolation (DKI) method, which is motivated by the “distributed learning” in the literature [37, 19]. Figuratively speaking, this is a divide-and-conquer strategy for uncertainty quantification. To elaborate, we describe the method in three steps.

4. Operator Differences via Quadrature Rules.

In this section, we first briefly exposit an integral-operator approach initiated in [8] and then derive tight upper bounds for differences of operators of our interest, obtaining a certain type of Sobolev sampling inequalities [12] as a byproduct. Highlights of the section include Proposition 4.5) and Lemma 4.8.

6. Numerical Verifications

Four simulations are carried out in this section to verify the excellent performance of DKI. The first one shows that DKI succeeds in circumventing the uncertainty of kernel interpolation. The second one exhibits the role of m in DKI. The third one focuses on the role of the division strategy in DKI. The last one compares DKI with several popular spherical data fitting schemes including the distributed filtered hyperinterpolation (DFH) [21], sketching with s ∗ -designs [20], and distributed kernel ridge regression (DKRR) [8].

Simulation 2: In this simulation, we show the role of the parameter m in DKI. We generate 10014 training samples (with 141-designs as inputs). The number of divisions, m, ranges from {5, 10, · · · , 200}. Figure 6.2 shows the relation between RMSE of DKI and the number of local machines under different levels of Gaussian noise, provided that the total number of training samples is given. From Figure 6.2, we can conclude the following assertions: 1) For training samples with higher levels of noise, the testing RMSE generally decreases at first and then increases slowly as the number of local machines increases. Moderate values of m are more conductive to good approximation property for DKI. The reason is that too small m does not successfully address the uncertainty issue in kernel interpolation; too large m increases the fitting error, resulting in slightly worse generalization performance. 2) The optimal number m with the lowest RMSE grows with increasing Gaussian noise. This verifies the equation (3.3) of Theorem 3.2, in which the approximation error is primarily concerned with the sample error for large noise (i.e., large M) and can be reduced using a large m.

REFERENCES

[1] R. Bhatia, Matrix Analysis, vol. 169, Springer Science & Business Media, 2013.

[2] J. S. Brauchart and K. Hesse, Numerical integration over spheres of arbitrary dimension, Constructive Approximation, 25 (2007), pp. 41–71.

[3] G. Brown and F. Dai, Approximation of smooth functions on compact two-point homogeneous spaces, Journal of Functional Analysis, 220 (2005), pp. 401–423.

[4] A. Chernih, I. H. Sloan, and R. S. Womersley, Wendland functions with increasing smoothness converge to a gaussian, Advances in Computational Mathematics, 40 (2014), pp. 185– 200.

[5] F. Dai, Multivariate polynomial inequalities with respect to doubling weights and a∞ weights, Journal of Functional Analysis, 235 (2006), pp. 137–170.

[6] F. Dai, On generalized hyperinterpolation on the sphere, Proceedings of the American Mathematical Society, 134 (2006), pp. 2931–2941.

[7] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, vol. 375, Springer Science & Business Media, 1996.

[8] H. Feng, S.-B. Lin, and D.-X. Zhou, Radial basis function approximation with distributively stored data on spheres, arXiv:2112.02499, (2021).

[9] T. Hangelbroek, F. J. Narcowich, X. Sun, and J. D. Ward, Kernel approximation on manifolds ii: The l∞ norm of the l2 projector, SIAM Journal on Mathematical Analysis, 43 (2011), pp. 662–684.

[10] T. Hangelbroek, F. J. Narcowich, and J. D. Ward, Kernel approximation on manifolds i: bounding the lebesgue constant, SIAM Journal on Mathematical Analysis, 42 (2010), pp. 1732–1760.

[11] K. Hesse, I. H. Sloan, and R. S. Womersley, Radial basis function approximation of noisy scattered data on the sphere, Numerische Mathematik, 137 (2017), pp. 579–605.

[12] K. Hesse, I. H. Sloan, and R. S. Womersley, Local rbf-based penalized least-squares approximation on the sphere with noisy scattered data, Journal of Computational and Applied Mathematics, 382 (2021), p. 113061.

[13] S. Hubbert, Q. T. Le Gia, and T. M. Morton ˆ , Spherical radial basis functions, theory and applications, Springer, 2015.

[14] M. A. King, R. J. Bingham, P. Moore, P. L. Whitehouse, M. J. Bentley, and G. A. Milne, Lower satellite-gravimetry estimates of antarctic sea-level contribution, Nature, 491 (2012), pp. 586–589.

[15] Q. T. Le Gia, F. J. Narcowich, J. D. Ward, and H. Wendland, Continuous and discrete least-squares approximation by radial basis functions on spheres, Journal of Approximation Theory, 143 (2006), pp. 124–133.

[16] P. Leopardi, A partition of the unit sphere into regions of equal area and small diameter, Electronic Transactions on Numerical Analysis, 25 (2006), pp. 309–327.

[17] J. Levesley, Z. Luo, and X. Sun, Norm estimates of interpolation matrices and their inverses associated with strictly positive definite functions, Proceedings of the American Mathematical Society, (1999), pp. 2127–2134.

[18] S.-B. Lin, X. Chang, and X. Sun, Kernel interpolation of high dimensional scattered data, arXiv preprint arXiv:2009.01514, (2020).

[19] S.-B. Lin, X. Guo, and D.-X. Zhou, Distributed learning with regularized least squares, The Journal of Machine Learning Research, 18 (2017), pp. 3202–3232.

[20] S.-B. Lin, D. Wang, and D.-X. Zhou, Sketching with spherical designs on spheres, SIAM Journal on Scientific Computation, in press (2023).

[21] S.-B. Lin, Y. G. Wang, and D.-X. Zhou, Distributed filtered hyperinterpolation for noisy data on the sphere, SIAM Journal on Numerical Analysis, 59 (2021), pp. 634–659.

[22] J. D. McEwen and Y. Wiaux, A novel sampling theorem on the sphere, IEEE Transactions on Signal Processing, 59 (2011), pp. 5876–5887.

[23] H. Mhaskar, F. Narcowich, and J. Ward, Spherical marcinkiewicz-zygmund inequalities and positive quadrature, Mathematics of Computation, 70 (2001), pp. 1113–1130.

[24] H. N. Mhaskar, Weighted quadrature formulas and approximation by zonal function networks on the sphere, Journal of Complexity, 22 (2006), pp. 348–370.

[25] C. Muller ¨ , Spherical Harmonics, vol. 17, Springer, 1966.

[26] F. J. Narcowich, N. Sivakumar, and J. D. Ward, Stability results for scattered-data interpolation on euclidean spheres, Advances in Computational Mathematics, 8 (1998), pp. 137– 163.

[27] F. J. Narcowich, X. Sun, J. D. Ward, and H. Wendland, Direct and inverse sobolev error estimates for scattered data interpolation via spherical basis functions, Foundations of Computational Mathematics, 7 (2007), pp. 369–390.

[28] F. J. Narcowich and J. D. Ward, Scattered data interpolation on spheres: error estimates and locally supported basis functions, SIAM Journal on Mathematical Analysis, 33 (2002), pp. 1393–1410.

[29] A. Rudi, R. Camoriano, and L. Rosasco, Less is more: Nystr¨om computational regularization., in NIPS, 2015, pp. 1657–1665.

[30] R. Schaback, Error estimates and condition numbers for radial basis function interpolation, Advances in Computational Mathematics, 3 (1995), pp. 251–264.

[31] S. Smale and D.-X. Zhou, Shannon sampling and function reconstruction from point values, Bulletin of the American Mathematical Society, 41 (2004), pp. 279–305.

[32] S. Smale and D.-X. Zhou, Shannon sampling ii: Connections to learning theory, Applied and Computational Harmonic Analysis, 19 (2005), pp. 285–302.

[33] Y.-T. Tsai and Z.-C. Shih, All-frequency precomputed radiance transfer using spherical radial basis functions and clustered tensor approximation, ACM Transactions on Graphics (TOG), 25 (2006), pp. 967–976.

[34] H. Wendland, Scattered data approximation, vol. 17, Cambridge university press, 2004.

[35] M. A. Wieczorek and R. J. Phillips, Potential anomalies on a sphere: Applications to the thickness of the lunar crust, Journal of Geophysical Research: Planets, 103 (1998), pp. 1715–1724.

[36] R. S. Womersley, Efficient spherical designs with good geometric properties, in Contemporary computational mathematics-A celebration of the 80th birthday of Ian Sloan, Springer, 2018, pp. 1243–1285.

[37] Y. Zhang, J. Duchi, and M. Wainwright, Divide and conquer kernel ridge regression: A distributed algorithm with minimax optimal rates, The Journal of Machine Learning Research, 16 (2015), pp. 3299–3340.

SM1. Appendix A: Select-and-Judge Strategy for Data Division. In this Appendix, we present the detailed implementation of the select-and-judge (SAJ) strategy. Our aim is to derive a series of subsets of similar Cardinality with separation radius not smaller than a given tolerance c0. There are two stages for SAJ.

This paper is available on arxiv under CC0 1.0 DEED license.