Abstract: We propose an improved estimator for the multi-task averaging problem, whose goal is the joint estimation of the means of multiple distributions using separate, independent data sets. The naive approach is to take the empirical mean of each data set individually, whereas the proposed method exploits similarities between tasks, without any related information being known in advance. First, for
each data set, similar or neighboring means are determined from the data by multiple testing. Then each naive estimator is shrunk towards the local average of its neighbors. This shrinkage echoes the James-Stein estimator, where the empirical mean is shrunk towards zero; here the reference point is not zero but the local average of the neighbors. Although bias is added, the estimate is improved by the reduction in variance.
This improvement can be significant when the dimension of the input space is large, demonstrating a "blessing of dimensionality" phenomenon. An application of this approach is the estimation of multiple kernel mean embeddings, which play an important role in many modern applications. The theoretical results are verified on artificial and real-world data.
Keywords: Multiple estimations; Multiple tests; Stein's phenomenon; High dimension; Kernel Mean Embedding
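The test-then-shrink scheme described in the abstract can be illustrated with a minimal sketch: compute the naive empirical means, declare two tasks neighbors when their means are close relative to the noise level, and shrink each naive estimate towards the local neighbor average. The threshold `tau`, the weight `gamma`, and the function name are illustrative assumptions, not the paper's exact test statistic or weighting.

```python
import numpy as np

def multitask_averaging(samples, tau=4.0, gamma=0.5):
    """Sketch of test-then-shrink multi-task averaging.

    samples : list of (n_t, d) arrays, one data set per task.
    tau     : hypothetical threshold of the neighbor-detection test.
    gamma   : hypothetical shrinkage weight towards the neighbor average.
    """
    # Naive estimator: the empirical mean of each data set individually.
    naive = [x.mean(axis=0) for x in samples]
    # Total variance of each naive mean, used to calibrate the test.
    noise = [x.var(axis=0).sum() / len(x) for x in samples]
    T = len(samples)
    improved = []
    for t in range(T):
        # Multiple testing step (simplified): task s is a neighbor of t
        # if their naive means are close relative to the combined noise.
        neighbors = [
            s for s in range(T)
            if np.sum((naive[t] - naive[s]) ** 2) <= tau * (noise[t] + noise[s])
        ]
        # Each task is always its own neighbor, so the average is well defined.
        local_avg = np.mean([naive[s] for s in neighbors], axis=0)
        # Shrink the naive estimate towards the local neighbor average.
        improved.append((1 - gamma) * naive[t] + gamma * local_avg)
    return improved
```

With `gamma=0` the sketch reduces to the naive per-task means; increasing `gamma` trades bias for variance, which is where the improvement comes from when many similar tasks share a neighborhood.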
Blanchard, G. and Fermanian, J.-B. (2021). Nonasymptotic one- and two-sample tests in high dimension with unknown covariance structure. arXiv preprint. To appear in: Festschrift in honor of V. Spokoiny.
Feldman, S., Gupta, M. R., and Frigyik, B. A. (2014). Revisiting Stein's paradox: multitask averaging. Journal of Machine Learning Research, 15(106):3621-3662.
Marienwald, H., Fermanian, J.-B., and Blanchard, G. (2020). High-dimensional multi-task averaging and application to kernel mean embedding. AISTATS 2021. arXiv:2011.06794 [stat.ML].
Stein, C. (1956). Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1.