References
- Horvitz DG, Thompson DJ. A generalization of sampling without replacement from a finite universe. JASA. 1951;47:663–685.
- Berger YG. Rate of convergence to normal distribution for the Horvitz-Thompson estimator. J Statist Plann Inference. 1998;67(2):209–226.
- Deville J-C, Särndal C-E. Calibration estimators in survey sampling. JASA. 1992;87:376–382.
- Hajek J. Asymptotic theory of rejective sampling with varying probabilities from a finite population. Ann Math Stat. 1964;35(4):1491–1523.
- Robinson PM. On the convergence of the Horvitz-Thompson estimator. Aust J Stat. 1982;24(2):234–238.
- Rosen P. Asymptotic theory for successive sampling. AMS. 1972;43:373–397.
- Breslow NE, Lumley T, Ballantyne CM, et al. Improved Horvitz-Thompson estimation of model parameters from two-phase stratified samples: applications in epidemiology. Stat Biosci. 2009;1:32–49.
- Breslow NE, Wellner JA. Weighted likelihood for semiparametric models and two-phase stratified samples, with application to Cox regression. Scand J Stat. 2007;35:186–192.
- Breslow NE, Wellner JA. A Z-theorem with estimated nuisance parameters and correction note for ‘Weighted likelihood for semiparametric models and two-phase stratified samples, with application to Cox regression’. Scand J Stat. 2008;35:186–192.
- Gill RD, Vardi Y, Wellner JA. Large sample theory of empirical distributions in biased sampling models. Ann Stat. 1988;16(3):1069–1112.
- Saegusa T, Wellner J. Weighted likelihood estimation under two-phase sampling. Ann Stat. 2013;41(1):269–295.
- Boucheron S, Bousquet O, Lugosi G. Theory of classification: a survey of some recent advances. ESAIM: Probab Stat. 2005;9:323–375.
- Devroye L, Györfi L, Lugosi G. A probabilistic theory of pattern recognition. New York: Springer; 1996.
- Koltchinskii V. Local Rademacher complexities and oracle inequalities in risk minimization (with discussion). Ann Stat. 2006;34:2593–2656.
- Clémençon S, Bertail P, Chautru E. Scaling-up M-estimation via sampling designs: the Horvitz-Thompson stochastic gradient descent. Proceedings of the 2014 IEEE International Conference on Big Data; Washington, USA; 2014.
- Bottou L, Bousquet O. The trade-offs of large-scale learning. In: Platt J, Koller D, Singer Y, Roweis S, editors. Advances in neural information processing systems 20. Proceedings of NIPS'07; Vancouver, B.C., Canada; 2008. p. 161–168.
- Steinwart I, Hush D, Scovel C. Learning from dependent observations. J Multivariate Anal. 2009;100(1):175–194.
- Agarwal A, Duchi JC. The generalization ability of online algorithms for dependent data. IEEE Trans Inf Theory. 2013;59(1):573–587.
- Cochran W. Sampling techniques. New York: Wiley; 1977.
- Deville J. Réplications d'échantillons, demi-échantillons, Jackknife, bootstrap dans les sondages. Economica, Ed. Droesbeke, Tassi, Fichet; 1987.
- Särndal C, Swensson B, Wretman J. Model assisted survey sampling. New York: Springer-Verlag; 1992. (Springer Series in Statistics).
- Bartlett PL, Jordan MI, McAuliffe JD. Convexity, classification, and risk bounds. J Am Statist Assoc. 2006;101(473):138–156.
- Bertail P, Chautru E, Clémençon S. Empirical processes in survey sampling with (conditional) poisson designs. Submitted to the Scand J Stat. 2016. DOI:10.1111/sjos.12243
- Clémençon S, Bertail P, Chautru E, et al. Survey schemes for stochastic gradient descent with applications to M-estimation. Submitted for publication. Available from: http://arxiv.org/abs/1501.02218
- van der Vaart A, Wellner J. Weak convergence and empirical processes. New York: Springer; 1996.
- Tsybakov A. Introduction à l'estimation non-paramétrique. New York: Springer; 2004. (Mathématiques et Applications).
- Lugosi G, Zeger K. Concept learning using complexity regularization. IEEE Trans Inf Theory. 1996;42:48–54.