
Understanding Bias and Variance of Learning-to-Rank Algorithms: An Empirical Framework

Article: 2009164 | Received 15 Mar 2021, Accepted 15 Nov 2021, Published online: 11 Dec 2021

References

  • Al-Maskari, A., M. Sanderson, and P. Clough (2007). The relationship between IR effectiveness measures and user satisfaction. In Proceedings of the 30th annual international ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam, Netherlands, pp. 773–74. ACM.
  • Bernard, S., L. Heutte, and S. Adam. 2009. Towards a better understanding of random forests through the study of strength and correlation. In Emerging intelligent computing technology and applications. With aspects of artificial intelligence, 536–45. Berlin, Heidelberg: Springer.
  • Brain, D., and G. I. Webb. 2002. The need for low bias algorithms in classification learning from large data sets. In Principles of data mining and knowledge discovery, 62–73. Berlin, Heidelberg: Springer.
  • Breiman, L. 1996. Bagging predictors. Machine Learning 24 (2):123–40. doi:10.1007/BF00058655.
  • Breiman, L. 2001. Random forests. Machine Learning 45 (1):5–32.
  • Carterette, B. (2009). On rank correlation and the distance between rankings. In Proceedings of the 32nd international ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 436–43. Boston MA, USA: ACM.
  • Chapelle, O., and Y. Chang. 2011. Yahoo! learning to rank challenge overview. Journal of Machine Learning Research-Proceedings Track 14:1–24.
  • Cossock, D., and T. Zhang (2006). Subset ranking using regression. In Proceedings of the 19th Annual Conference on Learning Theory, Pittsburgh PA, USA, pp. 605–19. Springer.
  • Diaconis, P., and R. L. Graham. 1977. Spearman’s footrule as a measure of disarray. Journal of the Royal Statistical Society: Series B (Methodological) 39 (2):262–68.
  • Domingos, P. (2000). A unified bias-variance decomposition. In Proceedings of the 17th International Conference on Machine Learning (ICML), pp. 231–38. San Francisco CA: Morgan Kaufmann.
  • Dwork, C., R. Kumar, M. Naor, and D. Sivakumar (2001). Rank aggregation methods for the web. In Proceedings of the 10th International Conference on World Wide Web, Hong Kong, pp. 613–22. ACM.
  • Friedman, J. H., and P. Hall. 2007. On bagging and nonlinear estimation. Journal of Statistical Planning and Inference 137 (3):669–83.
  • Friedman, J. H. 1997. On bias, variance, 0/1—loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery 1 (1):55–77.
  • Friedman, J. H. 2001. Greedy function approximation: A gradient boosting machine. Annals of Statistics 29 (5):1189–232.
  • Ganjisaffar, Y., R. Caruana, and C. V. Lopes (2011). Bagging gradient-boosted trees for high precision, low variance ranking models. In Proceedings of the 34th international ACM SIGIR Conference on Research and Development in Information Retrieval, Beijing, China, pp. 85–94. ACM.
  • Ganjisaffar, Y., T. Debeauvais, S. Javanmardi, R. Caruana, and C. V. Lopes (2011). Distributed tuning of machine learning algorithms using mapreduce clusters. In Proceedings of the 3rd Workshop on Large Scale Data Mining: Theory and Applications, San Diego CA, USA, pp. 2. ACM.
  • Geman, S., E. Bienenstock, and R. Doursat. 1992. Neural networks and the bias/variance dilemma. Neural Computation 4 (1):1–58.
  • Geurts, P., and G. Louppe (2011). Learning to rank with extremely randomized trees. In JMLR: Workshop and Conference Proceedings, Boston MA, USA, Volume 14.
  • Geurts, P. 2005. Bias vs variance decomposition for regression and classification. In Data mining and knowledge discovery handbook, 749–63. Boston MA, USA: Springer.
  • Ghosal, I., and G. Hooker. 2020. Boosting random forests to reduce bias; one-step boosted forest and its variance estimate. Journal of Computational and Graphical Statistics 30 (2):493–502.
  • Hastie, T., R. Tibshirani, and J. Friedman. 2009. The elements of statistical learning. New York: Springer.
  • Hofmann, K., S. Whiteson, and M. de Rijke. 2013. Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval. Information Retrieval 16 (1):63–90.
  • Horváth, M. Z., M. N. Müller, M. Fischer, and M. Vechev. 2021. Boosting randomized smoothing with variance reduced classifiers. arXiv preprint arXiv:2106.06946.
  • Ibrahim, M., and M. Carman. 2016. Comparing pointwise and listwise objective functions for random-forest-based learning-to-rank. ACM Transactions on Information Systems (TOIS) 34 (4):20.
  • Ibrahim, M., and M. Murshed. 2016. From tf-idf to learning-to-rank: An overview. In Handbook of research on innovations in information retrieval, analysis, and management, ed. J. T. Martins and A. Molnar, 62–109. USA: IGI Global.
  • Ibrahim, M. 2019. Reducing correlation of random forest–based learning-to-rank algorithms using subsample size. Computational Intelligence 35 (4):774–98.
  • Ibrahim, M. 2020. An empirical comparison of random forest-based and other learning-to-rank algorithms. Pattern Analysis and Applications 23 (3):1133–55.
  • James, G. M. 2003. Variance and bias for general loss functions. Machine Learning 51 (2):115–35.
  • Järvelin, K., and J. Kekäläinen (2000). IR evaluation methods for retrieving highly relevant documents. In Proceedings of the 23rd annual international ACM SIGIR Conference on Research and Development in Information Retrieval, Athens, Greece, pp. 41–48. ACM.
  • Jones, T., P. Thomas, F. Scholer, and M. Sanderson (2015). Features of disagreement between retrieval effectiveness measures. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago, Chile, pp. 847–50. ACM.
  • Kohavi, R., and D. H. Wolpert (1996). Bias plus variance decomposition for zero-one loss functions. In Proceedings of the 13th International Conference on Machine Learning (ICML), Italy, pp. 275–83. Morgan Kaufmann.
  • Kong, E. B., and T. G. Dietterich. 1995. Error-correcting output coding corrects bias and variance. In Proceedings of the International Conference on Machine Learning (ICML), USA, pp. 313–21. Morgan Kaufmann.
  • Lan, Y., T.-Y. Liu, T. Qin, Z. Ma, and H. Li (2008). Query-level stability and generalization in learning to rank. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, pp. 512–19. ACM.
  • Lan, Y., T.-Y. Liu, Z. Ma, and H. Li (2009). Generalization analysis of listwise learning-to-rank algorithms. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, Canada, pp. 577–84. ACM.
  • Lapata, M. 2006. Automatic evaluation of information ordering: Kendall’s tau. Computational Linguistics 32 (4):471–84.
  • Lin, Y., and Y. Jeon. 2006. Random forests and adaptive nearest neighbors. Journal of the American Statistical Association 101 (474):578–90.
  • Liu, T.-Y. 2011. Learning to rank for information retrieval. Berlin, Heidelberg: Springer-Verlag.
  • Manning, C. D., P. Raghavan, and H. Schütze. 2008. Introduction to information retrieval, vol. 1. Cambridge, UK: Cambridge University Press.
  • Moffat, A. 2013. Seven numeric properties of effectiveness metrics. In Information retrieval technology, 1–12. Berlin, Heidelberg: Springer.
  • Mohan, A., Z. Chen, and K. Q. Weinberger. 2011. Web-search ranking with initialized gradient boosted regression trees. Journal of Machine Learning Research-Proceedings Track 14:77–89.
  • Qin, T., T.-Y. Liu, J. Xu, and H. Li. 2010. Letor: A benchmark collection for research on learning to rank for information retrieval. Information Retrieval 13 (4):346–74.
  • Quoc, C., and V. Le. 2007. Learning to rank with nonsmooth cost functions. Proceedings of the Advances in Neural Information Processing Systems 19:193–200.
  • Sanderson, M., M. L. Paramita, P. Clough, and E. Kanoulas (2010). Do user preferences and evaluation measures line up? In Proceedings of the 33rd international ACM SIGIR Conference on Research and Development in Information Retrieval, Geneva, Switzerland, pp. 555–62. ACM.
  • Sculley, D. (2007). Rank aggregation for similar items. In SIAM International Conference on Data Mining (SDM), Philadelphia PA, USA, pp. 587–592. SIAM.
  • Segal, M. R. (2004). Machine learning benchmarks and random forest regression.
  • Sexton, J., and P. Laake. 2009. Standard errors for bagged and random forest estimators. Computational Statistics & Data Analysis 53 (3):801–11.
  • Voorhees, E. M. 2000. Variations in relevance judgments and the measurement of retrieval effectiveness. Information Processing & Management 36 (5):697–716.
  • Wager, S., T. Hastie, and B. Efron. 2014. Confidence intervals for random forests: The jackknife and the infinitesimal jackknife. The Journal of Machine Learning Research 15 (1):1625–51.
  • Wu, Q., C. J. Burges, K. M. Svore, and J. Gao. 2010. Adapting boosting for information retrieval measures. Information Retrieval 13 (3):254–70.
  • Xia, F., T.-Y. Liu, and H. Li. 2009. Top-k consistency of learning to rank methods. Advances in Neural Information Processing Systems 22:2098–106.
  • Xia, F., T.-Y. Liu, J. Wang, W. Zhang, and H. Li (2008). Listwise approach to learning to rank: Theory and algorithm. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, pp. 1192–99. ACM.
  • Yilmaz, E., J. A. Aslam, and S. Robertson (2008). A new rank correlation coefficient for information retrieval. In Proceedings of the 31st annual international ACM SIGIR Conference on Research and Development in Information Retrieval, Singapore, pp. 587–94. ACM.
  • Zhang, G., and Y. Lu. 2012. Bias-corrected random forests in regression. Journal of Applied Statistics 39 (1):151–60.