Abstract
Boosting is one of the most popular and powerful learning algorithms. However, because its models are fit sequentially, the computational time of boosting algorithms can be prohibitive for big data analysis. In this paper, we propose a parallel framework for boosting algorithms, called Ensemble of Fast Learning Stochastic Gradient Boosting (EFLSGB). The proposed EFLSGB is well suited for parallel execution and can therefore substantially reduce computational time. Analyses of simulated and real datasets demonstrate that EFLSGB achieves highly competitive prediction accuracy in comparison with gradient tree boosting.
Acknowledgment
Portions of this research were conducted with high-performance computational resources provided by the Louisiana Optical Network Infrastructure (http://www.loni.org).