Abstract
This paper presents an extension of the random forest (RF) method to the case of clustered data. The proposed ‘mixed-effects random forest’ (MERF) is implemented using a standard RF algorithm within the framework of the expectation–maximization algorithm. Simulation results show that the proposed MERF method provides substantial improvements over standard RF when the random effects are non-negligible. The method is illustrated by predicting the first-week box office revenues of movies.
Acknowledgements
This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and by Le Fonds québécois de la recherche sur la nature et les technologies (FQRNT). The authors thank a reviewer for constructive and pertinent comments. They also thank the Carmelle and Rémi Marcoux Chair in Arts Management for providing the movie box office data used in the example, Renaud Legoux for interesting discussions, and Mohamed Jendoubi for preparing the data set.