ABSTRACT
We document the speed-up gains of graphical processing unit (GPU) computing over central processing unit (CPU) computing for the estimation of a discrete choice random coefficients demand model. With a moderate-sized GPU, the computation is six to twenty times faster; the smallest speed-up factor, six, is obtained from a comparison with parallel computing over sixteen CPU cores.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1 According to Google Scholar, Berry (Citation1994) and Berry, Levinsohn, and Pakes (Citation1995) had attracted 2377 and 3558 citations, respectively, as of July 2015.
2 In addition, we still use the CPU for the outer loop. Hence, our finding can be interpreted as a lower bound on the speed-up gains from GPU computing for BLP estimation.
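The inner loop referred to here is the BLP contraction mapping, which inverts simulated market shares to recover mean utilities and is the embarrassingly parallel step that benefits from the GPU. A minimal NumPy sketch of that inner loop (illustrative only; all function and variable names are our own, and the paper's actual GPU kernels are not reproduced here):

```python
import numpy as np

def blp_contraction(s_obs, mu, tol=1e-12, max_iter=10000):
    """Invert observed market shares s_obs for mean utilities delta
    via the BLP contraction (toy CPU sketch of the inner loop).

    s_obs : (J,) observed shares of J inside products
    mu    : (J, R) simulated individual-specific utility deviations
            for R random-coefficient draws (held fixed across iterations)
    """
    # Plain-logit inversion as a starting value
    delta = np.log(s_obs) - np.log(1.0 - s_obs.sum())
    for _ in range(max_iter):
        u = delta[:, None] + mu                   # (J, R) utilities
        expu = np.exp(u)
        probs = expu / (1.0 + expu.sum(axis=0))   # choice probs, outside good = 0
        s_sim = probs.mean(axis=1)                # simulated shares
        delta_new = delta + np.log(s_obs) - np.log(s_sim)
        if np.max(np.abs(delta_new - delta)) < tol:
            return delta_new
        delta = delta_new
    raise RuntimeError("contraction did not converge")
```

The (J, R) utility and probability arrays are exactly the kind of dense, independent arithmetic that maps well onto GPU threads, which is why moving only this loop to the GPU already yields the reported gains.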
3 Notice that Z and X are matrices and the remaining objects are vectors.
4 We use a fixed underlying random number seed for every parameter value considered in the optimization routine.
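Fixing the simulation draws across parameter values (common random numbers) makes the simulated GMM objective a deterministic, smooth function of the parameters, which gradient-based outer-loop solvers require. A hedged sketch of the idea with a toy moment (names and the moment itself are illustrative, not the paper's):

```python
import numpy as np

# Draw the simulation nodes ONCE, outside the optimizer, from a fixed seed.
rng = np.random.default_rng(seed=12345)
nodes = rng.standard_normal(1000)  # reused for every parameter evaluation

def simulated_moment(theta):
    # Same `nodes` on every call, so the moment is deterministic in theta.
    return np.mean(np.exp(theta * nodes))  # toy simulated moment

def objective(theta, target=1.5):
    # Smooth in theta because the draws never change between calls.
    g = simulated_moment(theta) - target
    return g * g
```

Re-drawing `nodes` inside `objective` would instead make repeated evaluations at the same theta return different values, causing line searches and finite-difference gradients in the outer loop to fail.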
6 If we do not restrict the covariance matrix to be diagonal, the number of parameters would be twenty. In this case, we find that the computing time on the CPU is substantially longer, making our experiments in this section infeasible.
7 For the outer loop (minimization of the GMM objective function), we use the default tolerance of the Matlab built-in solver fminunc, following Dube, Fox, and Su (Citation2012). Dube, Fox, and Su (Citation2012) found that a loose inner-loop tolerance often prevents the solver from converging to the minimum. When we use a loose inner-loop tolerance, we find that not only does the algorithm fail to converge, but it also stops at points at which the objective function values are hundreds of times larger than the actual minimum values.
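The mechanism behind this note can be seen in any nested fixed-point setup: error left over by a loosely solved inner loop propagates into the outer objective. A toy illustration (not the BLP inner loop; the Babylonian square-root iteration stands in for the contraction mapping):

```python
def inner_fixed_point(a, tol):
    """Babylonian iteration for sqrt(a), stopped at tolerance `tol`
    (a stand-in for the BLP inner-loop contraction)."""
    x = a
    while abs(x * x - a) > tol:
        x = 0.5 * (x + a / x)
    return x

def outer_objective(a, inner_tol):
    """Outer criterion evaluated at the inner-loop solution; a loose
    inner_tol contaminates the outer objective value."""
    return (inner_fixed_point(a, inner_tol) - 2.0) ** 2
```

With a tight inner tolerance the outer objective at a = 4 is essentially zero, while a loose inner tolerance leaves it orders of magnitude larger, which is the pattern the note reports for the GMM objective.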