Abstract
We investigate the computation of Hessian matrices via Automatic Differentiation, using a graph model and an algebraic model. The graph model reveals the inherent symmetries involved in calculating the Hessian. The algebraic model, based on Griewank and Walther's state transformations [Evaluating derivatives, in Principles and Techniques of Algorithmic Differentiation, 2nd ed., Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2008], synthesizes the calculation of the Hessian as a formula. These dual points of view, graphical and algebraic, lead to a new framework for Hessian computation. This is illustrated by developing edge_pushing, a new truly reverse Hessian computation algorithm that fully exploits the Hessian's symmetry. Computational experiments compare the performance of edge_pushing on 16 functions from the CUTE collection [I. Bongartz et al., CUTE: constrained and unconstrained testing environment, ACM Trans. Math. Softw. 21(1) (1995), pp. 123–160] against two algorithms available as drivers of the software ADOL-C [A. Griewank et al., ADOL-C: A package for the automatic differentiation of algorithms written in C/C++, Technical report, Institute of Scientific Computing, Technical University Dresden, 1999. Updated version of the paper published in ACM Trans. Math. Softw. 22, 1996, pp. 131–167; A. Walther, Computing sparse Hessians with automatic differentiation, ACM Trans. Math. Softw. 34(1) (2008), pp. 1–15; A.H. Gebremedhin et al., Efficient computation of sparse Hessians using coloring and automatic differentiation, INFORMS J. Comput. 21(2) (2009), pp. 209–223], and the results are very promising.
Acknowledgements
R.M. Gower was partially supported by CNPq-PRONEX Optimization and FAPESP (Grant 2006/53768-0). M.P. Mello was partially supported by CNPq and FAPESP (Grant 2009/04785-7).
Notes
Reverse in the sense that the order of evaluation is opposite to the order employed in calculating a function value.
The bandwidth of a matrix $M = (m_{ij})$ is the maximum value of $|i - j|$ such that $m_{ij} \neq 0$.
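As a small illustration of this definition, the following sketch computes the bandwidth of a dense matrix stored as a list of rows. The function name `bandwidth` and the convention of returning 0 for a matrix with no nonzero entries are assumptions made for this example, not part of the original text.

```python
def bandwidth(matrix):
    """Return the maximum |i - j| over all nonzero entries m_ij.

    Assumption for this sketch: a matrix with no nonzero entries
    is given bandwidth 0.
    """
    return max(
        (
            abs(i - j)
            for i, row in enumerate(matrix)
            for j, value in enumerate(row)
            if value != 0
        ),
        default=0,
    )


# A tridiagonal matrix: nonzeros only on the main diagonal and its two neighbours.
tridiagonal = [
    [2, -1, 0, 0],
    [-1, 2, -1, 0],
    [0, -1, 2, -1],
    [0, 0, -1, 2],
]

print(bandwidth(tridiagonal))  # prints 1
```

A diagonal matrix has bandwidth 0 under this definition, and a single nonzero entry in the top-right corner of an n-by-n matrix gives bandwidth n − 1.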