Abstract
We consider the Hamiltonian cycle problem (HCP) embedded in a singularly perturbed Markov decision process (MDP). More specifically, we consider the HCP as an optimization problem over the space of long-run state-action frequencies induced by the MDP's stationary policies. We also consider two quadratic functionals over the same space. We show that when the perturbation parameter, ϵ is sufficiently small the Hamiltonian cycles of the given directed graph are precisely the maximizers of one of these quadratic functionals over the frequency space intersected with an appropriate (single) contour of the second quadratic functional. In particular, all these maximizers have a known Euclidean distance of z m (ϵ) from the origin. Geometrically, this means that Hamiltonian cycles, if any, are the points in the frequency polytope where the circle of radius z m (ϵ) intersects a certain ellipsoid.
Acknowledgments
This research was supported in part by the ARC Discovery Grant Nos. A00000767 and DP0343028. We are also indebted to an anonymous referee for many helpful suggestions.
Notes
1The name of the problem owes of the fact that Sir William Hamilton investigated the existence of such cycles on the dodecahedron graph [Citation12].
2“Knight's tour puzzle”: start at any square of the 8 × 8 chessboard visit every other square once and only once and return to the starting square.
3We are indebted to an anonymous referee for encouraging us to explicitly derive the results reported in this Appendix.