Abstract
A new model for controlled sensing for multihypothesis testing is proposed and studied in the sequential setting. This new model, termed a controlled Markovian observation model, exhibits a more complicated memory structure in the controlled observations than existing models. In addition, instead of penalizing just the delay until the final decision time as in standard sequential hypothesis testing problems, a much more general cost structure is considered that entails accumulating the total control cost with respect to an arbitrary control cost function. An asymptotically optimal test is proposed for this new model and is shown to satisfy an optimality condition formulated in terms of decision-making risk. It is shown that the optimal causal control policy for the controlled sensing problem is self-tuning, in the sense of maximizing an inherent “inferential” reward simultaneously under every hypothesis, with the maximal value being the best possible corresponding to the case where the true hypothesis is known at the outset. Another test is also proposed to meet distinctly predefined constraints on the various decision risks nonasymptotically, while retaining asymptotic optimality.
ACKNOWLEDGMENTS
We thank the associate editor and the referees for carefully reading the article and providing useful and critical feedback. We also thank Professors P. R. Kumar and Ya Jun Mei for their pointers to Kumar and Becker (1982) and Kiefer and Sacks (1963), respectively.
Notes
Recommended by Alexander Tartakovsky