Abstract
In this paper an algorithm for machine learning is defined and justified heuristically and empirically. “Consistency” properties of perfect evaluation functions are derived, and these are used to select the best of a family of evaluation functions, namely the one that is most consistent. Methods peculiar to a particular game, such as rote learning and look-ahead, are generally eschewed; “learning” consists of finding the evaluation function that is most consistent. In the game of Nim, for which the winning strategy is well known, we show that these principles are sufficient to derive a perfect evaluation function (under appropriate conditions) and so arrive at a winning behavioral strategy. In the Mod(6) game, for which a winning strategy is also known, we use the algorithm to deduce an evaluation function and assess its effectiveness. Finally, in the game of Hex we match the algorithm against a random player and observe its success.
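As a concrete illustration of what a perfect evaluation function looks like for Nim, the following minimal Python sketch uses the classical nim-sum rule (normal play, where the player who takes the last object wins). The abstract does not state the paper's own consistency properties, so the minimax-style condition in `is_consistent` is an assumption made for illustration only, and all function names are hypothetical.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """XOR of all heap sizes (the classical Nim invariant)."""
    return reduce(xor, heaps, 0)


def evaluate(heaps):
    """+1 if the player to move can force a win, -1 otherwise (normal play)."""
    return 1 if nim_sum(heaps) != 0 else -1


def successors(heaps):
    """Yield every position reachable by shrinking exactly one heap."""
    for i, size in enumerate(heaps):
        for new_size in range(size):
            yield heaps[:i] + (new_size,) + heaps[i + 1:]


def is_consistent(heaps):
    """Assumed consistency condition (not taken from the paper):
    a position's value equals the best negated value among its successors,
    and a position with no moves is a loss for the player to move."""
    moves = list(successors(heaps))
    expected = max(-evaluate(s) for s in moves) if moves else -1
    return evaluate(heaps) == expected


def best_move(heaps):
    """Pick the successor that is worst for the opponent, or None if no move exists."""
    return min(successors(heaps), key=evaluate, default=None)
```

Positions are passed as tuples of heap sizes, for example `best_move((3, 4, 5))`. Under these assumptions, the nim-sum evaluation satisfies the condition at every position, which is the sense in which a perfect evaluation function is fully consistent with itself across moves.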
† The work in this paper was partially supported by the National Science Foundation.