A Q-Learning approach to developing an automated neural computer player for the board game of CLUE®
Silvio Ferrari
- Publication year: 2008
- Citations: 8
Abstract
The detective board game of CLUE® can be viewed as a benchmark example of the treasure hunt problem, in which a sensor path is planned based on the expected value of information gathered from targets along the path. The sensor is viewed as an information-gathering agent that makes imperfect measurements or observations of the targets and uses them to infer one or more hidden variables (such as target features or classification). The treasure hunt problem arises in many modern surveillance systems, such as demining and reconnaissance robotic sensors. It also arises in the board game of CLUE®, where pawns must visit the rooms of a mansion to gather information from which the hidden cards can be inferred. In this paper, Q-learning is used to develop an automated neural computer player that plans the path of its pawn, makes suggestions about the hidden cards, and infers the answer, often winning the game. A neural network is trained to approximate the decision-value function representing the value of information, for which there exists no general closed-form representation. Bayesian inference, test (suggestion), and action (motion) decision making are unified using a Markov decision process (MDP) framework. The resulting computer player is shown to outperform other computer players implementing Bayesian networks or constraint satisfaction.
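For readers unfamiliar with Q-learning, the update rule that the abstract refers to can be illustrated with a minimal sketch. This is not the paper's player: it swaps the neural-network approximator for a plain lookup table, and it replaces the CLUE® board with a hypothetical five-room corridor in which reaching the last room yields a reward of 1 (a stand-in for information gained). The names `q_learning` and `greedy_policy`, the corridor layout, and all parameter values are assumptions made for illustration only.

```python
import random


def q_learning(n_rooms=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a toy corridor of rooms 0..n_rooms-1.

    Assumption: actions are 0 = move left, 1 = move right; moving left
    from room 0 leaves the pawn in place. Reaching the last room ends
    the episode with reward 1.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_rooms - 1
    # q[s][a] estimates the discounted return of taking action a in room s.
    q = [[0.0, 0.0] for _ in range(n_rooms)]

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Q-learning is off-policy, so a uniformly random behavior
            # policy is enough to learn the greedy values in this toy case.
            a = rng.randrange(2)
            s_next = min(max(s + moves[a], 0), goal)
            r = 1.0 if s_next == goal else 0.0
            # Terminal states contribute no future value.
            future = 0.0 if s_next == goal else gamma * max(q[s_next])
            # Core update: move q[s][a] toward the bootstrapped target.
            q[s][a] += alpha * (r + future - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for each room."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table)[:-1])  # learned action for each non-goal room
```

In a setting like the one the abstract describes, the table would be replaced by a trained function approximator and the reward by an estimate of the value of information, but the shape of the update stays the same.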