A Q-Learning approach to developing an automated neural computer player for the board game of CLUE®

Silvio Ferrari

Year of publication: 2008
Citations: 8

Abstract

The detective board game of CLUE® can be viewed as a benchmark example of the treasure hunt problem, in which a sensor path is planned based on the expected value of the information gathered from targets along the path. The sensor is viewed as an information-gathering agent that makes imperfect measurements or observations of the targets and uses them to infer one or more hidden variables (such as target features or classification). The treasure hunt problem arises in many modern surveillance systems, such as demining and reconnaissance robotic sensors. It also arises in the board game of CLUE®, where pawns must visit the rooms of a mansion to gather information from which the hidden cards can be inferred. In this paper, Q-learning is used to develop an automated neural computer player that plans the path of its pawn, makes suggestions about the hidden cards, and infers the answer, often winning the game. A neural network is trained to approximate the decision-value function representing the value of information, for which there exists no general closed-form representation. Bayesian inference, test (suggestion) decisions, and action (motion) decisions are unified in a Markov decision process (MDP) framework. The resulting computer player is shown to outperform other computer players that implement Bayesian networks or constraint satisfaction.
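As a rough illustration of the Q-learning idea named in the abstract, the following minimal sketch learns a path through a toy mansion in which each room yields an information reward on its first visit. It is not the paper's method: the paper trains a neural network to approximate the decision-value function, whereas this sketch uses a plain lookup table, and the room graph, reward values, move cost, and all function names are hypothetical assumptions chosen for illustration.

```python
import random

# Hypothetical toy mansion: each room lists the rooms reachable in one move.
ROOMS = {
    "hall": ["study", "lounge"],
    "study": ["hall", "library"],
    "lounge": ["hall"],
    "library": ["study"],
}
# Assumed value of the information gathered on the first visit to a room.
INFO_VALUE = {"hall": 0.0, "study": 1.0, "lounge": 0.5, "library": 2.0}
MOVE_COST = 0.1  # assumed cost of every move
START = ("hall", frozenset({"hall"}))  # state = (current room, rooms visited)


def step(state, action):
    """Move to the room `action`; information is rewarded only on a first visit."""
    _, visited = state
    reward = -MOVE_COST
    if action not in visited:
        reward += INFO_VALUE[action]
    next_state = (action, visited | {action})
    done = len(next_state[1]) == len(ROOMS)  # episode ends when all rooms are seen
    return next_state, reward, done


def q_learning(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, max_steps=20, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to the estimated decision value
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            actions = ROOMS[state[0]]
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q.get((state, a), 0.0))  # exploit
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ROOMS[next_state[0]]
            )
            old = q.get((state, action), 0.0)
            # Q-learning update: move the estimate toward the bootstrapped target.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the learned values greedily from the start room."""
    state, path = START, ["hall"]
    for _ in range(max_steps):
        action = max(ROOMS[state[0]], key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
        path.append(action)
        if done:
            break
    return path
```

Calling `greedy_path(q_learning())` returns the sequence of rooms that the learned table prefers. In the setting the abstract describes, the table would be replaced by a trained neural network, and the information reward would come from Bayesian inference over the hidden cards rather than from fixed per-room values.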

Keywords

Treasure; Computer science; Artificial intelligence; Path (computing); Artificial neural network; Value (mathematics); Representation (politics); Machine learning; Theoretical computer science; Programming language

Related papers

Statistical Learning Theory

Artificial intelligence: a modern approach

Applied Nonlinear Control

A new optimizer using particle swarm theory