Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation

John Martin, Jinkun Wang, Brendan Englot

Publication year
2018
Access
Open access

Abstract

We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based sparse methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.
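To make the phrase "reduces TD updates to Gaussian Process regression" concrete, the following is a minimal sketch of one standard way to do this: the dense (non-sparse) GPTD formulation, in which each reward is modeled as r_t = V(x_t) - gamma * V(x_{t+1}) + noise and V is given a GP prior. It is not the paper's sparse SPGP-SARSA algorithm. The RBF kernel, the white-noise model, the function names, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between row-stacked points a (n, d) and b (m, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gptd_posterior(states, rewards, query, gamma=0.9, noise_var=0.1, length_scale=1.0):
    """Posterior mean and variance of the value function at `query`.

    states : (T+1, d) visited states along one trajectory
    rewards: (T,) reward observed on each transition
    query  : (m, d) states at which to predict the value
    Assumes i.i.d. Gaussian noise on the TD residuals (an illustrative choice).
    """
    T = len(rewards)
    # H encodes the TD relation r_t = V(x_t) - gamma * V(x_{t+1}) + noise.
    H = np.zeros((T, T + 1))
    idx = np.arange(T)
    H[idx, idx] = 1.0
    H[idx, idx + 1] = -gamma

    K = rbf_kernel(states, states, length_scale)      # prior covariance of V at visited states
    G = H @ K @ H.T + noise_var * np.eye(T)           # covariance of the observed rewards
    A = H @ rbf_kernel(states, query, length_scale)   # cross-covariance between rewards and V(query)

    mean = A.T @ np.linalg.solve(G, rewards)
    cov = rbf_kernel(query, query, length_scale) - A.T @ np.linalg.solve(G, A)
    return mean, np.diag(cov)

# Toy 1-D trajectory with a constant reward of 1 per step (hypothetical data).
states = np.linspace(0.0, 1.0, 6)[:, None]
rewards = np.ones(5)
query = states[:2]
mean, var = gptd_posterior(states, rewards, query)
```

The dense solve costs O(T^3) in the number of transitions, which is the cost that sparse approximations such as the one described in the abstract aim to reduce for online use.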

Keywords

cs.LG, stat.ML
