Formal and scalable multi-robot coordination methods for long horizon tasks with time uncertainty
Carlos Azevedo, Pedro U. Lima, Bruno Lacerda, Nick Hawes
- Publication year: 2025
- Citations: 1
Abstract
Many real-world robotic applications, such as monitoring, inspection, and surveillance tasks, require effective multi-robot coordination over extended time horizons. These scenarios benefit from long-term planning and execution, and the ability to handle time uncertainty a priori significantly enhances efficiency in unpredictable environments. In this work, we introduce and compare two approaches for synthesizing coordination policies for multi-robot systems that account for time uncertainty and optimize performance over an infinite horizon. Both approaches are based on reasoning over a generalized stochastic Petri net with rewards (GSPNR) model and optimize the average reward criterion. The first approach is an exact method that provides formal guarantees on the synthesized policies and ensures convergence to the optimal policy. We evaluate this method in a solar farm inspection scenario, comparing its performance to discounted reward optimization methods and a carefully designed hand-crafted policy. The results demonstrate that, over the long term, the exact method outperforms these alternatives. However, its scalability is limited, as it cannot handle large state spaces. To address this limitation, we propose a second approach that uses an actor-critic deep reinforcement learning algorithm. This method learns policies directly within the GSPNR formalism and optimizes for the average reward criterion. We assess its performance in the same solar farm inspection scenario, and the results show that it outperforms proximal policy optimization methods. Moreover, it is capable of finding near-optimal solutions in models with state spaces five orders of magnitude larger than those manageable by the exact method.

• This work addresses multi-robot coordination challenges under uncertainty and infinite horizons, with applications in environmental monitoring, infrastructure inspection, and security surveillance.
• It introduces two complementary approaches grounded in the generalized stochastic Petri net with rewards formalism, a structured and expressive framework for representing and solving multi-robot coordination problems with time uncertainty.
• The first approach is an exact method that computes optimal policies with formal guarantees, making it well-suited for small-scale problems, though its computational complexity limits its scalability to larger state spaces.
• The second approach is a learning-based method that efficiently finds solutions to problems four orders of magnitude larger than those manageable by the exact method. It also learns approximate solutions that generalize effectively to unseen parts of the state space, enabling robust execution without retraining.
• The efficacy of these approaches is demonstrated in a solar farm inspection scenario, where they exhibit superior performance and robustness compared to state-of-the-art methods.
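
To make the "average reward criterion" mentioned above concrete, the following is a minimal sketch of relative value iteration on a tiny discrete-time Markov decision process. It is not the authors' method and does not model a GSPNR or its stochastic timing; the two-state model, the arrays `P` and `R`, and the function name `relative_value_iteration` are all hypothetical and chosen only for illustration. The sketch assumes every policy induces a single aperiodic recurrent class, which holds for this toy model.

```python
# Hypothetical toy model (not from the paper): 2 states, 2 actions.
# P[s][a] is the next-state distribution, R[s][a] the immediate reward.
P = [
    [[0.9, 0.1], [0.1, 0.9]],  # state 0: action 0 mostly stays, action 1 mostly moves
    [[0.1, 0.9], [0.9, 0.1]],  # state 1: action 0 mostly stays, action 1 mostly moves
]
R = [
    [1.0, 0.0],  # state 0
    [2.0, 0.0],  # state 1
]


def relative_value_iteration(P, R, tol=1e-10, max_iter=10_000):
    """Estimate the optimal long-run average reward (gain) and a greedy policy."""
    n_states = len(R)
    h = [0.0] * n_states  # relative (bias) values, anchored so that h[0] == 0
    gain = 0.0
    q = [[0.0] * len(R[s]) for s in range(n_states)]
    for _ in range(max_iter):
        # One Bellman backup without discounting.
        q = [
            [
                R[s][a] + sum(p * h[t] for t, p in enumerate(P[s][a]))
                for a in range(len(R[s]))
            ]
            for s in range(n_states)
        ]
        backed_up = [max(row) for row in q]
        # Subtracting the value at reference state 0 keeps h bounded;
        # at convergence that subtracted value equals the gain.
        gain = backed_up[0]
        new_h = [v - gain for v in backed_up]
        converged = max(abs(x - y) for x, y in zip(new_h, h)) < tol
        h = new_h
        if converged:
            break
    policy = [max(range(len(row)), key=row.__getitem__) for row in q]
    return gain, policy


gain, policy = relative_value_iteration(P, R)
print(f"gain={gain:.4f}, policy={policy}")
```

In this toy model the best long-run behaviour is to move to state 1 and remain there, even though staying in state 0 pays an immediate reward. The average reward criterion weighs rewards at all times equally, whereas a discounted criterion favours earlier rewards.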