Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies
Tim Seyde, Igor Gilitschenski, Wilko Schwarting, Bartolomeo Stellato, Martin Riedmiller, Markus Wulfmeier, Daniela Rus
- Year: 2021
- Citations: 15
- Access: Open access
Abstract
Reinforcement learning (RL) for continuous control typically employs distributions whose support covers the entire action space. In this work, we investigate the colloquially known phenomenon that trained agents often prefer actions at the boundaries of that space. We draw theoretical connections to the emergence of bang-bang behavior in optimal control, and provide extensive empirical evaluation across a variety of recent RL algorithms. We replace the normal Gaussian by a Bernoulli distribution that solely considers the extremes along each action dimension - a bang-bang controller. Surprisingly, this achieves state-of-the-art performance on several continuous control benchmarks - in contrast to robotic hardware, where energy and maintenance cost affect controller choices. Since exploration, learning, and the final solution are entangled in RL, we provide additional imitation learning experiments to reduce the impact of exploration on our analysis. Finally, we show that our observations generalize to environments that aim to model real-world challenges and evaluate factors to mitigate the emergence of bang-bang solutions. Our findings emphasize challenges for benchmarking continuous control algorithms, particularly in light of potential real-world applications.
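To make the central idea of the abstract concrete, the following is a minimal sketch of how a Bernoulli (bang-bang) policy head could sample an action. It is an illustration under stated assumptions, not the authors' implementation: the function names `log_sigmoid` and `sample_bang_bang`, the use of one logit per action dimension, and the independence across dimensions are all hypothetical choices made here. Each dimension picks its upper bound with probability sigmoid(logit) and its lower bound otherwise, so every sampled action lies on a corner of the action box.

```python
import math
import random


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sample_bang_bang(logits, low, high, rng):
    """Sample an action whose every dimension is either its low or high bound.

    Assumption (not from the source): one independent Bernoulli per action
    dimension, parameterized by a logit for choosing the upper bound.
    Returns the action and its log-probability under the policy.
    """
    action = []
    log_prob = 0.0
    for logit, lo, hi in zip(logits, low, high):
        p_high = math.exp(log_sigmoid(logit))
        if rng.random() < p_high:
            action.append(hi)
            log_prob += log_sigmoid(logit)    # log p(high)
        else:
            action.append(lo)
            log_prob += log_sigmoid(-logit)   # log(1 - p(high))
    return action, log_prob


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 3-dimensional action space bounded by [-1, 1] per dimension.
    action, log_prob = sample_bang_bang([0.0, 2.0, -2.0], [-1.0] * 3, [1.0] * 3, rng)
    print(action, log_prob)
```

In a learning setup, the logits would typically come from a network conditioned on the observation, and the returned log-probability is the quantity a policy-gradient style update would use. A Gaussian policy, by contrast, can place probability mass on interior actions as well.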