Monocular Simultaneous Localisation and Mapping

Ethan Eade

Year: 2008
Citations: 23

Abstract

Simultaneous localisation and mapping is the task of estimating from sensor observations both motion and structure in an unknown environment. Performing SLAM with a single video camera, while an attractive prospect, adds its own particular difficulties to the already considerable general challenges of the problem. This thesis advances the state of the art in monocular SLAM in terms of efficiency, richness of scene description, statistical correctness, and robustness. First, a SLAM algorithm from the robotics literature, designed to permit efficient operation with complex maps, is adapted to the monocular setting. A method for efficiently and correctly adding landmarks to the map is presented. The implemented SLAM system accurately maps thousands of landmarks in real time, giving an order-of-magnitude performance improvement over previous methods. Next, the system is extended to allow incorporation of edge landmarks as well as points. Edgelet landmarks and their representation are defined, and a method is described for reliably tracking edgelets, even in the presence of measurement ambiguity. An efficient selection algorithm for acquiring new edgelets from video allows the system to quickly extend the map. The working system produces geometrically accurate and meaningful edge maps at frame rate. With a focus on preserving statistical consistency during estimation, a novel monocular SLAM algorithm is presented. Estimation proceeds on a graph of local maps, partitioning and coalescing the observations taken from video. Careful parameterisation keeps local maps consistent, while optimisation of the connecting graph structure aids global convergence. The system can handle thousands of landmarks at frame rate, while delivering statistical performance superior to existing methods. Finally, this thesis mitigates the problems of tracking failure and large-scale localisation with a unified framework for loop closing and recovery. A hierarchical method is presented for finding correspondences between new video images and the existing map, using local and global appearance models and structure estimates. The framework is instantiated within the graph-based monocular SLAM system. The extended implementation continues mapping despite repeated tracking failures, successfully joining maps and closing loops in real time.

Keywords

Simultaneous localization and mapping, Artificial intelligence, Computer vision, Computer science, Robustness (evolution), Correctness, Monocular, Graph, Robotics, Robot
