Sensor fusion for semantic segmentation of urban scenes

Richard Zhang, Stefan A. Candra, K. Vetter, Avideh Zakhor

Year: 2015
Citations: 113

Abstract

Semantic understanding of environments is an important problem in robotics in general and in intelligent autonomous systems in particular. In this paper, we propose a semantic segmentation algorithm that fuses information from images and 3D point clouds. The proposed method incorporates information from multiple scales in an intuitive and effective manner. A late-fusion architecture is proposed to maximally leverage the training data in each modality. Finally, a pairwise Conditional Random Field (CRF) is used as a post-processing step to enforce spatial consistency in the structured prediction. The proposed algorithm is evaluated on the publicly available KITTI dataset [1], [2], augmented with additional pixel-wise and point-wise semantic labels for building, sky, road, vegetation, sidewalk, car, pedestrian, cyclist, sign/pole, and fence regions. A per-pixel accuracy of 89.3% and an average class accuracy of 65.4% are achieved, well above the current state of the art [3].
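
The abstract reports two different metrics, per-pixel accuracy and average class accuracy, and the gap between them (89.3% versus 65.4%) is typical when frequent classes such as road or building dominate the pixel count. The following is a minimal sketch of how the two are commonly computed from a confusion matrix. The function names and the toy labels are illustrative assumptions, not the authors' evaluation code.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_pixel_accuracy(cm):
    """Fraction of all pixels labeled correctly, regardless of class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def average_class_accuracy(cm):
    """Mean of per-class recall, skipping classes with no ground-truth pixels."""
    recalls = [cm[i][i] / sum(cm[i]) for i in range(len(cm)) if sum(cm[i]) > 0]
    return sum(recalls) / len(recalls)


# Toy example (hypothetical data): class 0 is frequent, class 1 is rare.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
pixel_acc = per_pixel_accuracy(cm)      # 9 of 10 pixels correct
class_acc = average_class_accuracy(cm)  # mean of recall 1.0 and recall 0.5
```

In this toy case one mistake on the rare class lowers per-pixel accuracy only slightly but halves that class's recall, so the class-averaged figure drops well below the per-pixel figure.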

Keywords

Leverage (statistics), Conditional random field, Computer science, Artificial intelligence, Point cloud, Segmentation, Pixel, Computer vision, Pairwise comparison, Consistency (knowledge bases)
