The UAVid Dataset for Video Semantic Segmentation

Ye Lyu, George Vosselman, Gui-Song Xia, Alper Yılmaz, Michael Ying Yang

Year: 2018
Citations: 10
Access: Open access

Abstract

Video semantic segmentation has recently been one of the research focuses in computer vision. It serves as a perception foundation for many fields, such as robotics and autonomous driving. The rapid development of semantic segmentation owes much to large-scale datasets, especially for deep-learning-based methods. Several semantic segmentation datasets for complex urban scenes already exist, such as Cityscapes and CamVid, and they have become the standard datasets for comparing semantic segmentation methods. In this paper, we introduce UAVid, a new high-resolution UAV video semantic segmentation dataset, as a complement. Our UAV dataset consists of 30 video sequences capturing high-resolution images. In total, 300 images have been densely labelled with 8 classes for the urban scene understanding task. Our dataset brings out new challenges. We provide several deep learning baseline methods, among which the proposed novel Multi-Scale-Dilation net performs best via multi-scale feature extraction. We have also explored the usability of sequence data by leveraging a CRF model in both the spatial and temporal domains.
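
The abstract mentions multi-scale feature extraction with dilation but gives no architectural detail. The following is a minimal NumPy sketch of the general idea only: the same small kernel is applied at several dilation rates so that each output map sees a different spatial extent. It is not the authors' Multi-Scale-Dilation net; the function names `dilated_conv2d` and `multi_scale_features`, the dilation rates, and the toy kernel are all assumptions made for illustration.

```python
import numpy as np

def dilated_conv2d(image, kernel, dilation):
    """Same-size 2D cross-correlation with a dilated kernel and zero padding.

    Assumes an odd-sized kernel so the padding is symmetric.
    """
    kh, kw = kernel.shape
    pad_h = dilation * (kh - 1) // 2
    pad_w = dilation * (kw - 1) // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)))
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    # Each kernel tap reads the input shifted by a multiple of the dilation.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i * dilation:i * dilation + h,
                                         j * dilation:j * dilation + w]
    return out

def multi_scale_features(image, kernel, dilations=(1, 2, 4)):
    """Stack responses of one kernel at several (illustrative) dilation rates."""
    return np.stack([dilated_conv2d(image, kernel, d) for d in dilations])

# Toy input: a single bright pixel in the middle of a 9x9 image.
image = np.zeros((9, 9))
image[4, 4] = 1.0
kernel = np.ones((3, 3))
feats = multi_scale_features(image, kernel)
```

With a larger dilation the nine kernel taps spread further apart, so the response to the single bright pixel reaches the image corners at dilation 4 while staying within a 3x3 neighbourhood at dilation 1.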

Keywords

Computer science; Segmentation; Artificial intelligence; Deep learning; Focus (optics); Semantics (computer science); Scale (ratio); Pattern recognition (psychology); Computer vision; Machine learning
