
FANet: Feature Aggregation Network for Semantic Segmentation

Tanmay Singha, Duc-Son Pham, Aneesh Krishna

Year
2020
Citations
12

Abstract

Due to rapid development in robotics and autonomous industries, optimization and accuracy have become important factors in the field of computer vision. Designing an efficient, optimized model with high accuracy for object detection and semantic segmentation is a challenging task for researchers. Some existing off-line scene segmentation methods achieve outstanding results on different datasets at the cost of a large number of parameters and operations, whereas some well-known real-time semantic segmentation techniques reduce the number of parameters and operations to meet the demands of resource-constrained applications, but compromise model accuracy. We propose a novel approach for scene segmentation, suitable for resource-constrained embedded devices, that keeps the right balance between model architecture and model performance. Exploiting multi-scale feature fusion with accurate localization augmentation, we introduce a fast feature aggregation network (FANet), a real-time scene segmentation model capable of handling high-resolution input images (1024 × 2048 px). Relying on an efficient embedded vision backbone network, our feature pyramid network outperforms many existing off-line and real-time pixel-wise deep convolutional neural networks (CNNs) and achieves 89.7% pixel accuracy and 65.9% mean intersection over union (mIoU) on the Cityscapes benchmark validation dataset with only 1.1M parameters and 5.8B FLOPs.
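The two metrics reported in the abstract, pixel accuracy and mIoU, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy labels, and the choice to skip classes that never occur are assumptions made for illustration, and a real benchmark evaluation would also need to handle ignored or void labels, which this sketch does not.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count (true class, predicted class) pairs over flattened label lists."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truly another class
        fn = sum(cm[c]) - tp                       # truly c, predicted another class
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example: six pixels, three classes (hypothetical data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(pixel_accuracy(cm))  # 4 of 6 pixels are correct
print(mean_iou(cm))        # mean of per-class IoUs 1/3, 2/3, 1/2
```

On the toy data, pixel accuracy (4/6) is higher than mIoU (0.5). mIoU weights every class equally regardless of how many pixels it covers, which is one reason the two figures quoted in the abstract can differ substantially.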

Keywords

Computer science, Artificial intelligence, Segmentation, Feature (linguistics), Backbone network, Benchmark (surveying), Pyramid (geometry), Intersection (aeronautics), Convolutional neural network, Object detection
