Depth Estimation of Video Sequences With Perceptual Losses

Anjie Wang, Zhijun Fang, Yongbin Gao, Xiaoyan Jiang, Siwei Ma

Year of publication: 2018
Citations: 24

Abstract

3-D vision plays an important role in the intelligent perception of robots, but it requires extra 3-D sensors. Depth estimation from monocular videos provides an alternative mechanism for recovering 3-D information. In this paper, we propose an unsupervised learning framework that uses a perceptual loss for depth estimation. Depth and pose networks are first trained to estimate the depth and the camera motion of the video sequence, respectively. With the estimated depth and pose of the original frame, the adjacent frame can be reconstructed. The pixel-wise differences between the reconstructed frame and the original frame are used as the per-pixel loss. Meanwhile, high-level features are extracted from the reconstructed and original views with a pre-trained network and used to define and optimize a perceptual loss that assesses the quality of the reconstructions. We combine the respective advantages of these two losses and present an approach that generates a depth map by training the feed-forward network with both the per-pixel loss and the perceptual loss. The experimental results show that our method can significantly improve the accuracy of the estimated depth map.
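
The combination of the two losses described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the per-pixel term is taken to be a mean absolute difference, the function `gradient_features` is a hypothetical stand-in for the pre-trained feature network, and the `weight` parameter is an arbitrary balancing factor.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale frame stored as rows of pixel intensities


def per_pixel_loss(recon: Image, target: Image) -> float:
    """Mean absolute difference between the reconstructed and original frame
    (assumed form; the abstract only says 'pixel-wise differences')."""
    diffs = [abs(r - t) for rr, tr in zip(recon, target) for r, t in zip(rr, tr)]
    return sum(diffs) / len(diffs)


def gradient_features(img: Image) -> Image:
    """Hypothetical stand-in for a pre-trained network: horizontal gradients.
    A real setup would use activations from a pre-trained CNN instead."""
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in img]


def perceptual_loss(
    recon: Image,
    target: Image,
    features: Callable[[Image], Image] = gradient_features,
) -> float:
    """Mean squared difference between the feature maps of the two frames."""
    f_recon, f_target = features(recon), features(target)
    sq = [(a - b) ** 2 for ra, rb in zip(f_recon, f_target) for a, b in zip(ra, rb)]
    return sum(sq) / len(sq)


def total_loss(recon: Image, target: Image, weight: float = 0.5) -> float:
    """Per-pixel loss plus a weighted perceptual loss (weight is an assumption)."""
    return per_pixel_loss(recon, target) + weight * perceptual_loss(recon, target)


# Toy frames: the reconstruction differs from the original in a single pixel.
target_frame = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
recon_frame = [[0.0, 1.0, 4.0], [0.0, 1.0, 2.0]]
loss_value = total_loss(recon_frame, target_frame)
```

In this toy case the single wrong pixel contributes to both terms: it raises the per-pixel error directly and also changes the local gradient, which the feature-space term penalizes. In training, the reconstructed frame would come from warping an adjacent frame with the estimated depth and pose rather than being given directly.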

Keywords

Artificial intelligence, Computer science, Computer vision, Frame (networking), Monocular, Depth map, Pixel, Motion estimation, Perception, Depth perception

