Where Am I? Comparing CNN and LSTM for Location Classification in Egocentric Videos
Georgios Kapidis, Ronald Poppe, Elsbeth A. van Dam, Remco C. Veltkamp, L.P.J.J. Noldus
- Year: 2018
- Citations: 6
Abstract
Egocentric vision is used in a variety of fields, such as life-logging, sports recording and robot navigation. Much research focuses on location detection and activity recognition, with applications in Ambient Assisted Living. This work builds on the idea that locations can be characterized by the presence of specific objects. Our objective is to recognize locations in egocentric videos that consist mainly of indoor house scenes. We perform an extensive comparison between Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) based classification methods that infer the in-house location by classifying the detected objects, which are extracted with a state-of-the-art object detector. We show that location classification is affected by the quality of the detected objects, i.e., by the false detections among the correct ones in a series of frames, but that this effect can be greatly limited by using an LSTM to take the temporal structure of the information into account. Finally, we discuss the potential for useful real-world applications.
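The intuition that temporal context limits the effect of false detections can be illustrated with a minimal sketch. This is not the CNN or LSTM classifier compared in the paper: it swaps in a simple sliding-window vote as a stand-in for temporal modelling. The object-to-location table, the function names and the example frames are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical object-to-location associations (illustrative, not from the paper).
OBJECT_LOCATION = {
    "stove": "kitchen",
    "fridge": "kitchen",
    "sink": "kitchen",
    "bed": "bedroom",
    "pillow": "bedroom",
    "sofa": "living_room",
    "tv": "living_room",
}


def classify_frame(objects):
    """Vote for a location using only the objects given (one frame or a pool)."""
    votes = Counter(OBJECT_LOCATION[o] for o in objects if o in OBJECT_LOCATION)
    if not votes:
        return None  # no known object detected
    return votes.most_common(1)[0][0]


def classify_with_context(frames, window=3):
    """Vote over the current frame plus up to `window - 1` preceding frames."""
    labels = []
    for i in range(len(frames)):
        start = max(0, i - window + 1)
        pooled = [obj for frame in frames[start:i + 1] for obj in frame]
        labels.append(classify_frame(pooled))
    return labels


# Made-up detections for four consecutive kitchen frames;
# the third frame contains only a false detection ("bed").
frames = [
    ["stove", "fridge"],
    ["stove", "sink"],
    ["bed"],
    ["fridge", "sink"],
]

per_frame = [classify_frame(f) for f in frames]
smoothed = classify_with_context(frames, window=3)

print(per_frame)  # the false detection flips the third frame to "bedroom"
print(smoothed)   # pooling over neighbouring frames keeps "kitchen" throughout
```

A learned sequence model such as an LSTM replaces the fixed window and the hand-written table with parameters fitted to data, but the sketch shows why a single spurious detection matters less once neighbouring frames are taken into account.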