Human Action Recognition Using Deep Probabilistic Graphical Models
Di Wu
- Year
- 2014
- Citations
- 4
- Access
- Open access
Abstract
Building intelligent systems that are capable of representing or extracting high-level representations from high-dimensional sensory data lies at the core of solving many A.I. related tasks. Human action recognition is an important topic in computer vision that lies in high-dimensional space. Its applications include robotics, video surveillance, human-computer interaction, user interface design, and multi-media video retrieval amongst others. \n \n \nA number of approaches have been proposed to extract representative features from high-dimensional temporal data, most commonly hard wired geometric or bio-inspired shape context features. \nThis thesis first demonstrates some \\emph{ad-hoc} hand-crafted rules for effectively encoding motion features, and later elicits a more generic approach for incorporating structured feature learning and reasoning, \\ie deep probabilistic graphical models. \n \nThe hierarchial dynamic framework first extracts high level features and then uses the learned representation for estimating emission probability to infer action sequences. \nWe show that better action recognition can be achieved by replacing gaussian mixture models by Deep Neural Networks that contain many layers of features to predict probability distributions over states of Markov Models. The framework can be easily extended to include an ergodic state to segment and recognise actions simultaneously. \n \nThe first part of the thesis focuses on analysis and applications of hand-crafted features for human action representation and classification. We show that the ``hard coded" concept of correlogram can incorporate correlations between time domain sequences and we further investigate multi-modal inputs, \\eg depth sensor input and its unique traits for action recognition. \n \nThe second part of this thesis focuses on marrying probabilistic graphical models with Deep Neural Networks (both Deep Belief Networks and Deep 3D Convolutional Neural Networks) for structured sequence prediction. The proposed Deep Dynamic Neural Network exhibits its general framework for structured 2D data representation and classification. This inspires us to further investigate for applying various graphical models for time-variant video sequences.
Keywords
Related papers
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002