Volume 10, Issue 1
  • ISSN: 2213-2759
  • E-ISSN: 1874-4796

Abstract

Background: Deep neural network based methods have achieved great progress in a variety of computer vision tasks, as described in various patents. However, modeling temporal dependencies for recognizing object movement from videos remains a challenging task. Method: In this paper, we propose a multi-timescale gated neural network for encoding temporal dependencies in videos. The model stacks multiple gated layers into a recurrent pyramid, making it possible to hierarchically model not only pairwise but also long-term dependencies across video frames. Additionally, the model incorporates convolutional neural networks into its structure, which exploits the pictorial nature of the frames and reduces the number of model parameters. Result: We evaluated the proposed model on the synthetic bouncing-MNIST dataset, the standard action recognition benchmark UCF101, and the facial expression benchmark CK+. The experimental results reveal that, on all tasks, the proposed model outperforms the existing approach to building deep stacked gated models and achieves superior performance compared with several recent state-of-the-art techniques. Conclusion: From the experimental results, we conclude that the proposed model is able to adapt its structure to different time scales and can be applied to motion estimation, action recognition, tracking, etc.
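The abstract does not give implementation details, but the general idea it describes — convolutional gated recurrent layers stacked so that higher layers integrate over longer timescales — can be sketched as follows. This is a hypothetical, minimal NumPy illustration, not the authors' architecture: the GRU-style gates, the 3x3 kernels, the single-channel maps, and the power-of-two update clocks per layer are all illustrative assumptions.

```python
import numpy as np

def conv2d(x, w):
    # "Same"-padded convolution of a single-channel map x with a 3x3 kernel w
    # (illustrative scalar-channel version; a real model would use multi-channel convs).
    H, W = x.shape
    pad = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(pad[i:i + 3, j:j + 3] * w)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConvGatedLayer:
    """One convolutional gated recurrent layer (GRU-style gates with conv transforms)."""
    def __init__(self, shape, rng):
        self.h = np.zeros(shape)  # hidden state, same spatial size as the input map
        # 3x3 kernels: update gate (z), reset gate (r), candidate state (c)
        self.wz, self.uz, self.wr, self.ur, self.wc, self.uc = (
            rng.standard_normal((3, 3)) * 0.1 for _ in range(6))

    def step(self, x):
        z = sigmoid(conv2d(x, self.wz) + conv2d(self.h, self.uz))
        r = sigmoid(conv2d(x, self.wr) + conv2d(self.h, self.ur))
        c = np.tanh(conv2d(x, self.wc) + conv2d(r * self.h, self.uc))
        self.h = (1 - z) * self.h + z * c  # gated state update
        return self.h

class MultiTimescaleStack:
    """Stack of gated layers where layer k updates only every 2**k frames,
    so higher layers integrate information over longer timescales."""
    def __init__(self, n_layers, shape, seed=0):
        rng = np.random.default_rng(seed)
        self.layers = [ConvGatedLayer(shape, rng) for _ in range(n_layers)]

    def step(self, frame, t):
        inp = frame
        for k, layer in enumerate(self.layers):
            if t % (2 ** k) == 0:      # slower clock for higher layers
                inp = layer.step(inp)
            else:
                inp = layer.h          # pass the frozen state upward
        return inp
```

Feeding a sequence of frames through `MultiTimescaleStack.step` yields a top-layer state that changes only every few frames, which is one simple way to realize the multi-timescale behavior the abstract describes.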

/content/journals/cseng/10.2174/2213275910666170502144924
2017-02-01