Dynamic tracking attention model for action recognition

Chien-Yao Wang, Chin-Chin Chiang, Jian-Jiun Ding, Jia-Ching Wang

Research output: Chapter in book/report/conference proceeding › Conference contribution › Peer-reviewed

5 citations (Scopus)

Abstract

This paper proposes a dynamic tracking attention model (DTAM), which mainly comprises a motion attention mechanism, a convolutional neural network (CNN), and long short-term memory (LSTM), to recognize human actions in a video sequence. In the motion attention mechanism, local dynamic tracking is used to track moving objects in the feature domain, while global dynamic tracking corrects the motion in the spectral domain. The CNN performs feature extraction, while the LSTM handles the sequential information about actions that is extracted from the video. The DTAM effectively captures information between consecutive frames in a video sequence and achieves an even higher recognition rate than the CNN-LSTM. Combined with the visual attention model, the proposed algorithm achieves recognition rates 3.6% and 4.5% higher than those of the CNN-LSTM with and without the visual attention model, respectively.
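The record contains no code; the following is a minimal, illustrative PyTorch sketch of the general CNN-LSTM-with-attention pattern the abstract describes: per-frame CNN features, an attention map that re-weights those features, and an LSTM over the resulting sequence. It does not reproduce the DTAM's motion attention mechanism (local tracking in the feature domain, global correction in the spectral domain); all class names, layer sizes, and the soft-attention formulation here are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: a CNN extracts per-frame features, a learned
# attention map pools them spatially, and an LSTM models the frame sequence.
# Layer sizes and the attention formulation are assumed, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveCNNLSTM(nn.Module):
    def __init__(self, num_classes: int, feat_channels: int = 64, hidden: int = 128):
        super().__init__()
        # Per-frame CNN feature extractor (toy backbone for illustration).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # 1x1 conv scores each spatial location; softmax over locations
        # turns the scores into an attention map over the feature grid.
        self.attn_score = nn.Conv2d(feat_channels, 1, kernel_size=1)
        self.lstm = nn.LSTM(feat_channels, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, time, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1))              # (b*t, C, h, w)
        scores = self.attn_score(feats)                   # (b*t, 1, h, w)
        attn = F.softmax(scores.flatten(2), dim=-1).view_as(scores)
        pooled = (feats * attn).sum(dim=(2, 3))           # attention-weighted pooling
        seq = pooled.view(b, t, -1)                       # (b, t, C)
        out, _ = self.lstm(seq)                           # temporal modelling
        return self.classifier(out[:, -1])                # classify from last state

if __name__ == "__main__":
    model = AttentiveCNNLSTM(num_classes=51)
    logits = model(torch.randn(2, 8, 3, 64, 64))          # 2 clips of 8 frames
    print(logits.shape)                                   # torch.Size([2, 51])
```

The attention-weighted pooling step is the key difference from a plain CNN-LSTM: instead of averaging the feature grid uniformly, the model learns where to look in each frame before the temporal model sees it.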

Original language: English
Title of host publication: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1617-1621
Number of pages: 5
ISBN (electronic): 9781509041176
DOIs
Publication status: Published - 16 Jun 2017
Event: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - New Orleans, United States
Duration: 5 Mar 2017 - 9 Mar 2017

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (print): 1520-6149

Conference

Conference: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017
Country/Territory: United States
City: New Orleans
Period: 5/03/17 - 9/03/17
