TY - JOUR
T1 - Deep Learning for Human Action Recognition
T2 - A Comprehensive Review
AU - Vu, Duc Quang
AU - Thu, Trang Phung Thi
AU - Le, Ngan
AU - Wang, Jia Ching
N1 - Publisher Copyright: © 2023 Cambridge University Press. All rights reserved.
PY - 2023
Y1 - 2023
AB - Over the past several years, we have witnessed remarkable progress in numerous computer vision applications, particularly in human activity analysis. Human action recognition, which aims to automatically examine and recognize the actions taking place in the video, has been widely applied in many applications. This paper presents a comprehensive survey of approaches and techniques in deep learning-based human activity analysis. First, we introduce the problem definition in action recognition together with its challenges. Second, we provide a comprehensive survey of feature representation methods. Third, we categorize human activity methodologies and discuss their advantages and limitations. In particular, we divide human action recognition into three main categories according to training mechanisms, i.e., supervised learning, semi-supervised learning, and self-supervised learning. We further analyze the existing network architectures, their performance, and source code availability for each main category. Fourth, we provide a detailed analysis of the existing, publicly available datasets, including small-scale and large-scale datasets for human action recognition. Finally, we discuss some open issues and future research directions.
KW - Action recognition
KW - deep learning
KW - deep neural networks
KW - self-supervised learning
KW - supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85159229667&partnerID=8YFLogxK
U2 - 10.1561/116.00000068
DO - 10.1561/116.00000068
M3 - Journal article
AN - SCOPUS:85159229667
SN - 2048-7703
VL - 12
JO - APSIPA Transactions on Signal and Information Processing
JF - APSIPA Transactions on Signal and Information Processing
IS - 1
M1 - e12
ER -