Dynamic hand gesture recognition using 3DCNN and LSTM with FSM context-aware model

Noorkholis Luthfil Hakim, Timothy K. Shih, Sandeli Priyanwada Kasthuri Arachchi, Wisnu Aditya, Yi Cheng Chen, Chih Yang Lin

Research output: Contribution to journal, Article, peer-review

47 Scopus citations


With the recent growth of smart TV technology, the demand for novel and useful applications motivates the study of a gesture-based system for a smart-TV-like environment. A single system is proposed that combines movie recommendation, a social media platform, a call-a-friend application, weather updates, a chat app, and a tourism platform, all controlled by natural gestures for ease of use and natural interaction. The gesture vocabulary comprises 24 gestures, 13 static and 11 dynamic, chosen to suit this environment. A dataset of RGB and depth image sequences was collected, preprocessed, and used to train the proposed deep learning architecture. A three-dimensional Convolutional Neural Network (3DCNN) followed by a Long Short-Term Memory (LSTM) model was used to extract the spatio-temporal features. After classification, a Finite State Machine (FSM) communicates with the model to constrain the class decision according to the application context. The results show that combining depth and RGB data achieves an accuracy of 97.8% on eight selected gestures, while the FSM improves the real-time recognition rate from 89% to 91%.
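The context-aware step in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It only shows one plausible way an FSM could restrict a classifier's decision to the gestures that are meaningful in the current application state. The state names, gesture names, transition table, and the functions `decide` and `step` are all hypothetical, and the scores stand in for the per-class outputs of a 3DCNN + LSTM model.

```python
from typing import Dict, Optional, Tuple

# Hypothetical application states and the gestures each state accepts.
# These names are assumptions for illustration, not taken from the paper.
ALLOWED: Dict[str, set] = {
    "menu": {"swipe_left", "swipe_right", "select"},
    "movie": {"play_pause", "volume_up", "volume_down", "back"},
}

# Hypothetical transitions: (state, gesture) -> next state.
# Any pair not listed here leaves the state unchanged.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("menu", "select"): "movie",
    ("movie", "back"): "menu",
}


def decide(state: str, scores: Dict[str, float]) -> Optional[str]:
    """Return the highest-scoring gesture that is valid in `state`.

    `scores` maps gesture names to classifier scores. Gestures that the
    current state does not accept are ignored, so a confident but
    out-of-context prediction cannot win. Returns None if nothing is valid.
    """
    valid = {g: s for g, s in scores.items() if g in ALLOWED[state]}
    if not valid:
        return None
    return max(valid, key=valid.get)


def step(state: str, scores: Dict[str, float]) -> Tuple[Optional[str], str]:
    """Decide on a gesture, then advance the FSM if a transition exists."""
    gesture = decide(state, scores)
    next_state = TRANSITIONS.get((state, gesture), state)
    return gesture, next_state


if __name__ == "__main__":
    # The raw top-1 class is "volume_up", which the "menu" state rejects.
    scores = {"volume_up": 0.5, "select": 0.3, "swipe_left": 0.2}
    print(step("menu", scores))
    print(step("movie", scores))
```

In this sketch the raw top-scoring class is overridden whenever the current state does not accept it. That is one way context could raise the recognition rate, although the abstract does not specify the mechanism the authors used.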

Original language: English
Article number: 5429
Journal: Sensors (Switzerland)
Issue number: 24
State: Published - 2 Dec 2019


  • Context-aware
  • Deep learning
  • Hand gesture recognition
  • Multimodal system

