Selecting Suitable Data Input for Deep-Learning Sign-Language Recognition with a Small Dataset

Yu Jen Chen, Po Chyi Su

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Deep-learning-based sign-language recognition usually requires abundant training videos. This research considers generating valid sign-language data for training recognition models. MediaPipe is used to extract the hand skeleton from each sign-language video. We then analyze several hand-skeleton adjustment policies with color-weighting strategies and generate hand masks to simulate different hands. Because hand detection can fail under the motion blur caused by rapid hand movements, we incorporate optical flow to ensure that hand-movement information is retained in each frame. We employ different spatial and temporal data-augmentation strategies to simulate varying hand sizes, filming angles, and hand speeds. The experimental results show that the proposed method improves the accuracy of sign-language recognition on an American Sign Language dataset.
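
The spatial and temporal augmentation mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes hand keypoints are already available as (x, y) pairs (for example, from MediaPipe) and that a clip is an ordered sequence of frames. The function names `resample_indices` and `scale_hand`, and their parameters, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def resample_indices(num_frames: int, speed: float) -> List[int]:
    """Return frame indices that replay a clip `speed` times faster
    (speed > 1) or slower (speed < 1). Hypothetical helper."""
    if num_frames <= 0 or speed <= 0:
        raise ValueError("num_frames and speed must be positive")
    out_len = max(1, round(num_frames / speed))
    # Nearest-lower-frame sampling along the original timeline,
    # clamped so the index never runs past the last frame.
    return [min(num_frames - 1, int(i * speed)) for i in range(out_len)]


def scale_hand(points: Sequence[Point], factor: float) -> List[Point]:
    """Scale 2D hand keypoints about their centroid to mimic a larger
    or smaller hand. Hypothetical helper."""
    if not points:
        raise ValueError("points must not be empty")
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(cx + (x - cx) * factor, cy + (y - cy) * factor) for x, y in points]
```

Under these assumptions, `resample_indices(8, 2.0)` keeps every second frame to simulate a faster signer, while a speed below 1 repeats frames to simulate a slower one. Scaling about the centroid changes the apparent hand size without moving the hand within the frame.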

Original language: English
Title of host publication: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 384-391
Number of pages: 8
ISBN (Electronic): 9798350300673
State: Published - 2023
Event: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023 - Taipei, Taiwan
Duration: 31 Oct 2023 to 3 Nov 2023

Publication series

Name: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023

Conference

Conference: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
Country/Territory: Taiwan
City: Taipei
Period: 31/10/23 to 3/11/23
