TY - GEN
T1 - A Pixel-Based Character Detection Scheme for Texts with Arbitrary Orientations in Natural Scenes
AU - Chen, Li Zhu
AU - Su, Po Chyi
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - In recent years, there has been a significant focus on deep learning-based research for detecting texts in natural scenes. While many studies have achieved promising results by targeting word detection, challenges remain in detecting and recognizing texts with arbitrary orientations. Complex image backgrounds, text occlusion, and variations in text styles easily affect the detection process of words. This paper introduces a pixel-based character detection scheme for extracting individual characters within words. The objective is to locate characters in irregular text orientations or shapes, thereby achieving better alignment of detection bounding boxes with character edges. Since existing datasets only provide word-level annotations and lack character-level ground truths, we generate realistically synthesized artificial data to address this limitation. We employ weakly supervised learning, utilizing partially annotated data for training, and subsequently enhance performance by incorporating actual data. Experimental results demonstrate that our scheme outperforms other character-level detection models regarding text recognition accuracy, as evidenced by comparisons on datasets such as ICDAR2017, TotalText, and CTW-1500.
UR - http://www.scopus.com/inward/record.url?scp=85179754298&partnerID=8YFLogxK
U2 - 10.1109/GCCE59613.2023.10315569
DO - 10.1109/GCCE59613.2023.10315569
M3 - Conference contribution
AN - SCOPUS:85179754298
T3 - GCCE 2023 - 2023 IEEE 12th Global Conference on Consumer Electronics
SP - 961
EP - 962
BT - GCCE 2023 - 2023 IEEE 12th Global Conference on Consumer Electronics
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 12th IEEE Global Conference on Consumer Electronics, GCCE 2023
Y2 - 10 October 2023 through 13 October 2023
ER -