Deep learning based text detection using resnet for feature extraction

Li Kun Huang, Hsiao Ting Tseng, Chen Chiung Hsieh, Chih Sin Yang

Research output: Contribution to journal, Journal article, Peer-reviewed


Popular deep learning models for text segmentation include CTPN, EAST, and PixelLink. However, they handle images with densely distributed characters poorly, particularly when those characters are connected. To address these problems, ResNet, which is highly sensitive in feature extraction, is used to replace the embedded convolutional neural networks in the main structures of CTPN and EAST. The experimental results show that a better feature extraction network can significantly improve the precision of text localization. Notably, the modified EAST with ResNet101, which has the greater depth and width, achieves the highest accuracy. Its text segmentation accuracy on ICDAR 2015 is 83.4%, which is 7% higher than that of the original PVANET-EAST. Its text detection accuracy is 83.9% on scanned documents that were not used in training, and it reaches 86.3% on self-collected Chinese calligraphy. These results demonstrate that using ResNet for text detection is an improvement for OCR applications.

Journal: Multimedia Tools and Applications
Publication status: Accepted - 2023

