Automatic Punctuation Restoration for corpus in Traditional Chinese Language using Deep Learning

Yu Chieh Chao, Chia Hui Chang

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

Automatic Speech Recognition (ASR) has already been applied in several chat apps, allowing people to dictate messages instead of typing them. ASR has also been used to transcribe meeting minutes from audio recordings. However, there are two main reasons why ASR systems are unsuitable for some formal situations: wrongly recognized words and missing punctuation marks, both of which degrade readability and may convey the wrong meaning. In this work, we aim to build a model that automatically restores punctuation marks in the corpus generated by ASR systems; however, because we lack labeled data for our ASR corpus, we train and test our model entirely on the corresponding transcript data. This research focuses on automatic punctuation restoration for a Traditional Chinese corpus using neural network models. Our results show that a bidirectional Gated Recurrent Unit (GRU) with an attention mechanism outperforms the other models on our punctuation restoration task when the amount of training data is limited.

Original language: English
Title of host publication: Proceedings - 25th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 91-96
Number of pages: 6
ISBN (Electronic): 9781665403801
Publication status: Published - Dec 2020
Event: 25th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2020 - Taipei, Taiwan
Duration: 3 Dec 2020 to 5 Dec 2020

Publication series

Name: Proceedings - 25th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2020

Conference

Conference: 25th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2020
Country/Territory: Taiwan
City: Taipei
Period: 3/12/20 to 5/12/20
