運用響應式知識蒸餾機制增進中文多標籤文本分類效能

Szu Chi Huang, Cheng Fu Cao, Po Hsun Liao, Lung Hao Lee, Po Lei Lee, Kuo Kai Shyu

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Peer-reviewed

Abstract

It is difficult to optimize the performance of individual labels in multi-label text classification, especially on imbalanced data containing long-tailed labels. This study therefore proposes a response-based knowledge distillation mechanism comprising a teacher model, which optimizes a binary classifier for each label, and a student model, which is a standalone multi-label classifier that learns from the distilled knowledge passed on by the teacher model. A total of 2,724 Chinese healthcare texts were collected and manually annotated across nine defined labels, resulting in 8,731 labels, with each text carrying an average of 3.2 labels. We used 5-fold cross-validation to compare the performance of several multi-label models, including TextRNN, TextCNN, HAN, and GRU-att. Experimental results indicate that the proposed knowledge distillation mechanism improved performance regardless of which model was used, by about 2-3% in micro-F1, 4-6% in macro-F1, 3-4% in weighted-F1, and 1-2% in subset accuracy.
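
The response-based distillation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss function, so everything here is an assumption made for illustration: the student is trained on a weighted sum of binary cross-entropy against the gold labels and binary cross-entropy against the teacher's per-label probabilities. The names `distillation_loss` and `alpha`, and the sample values, are hypothetical and not taken from the paper.

```python
import math

def sigmoid(z):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(targets, probs, eps=1e-12):
    """Mean binary cross-entropy over labels; targets may be hard (0/1) or soft."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(targets)

def distillation_loss(student_logits, teacher_probs, hard_labels, alpha=0.5):
    """Assumed loss form (not from the paper): alpha weights the gold-label
    term, (1 - alpha) weights the term that matches the teacher's responses."""
    student_probs = [sigmoid(z) for z in student_logits]
    hard_term = binary_cross_entropy(hard_labels, student_probs)
    soft_term = binary_cross_entropy(teacher_probs, student_probs)
    return alpha * hard_term + (1.0 - alpha) * soft_term

# Hypothetical example with three labels for one text.
teacher_probs = [0.9, 0.1, 0.8]   # outputs of per-label binary teacher classifiers
hard_labels = [1, 0, 1]           # gold annotation
student_logits = [3.0, -3.0, 2.0] # raw outputs of the multi-label student
print(distillation_loss(student_logits, teacher_probs, hard_labels))
```

In this sketch, a student whose outputs agree with both the gold labels and the teacher's probabilities receives a lower loss than one that contradicts them, and setting `alpha=1.0` reduces the loss to ordinary multi-label training without distillation.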

Translated title of the contribution: Enhancing Chinese Multi-Label Text Classification Performance with Response-based Knowledge Distillation
Original language: Traditional Chinese
Title of host publication: ROCLING 2022 - Proceedings of the 34th Conference on Computational Linguistics and Speech Processing
Editors: Yung-Chun Chang, Yi-Chin Huang, Jheng-Long Wu, Ming-Hsiang Su, Hen-Hsen Huang, Yi-Fen Liu, Lung-Hao Lee, Chin-Hung Chou, Yuan-Fu Liao
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 25-31
Number of pages: 7
ISBN (Electronic): 9789869576956
Publication status: Published - 2022
Event: 34th Conference on Computational Linguistics and Speech Processing, ROCLING 2022 - Taipei, Taiwan
Duration: 21 Nov 2022 - 22 Nov 2022

Publication series

Name: ROCLING 2022 - Proceedings of the 34th Conference on Computational Linguistics and Speech Processing

Conference

Conference: 34th Conference on Computational Linguistics and Speech Processing, ROCLING 2022
Country/Territory: Taiwan
City: Taipei
Period: 21/11/22 - 22/11/22

Keywords

  • Multi-label classification
  • binary relevance
  • knowledge distillation
  • long-tailed labels
