A hybrid approach to biomedical named entity recognition and semantic role labeling

Research output: Contribution to conference › Conference paper › peer-reviewed

5 citations (Scopus)

Abstract

In this paper, we describe our hybrid approach to two key NLP technologies: biomedical named entity recognition (Bio-NER) and biomedical semantic role labeling (Bio-SRL). In Bio-NER, our system integrates linguistic features into the conditional random field (CRF) framework. In addition, we employ web lexicons and template-based post-processing to further boost its performance. With these broad linguistic features and the properties of the CRF model, our system outperforms state-of-the-art machine-learning-based systems, especially in the recognition of protein names (F=78.5%). In Bio-SRL, we first construct a proposition bank on top of the popular biomedical GENIA treebank, following the PropBank annotation scheme. We annotate only the predicate-argument structures (PASs) of thirty frequently used biomedical verbs (predicates) and their corresponding arguments. Second, we use our proposition bank to train a biomedical SRL system, which uses a maximum entropy (ME) machine-learning model. Third, we automatically generate argument-type templates, which can be used to improve the classification of biomedical argument roles. Our experimental results show that a newswire English SRL system that achieves an F-score of 86.29% in the newswire English domain can maintain an F-score of 64.64% when ported to the biomedical domain. By using our annotated biomedical corpus, we can increase that F-score by 22.9%. Adding automatically generated template features further increases the overall F-score by 0.47% and the adjunct (AM) F-score by 1.57%.
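The F-scores quoted in the abstract are conventionally the harmonic mean of precision and recall. The sketch below shows that calculation, and also shows the two ways the reported "22.9%" gain over 64.64% could be read: as percentage points or as a relative increase. The abstract does not say which reading is intended, so both results are illustrative assumptions and not figures from the paper. The function names and the counts passed to `f_score` are hypothetical.

```python
def f_score(true_positives, false_positives, false_negatives):
    """Balanced F-score: harmonic mean of precision and recall."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


def absolute_gain(base, gain_points):
    """Read the gain as percentage points added to the base score (assumption)."""
    return base + gain_points


def relative_gain(base, gain_percent):
    """Read the gain as a relative increase over the base score (assumption)."""
    return base * (1 + gain_percent / 100)


# Values taken from the abstract: ported newswire system and the reported gain.
PORTED_F = 64.64
GAIN = 22.9

if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formula.
    print(round(f_score(80, 20, 20), 4))
    print(round(absolute_gain(PORTED_F, GAIN), 2))
    print(round(relative_gain(PORTED_F, GAIN), 2))
```

The two readings differ by several points, so the full paper should be consulted before quoting a final Bio-SRL F-score.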

Original language: English
Pages: 243-246
Number of pages: 4
Publication status: Published - 2006
Event: 2006 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, HLT-NAACL 2006 - New York City, United States
Duration: 4 Jun 2006 to 9 Jun 2006
