Integration of gene normalization stages and co-reference resolution using a Markov logic network

Hong Jie Dai, Yen Ching Chang, Richard Tzong Han Tsai, Wen Lian Hsu

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Motivation: Gene normalization (GN) is the task of normalizing a textual gene mention to a unique gene database ID. Traditional top-performing GN systems usually need to consider several constraints when making normalization decisions, such as filtering out false positives or disambiguating an ambiguous gene mention, to improve system performance. However, these constraints are usually executed in several separate stages and cannot use each other's input/output interactively. In this article, we propose a novel approach that employs a Markov logic network (MLN) to model the constraints used in the GN task. Firstly, we show how various constraints can be formulated and combined in an MLN. Secondly, we are the first to apply the two main concepts of co-reference resolution (discourse salience in centering theory, and transitivity) to GN models. Furthermore, to make our results more relevant to developers of information extraction applications, we adopt the instance-based precision/recall/F-measure (PRF) in addition to the article-wide PRF to assess system performance.

Results: Experimental results show that our system outperforms baseline and state-of-the-art systems under two evaluation schemes. Through further analysis, we have found several unexplored challenges in the GN task.
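The difference between the two evaluation schemes named in the abstract can be illustrated with a minimal sketch. The abstract does not define either scheme precisely, so the interpretation here is an assumption: article-wide PRF is taken to score the set of distinct gene IDs assigned to each article, while instance-based PRF scores every individual mention. The function names, the mention tuple layout, and the sample IDs are all hypothetical and are not taken from the paper.

```python
from typing import Hashable, Iterable, Set, Tuple

def prf(gold: Iterable[Hashable], predicted: Iterable[Hashable]) -> Tuple[float, float, float]:
    """Precision, recall and F1 over two collections of hashable items."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def article_level(mentions) -> Set[Tuple[str, str]]:
    """Collapse mentions to (article, gene ID) pairs (assumed article-wide view)."""
    return {(doc, gene) for doc, _start, _end, gene in mentions}

# Hypothetical data: (article_id, mention_start, mention_end, gene_id)
gold_mentions = [
    ("doc1", 0, 4, "GeneID:1"),
    ("doc1", 20, 24, "GeneID:1"),
    ("doc1", 40, 45, "GeneID:2"),
]
pred_mentions = [
    ("doc1", 0, 4, "GeneID:1"),
    ("doc1", 20, 24, "GeneID:3"),  # second mention normalized to the wrong ID
    ("doc1", 40, 45, "GeneID:2"),
]

# Assumed instance-based view: every mention counts separately.
instance_scores = prf(gold_mentions, pred_mentions)
# Assumed article-wide view: only distinct IDs per article count.
article_scores = prf(article_level(gold_mentions), article_level(pred_mentions))

print("instance-based P/R/F:", instance_scores)
print("article-wide   P/R/F:", article_scores)
```

In this toy case the wrongly normalized mention lowers instance-based recall, whereas article-wide recall is unaffected because the correct ID is still recovered from another mention in the same article.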

Original language: English
Article number: btr358
Pages (from-to): 2586-2594
Number of pages: 9
Journal: Bioinformatics
Volume: 27
Issue number: 18
State: Published - Sep 2011
