Abstract
The rapid growth of biomedical literature available on the web has made it increasingly difficult to find precise information. An accurate biomedical information retrieval (IR) system must handle the variants of biomedical terms carefully. In this paper, we focus on generating aliases, synonyms, acronyms, and lexical variants of such terms. We also propose a hyphen-handling technique for processing hyphenated terms. We use the original and expanded terms and phrases to construct an Indri query, and evaluate the effectiveness of the various methods with two measures: mean average precision (MAP) and recall. Our experimental results show that handling hyphenation improves retrieval performance significantly. Synonym expansion also improves IR performance when the focus of a query is identified. For a natural language query, deep semantic analysis and more knowledge-oriented expansion should be applied.
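To make the query-construction step concrete, the following is a minimal Python sketch of how hyphen variants and synonyms might be combined into an Indri query. It is not the authors' implementation: the function names (`hyphen_variants`, `indri_phrase`, `build_indri_query`), the example terms, and the synonym table are all hypothetical. The sketch also assumes that a hyphenated term is represented by a space-separated form and a joined form, and that the hyphenated surface form is left out of the query because hyphen tokenization depends on how the index was built. Only the standard Indri operators `#combine`, `#syn`, and `#1` (exact ordered phrase) are used.

```python
def hyphen_variants(term):
    """Return hyphen-free forms of a term (hypothetical helper).

    A hyphenated term yields a space-separated form and a joined form;
    a term without hyphens is returned unchanged.
    """
    if "-" not in term:
        return [term]
    return [term.replace("-", " "), term.replace("-", "")]


def indri_phrase(text):
    """Wrap a multi-word string in Indri's exact-phrase operator #1(...)."""
    tokens = text.split()
    if len(tokens) == 1:
        return tokens[0]
    return "#1(" + " ".join(tokens) + ")"


def build_indri_query(terms, synonyms=None):
    """Build a #combine query in which each term is expanded with #syn.

    `synonyms` is an assumed, hand-supplied mapping from a term to its
    aliases; a real system would draw these from a lexical resource.
    """
    synonyms = synonyms or {}
    parts = []
    for term in terms:
        alternatives = []
        for candidate in [term] + synonyms.get(term, []):
            for variant in hyphen_variants(candidate):
                phrase = indri_phrase(variant)
                if phrase not in alternatives:  # keep order, drop duplicates
                    alternatives.append(phrase)
        if len(alternatives) == 1:
            parts.append(alternatives[0])
        else:
            parts.append("#syn(" + " ".join(alternatives) + ")")
    return "#combine(" + " ".join(parts) + ")"


if __name__ == "__main__":
    query = build_indri_query(
        ["NF-kappaB", "apoptosis"],
        synonyms={"apoptosis": ["cell death"]},  # illustrative synonym only
    )
    print(query)
```

Grouping the variants of one term under `#syn` makes Indri treat them as a single term for scoring, so a document is not rewarded merely for repeating the same concept in several spellings.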
Original language | English |
---|---|
Journal | NIST Special Publication |
State | Published - 2005 |
Event | 14th Text REtrieval Conference, TREC 2005 - Gaithersburg, MD, United States; Duration: 15 Nov 2005 → 18 Nov 2005 |
Keywords
- Biomedical literature
- Information retrieval
- Lexical variation
- Query expansion