Background: Because of the increased adoption rate of electronic medical record (EMR) systems, more health care records have been increasingly accumulating in clinical data repositories. Therefore, querying the data stored in these repositories is crucial for retrieving the knowledge from such large volumes of clinical data. Objective: The aim of this study is to develop a Web-based approach for enriching the capabilities of the data-querying system along the three following considerations: (1) the interface design used for query formulation, (2) the representation of query results, and (3) the models used for formulating query criteria. Methods: The Guideline Interchange Format version 3.5 (GLIF3.5), an ontology-driven clinical guideline representation language, was used for formulating the query tasks based on the GLIF3.5 flowchart in the Protégé environment. The flowchart-based data-querying model (FBDQM) query execution engine was developed and implemented for executing queries and presenting the results through a visual and graphical interface. To examine a broad variety of patient data, the clinical data generator was implemented to automatically generate the clinical data in the repository, and the generated data, thereby, were employed to evaluate the system. The accuracy and time performance of the system for three medical query tasks relevant to liver cancer were evaluated based on the clinical data generator in the experiments with varying numbers of patients. Results: In this study, a prototype system was developed to test the feasibility of applying a methodology for building a query execution engine using FBDQMs by formulating query tasks using the existing GLIF. The FBDQM-based query execution engine was used to successfully retrieve the clinical data based on the query tasks formatted using the GLIF3.5 in the experiments with varying numbers of patients. The accuracy of the three queries (ie, "degree of liver damage," "degree of liver damage when applying a mutually exclusive setting," and "treatments for liver cancer") was 100% for all four experiments (10 patients, 100 patients, 1000 patients, and 10,000 patients). Among the three measured query phases, (1) structured query language operations, (2) criteria verification, and (3) other, the first two had the longest execution time. Conclusions: The ontology-driven FBDQM-based approach enriched the capabilities of the data-querying system. The adoption of the GLIF3.5 increased the potential for interoperability, shareability, and reusability of the query tasks.
- Clinical practice guideline
- Electronic medical records
- Information retrieval query processing
- Ontology engineering
- Query languages