In recent years, deep learning technologies have grown rapidly and have been widely applied to many artificial intelligence problems, and researchers have paid increasing attention to question answering (QA) systems. For example, Stanford University constructed the Stanford Question Answering Dataset (SQuAD), which has attracted the interest of many international natural language processing (NLP) research teams, including teams at Google and Microsoft. QA systems also receive much attention in biomedical NLP. For example, in 2019 the Google research team released its own PubMed QA dataset and built an online platform for evaluating users' performance on it. However, both Google's PubMed QA dataset and the dataset of the international biomedical question answering competition BioASQ (Biomedical Semantic Indexing and Question Answering), which is sponsored by the National Institutes of Health (NIH), are drawn from collections of biomedical literature. To our knowledge, no clinical QA dataset or system is currently available. Clinical NLP has long been a critical research area. For example, the NIH and the Mayo Clinic in the United States have been committed to participating in and promoting various clinical NLP tasks, such as de-identification, risk factor detection, and family history extraction. More recently, IBM proposed GAMENet (Graph Augmented Memory Networks), a model that assists diagnosis by recommending a patient's medications to doctors. However, most of these studies address classification problems and do not include a clinical QA system. Consider a patient with symptoms of diabetes: in addition to cardiovascular disease, what other complications may the patient have? Which drugs can suppress the symptoms of diabetes while being less likely to cause heart disease complications? Compared with previous NLP tasks, building a QA system for clinical text is more challenging, but such a system can help doctors more directly during diagnosis.
In this proposal, we plan to develop a clinical QA dataset and system. Our QA corpus focuses on patients' symptoms, diseases, complications, and medication recommendations, and will be used to assist doctors in writing clinical records. The project is planned to run for three years. In the first year, we plan to use a QA generation model to construct the dataset. In the second year, we will use that dataset to develop deep learning methods for constructing a knowledge graph of diseases and drugs, which will be used to predict the relations among symptoms, diseases, complications, and medications. In the final year, we plan to build the QA system in collaboration with domestic doctors. We also expect the results of this project to serve as an international benchmark for clinical QA.
Effective start/end date: 1/08/21 → 31/07/22
- Clinical Natural Language Processing
- Medication Recommendation
- Knowledge Graph
- Question Generation
- Complication Summarization