Past, Present, and Future for Missing Value Imputation (1/3)

Project Details

Description

A dataset is incomplete when some attribute values of its data samples are missing. Missing values arise from causes such as manual data entry procedures, incorrect measurements, and equipment errors. Incomplete datasets of this kind can degrade the performance of data mining. Two approaches can be considered to solve this problem: case deletion and missing value imputation. In this three-year project, the first year's research aims to review and survey work on missing value imputation published from 2000 to 2015 in order to identify the limitations of the literature. The applicability of case deletion is also examined across different types of missing data (i.e. categorical, numerical, and mixed types) and different missing rates. The second year's research focuses on comparing statistical and supervised learning techniques for missing value imputation; in particular, six different algorithms will be compared. Finally, the third year's research aims to propose a hybrid learning-based imputation method to improve the quality of missing value imputation.
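The two approaches named above, case deletion and missing value imputation, can be contrasted with a minimal sketch. The toy data, the function names, and the choice of column-mean imputation are all assumptions made for illustration; the project description does not state which imputation algorithms were studied.

```python
from statistics import mean

# Hypothetical toy data: each row is (age, income); None marks a missing value.
rows = [
    (25, 50000.0),
    (32, None),
    (None, 61000.0),
    (41, 72000.0),
]

def case_deletion(data):
    """Drop every row that contains at least one missing value."""
    return [row for row in data if all(v is not None for v in row)]

def mean_imputation(data):
    """Fill each missing value with the mean of the observed values in its column.

    Assumes numerical columns with at least one observed value each.
    """
    columns = list(zip(*data))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [
        tuple(m if v is None else v for v, m in zip(row, col_means))
        for row in data
    ]

deleted = case_deletion(rows)    # keeps only the fully observed rows
imputed = mean_imputation(rows)  # keeps every row, with gaps filled in
print(len(deleted), len(imputed))
```

Case deletion shrinks the dataset as the missing rate grows, while imputation keeps every sample at the cost of introducing estimated values, which is the trade-off the first-year study examines.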
Status: Finished
Effective start/end date: 1/08/16 to 31/07/17

UN Sustainable Development Goals

In 2015, UN member states agreed to 17 global Sustainable Development Goals (SDGs) to end poverty, protect the planet and ensure prosperity for all. This project contributes towards the following SDG(s):

  • SDG 12 - Responsible Consumption and Production
  • SDG 17 - Partnerships for the Goals

Keywords

  • missing value imputation
  • data pre-processing
  • data mining
  • supervised learning algorithms
