Leveraging spatio-temporal redundancy for RFID data cleansing

Haiquan Chen, Wei Shinn Ku, Haixun Wang, Min Te Sun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

75 Scopus citations

Abstract

Radio Frequency Identification (RFID) technologies are used in many applications for data collection. However, raw RFID readings are usually of low quality and may contain many anomalies. An ideal solution for RFID data cleansing should address the following issues. First, in many applications, duplicate readings of the same object (by multiple readers simultaneously or by a single reader over a period of time) are very common. The solution should take advantage of the resulting data redundancy for data cleaning. Second, prior knowledge about the readers and the environment (e.g., prior data distribution, false negative rates of readers) may help improve data quality and remove data anomalies, and a desired solution must be able to quantify the degree of uncertainty based on such knowledge. Third, the solution should take advantage of given constraints in target applications (e.g., the number of objects in the same location cannot exceed a given value) to improve the accuracy of data cleansing. There are a number of existing RFID data cleansing techniques. However, none of them supports all the aforementioned features. In this paper we propose a Bayesian inference based approach for cleaning RFID raw data. Our approach takes full advantage of data redundancy. To capture the likelihood, we design an n-state detection model and formally prove that the 3-state model can maximize the system performance. Moreover, in order to sample from the posterior, we devise a Metropolis-Hastings sampler with Constraints (MH-C), which incorporates constraint management to clean RFID raw data with high efficiency and accuracy. We validate our solution with a common RFID application and demonstrate the advantages of our approach through extensive simulations.
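The general idea of sampling a posterior under an occupancy constraint can be illustrated with a minimal sketch. It is not the paper's MH-C algorithm or its n-state detection model: it assumes a simple two-probability detection model (`p_hit`, `p_stray`), one reader per location, static objects, and a uniform prior, and all function and parameter names are hypothetical.

```python
import math
import random
from collections import Counter


def log_likelihood(epochs, loc, p_hit=0.8, p_stray=0.05):
    """Log-probability of one object's readings if it stays at `loc`.

    epochs[t][r] is 1 if reader r reported the object in epoch t, else 0.
    Assumption: reader r covers location r; p_hit and p_stray are made-up rates.
    """
    total = 0.0
    for readings in epochs:
        for reader, seen in enumerate(readings):
            p = p_hit if reader == loc else p_stray
            total += math.log(p if seen else 1.0 - p)
    return total


def sample_locations(reads, n_locs, capacity, n_iter=20000, burn_in=5000, seed=0):
    """Estimate each object's location with a capacity-constrained MH sampler."""
    n_obj = len(reads)
    if n_obj > n_locs * capacity:
        raise ValueError("no assignment satisfies the capacity constraint")
    rng = random.Random(seed)
    loglik = [[log_likelihood(r, loc) for loc in range(n_locs)] for r in reads]
    state = [i % n_locs for i in range(n_obj)]  # round-robin start is feasible
    occupancy = Counter(state)
    tallies = [Counter() for _ in range(n_obj)]
    for step in range(n_iter):
        # Symmetric proposal: move one random object to a random location.
        i = rng.randrange(n_obj)
        old, new = state[i], rng.randrange(n_locs)
        # A move that would overfill a location has zero posterior mass: reject it.
        if new != old and occupancy[new] < capacity:
            log_ratio = loglik[i][new] - loglik[i][old]  # uniform prior cancels
            if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                state[i] = new
                occupancy[old] -= 1
                occupancy[new] += 1
        if step >= burn_in:
            for obj, loc in enumerate(state):
                tallies[obj][loc] += 1
    # Report the most frequently sampled location of each object.
    return [t.most_common(1)[0][0] for t in tallies]


# Toy data: 3 objects, 2 readers, 4 epochs. Objects 0 and 1 are read
# consistently by reader 0; object 2 has noisy, conflicting readings.
reads = [
    [[1, 0], [1, 0], [1, 0], [1, 0]],
    [[1, 0], [1, 0], [1, 0], [1, 0]],
    [[1, 0], [1, 0], [0, 1], [0, 0]],
]
estimate = sample_locations(reads, n_locs=2, capacity=2)
```

In this toy setup the repeated readings across epochs and readers supply the redundancy, while the `capacity` argument plays the role of an application constraint: with a capacity of 2 per location, the noisily read object cannot share location 0 with the two consistently read objects.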

Original language: English
Title of host publication: Proceedings of the 2010 International Conference on Management of Data, SIGMOD '10
Pages: 51-62
Number of pages: 12
State: Published - 2010
Event: 2010 International Conference on Management of Data, SIGMOD '10 - Indianapolis, IN, United States
Duration: 6 Jun 2010 to 11 Jun 2010

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Conference

Conference: 2010 International Conference on Management of Data, SIGMOD '10
Country/Territory: United States
City: Indianapolis, IN
Period: 6/06/10 to 11/06/10

Keywords

  • data cleaning
  • probabilistic algorithms
  • spatio-temporal databases
  • uncertainty
