A Pattern-Based Framework for Addressing Data Representational Inconsistency

Bingyu Yi, Wen Hua, Shazia Sadiq

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

2 Citations (Scopus)

Abstract

Data representational inconsistency, where data has diverse formats or structures, is a crucial data quality problem. Existing fixing approaches either target a specific domain or require a large amount of information from users. In this work, we propose a user-friendly pattern-based framework for addressing data representational inconsistency. Our framework consists of three modules: pattern design, pattern detection, and pattern unification. We identify several challenges in all three tasks that must be addressed to handle an inconsistent dataset both accurately and efficiently. We propose various techniques to tackle these issues, and our experimental results on real-life datasets demonstrate that our proposals outperform existing methods.

Original language: English
Title of host publication: Databases Theory and Applications - 27th Australasian Database Conference, ADC 2016, Proceedings
Editors: Muhammad Aamir Cheema, Wenjie Zhang, Lijun Chang
Publisher: Springer Verlag
Pages: 395-406
Number of pages: 12
ISBN (Print): 9783319469218
Publication status: Published - 2016
Externally published: Yes
Event: 27th Australasian Database Conference on Databases Theory and Applications, ADC 2016 - Sydney, Australia
Duration: 28 Sept 2016 → 29 Sept 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9877 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th Australasian Database Conference on Databases Theory and Applications, ADC 2016
Country/Territory: Australia
City: Sydney
Period: 28/09/16 → 29/09/16

ASJC Scopus subject areas

  • Information Systems
